We show that the playing sequence (the order in which players update their actions) is a crucial determinant of whether the best-response dynamic converges to a Nash equilibrium. Specifically, we analyze the probability that the best-response dynamic converges to a pure Nash equilibrium in random n-player, m-action games under three distinct playing sequences: clockwork sequences (players take turns according to a fixed cyclic order), random sequences, and simultaneous updating by all players. We analytically characterize the convergence properties of the clockwork sequence best-response dynamic. Our key asymptotic result is that this dynamic almost never converges to a pure Nash equilibrium when n and m are large. By contrast, the random sequence best-response dynamic almost always converges to a pure Nash equilibrium when one exists and n and m are large. The clockwork best-response dynamic deserves particular attention: we show through simulation that, compared to random or simultaneous updating, its convergence properties are closest to those exhibited by three popular learning rules that have been calibrated to human game-playing in experiments (reinforcement learning, fictitious play, and replicator dynamics).
Heinrich, T., Jang, Y., Mungo, L., Pangallo, M., Scott, A., Tarbush, B. & Wiese, S. (2021). 'Best-Response Dynamics, Playing Sequences, and Convergence to Equilibrium in Random Games'. INET Oxford Working Paper No. 2021-02.
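To make the clockwork sequence concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation. It assumes payoffs are drawn i.i.d. from a standard normal distribution (the abstract does not specify the payoff distribution), and all function names (`random_game`, `best_response`, `clockwork_best_response`, `convergence_frequency`) are hypothetical. Players update one at a time in a fixed cyclic order; the run stops either at a profile where nobody wants to deviate (a pure Nash equilibrium) or when the profile at the start of a round repeats, which signals a cycle.

```python
import numpy as np


def random_game(n, m, rng):
    """Return a list of n payoff arrays, each of shape (m,) * n.

    Assumption: payoffs are i.i.d. standard normal draws.
    """
    return [rng.standard_normal((m,) * n) for _ in range(n)]


def best_response(payoffs, i, profile):
    """Best action for player i, holding the other players' actions fixed."""
    # Slice over player i's own action; fix everyone else's action.
    idx = tuple(slice(None) if j == i else a for j, a in enumerate(profile))
    return int(np.argmax(payoffs[i][idx]))


def is_pure_nash(payoffs, profile):
    """True if every player is already playing a best response."""
    return all(
        best_response(payoffs, i, profile) == profile[i]
        for i in range(len(profile))
    )


def clockwork_best_response(payoffs, start, max_rounds=10_000):
    """Run the clockwork dynamic from `start`.

    Returns the equilibrium profile reached, or None if the dynamic cycles
    (or max_rounds is exhausted, which is only a safety guard).
    """
    n = len(payoffs)
    profile = list(start)
    seen = set()  # profiles observed at the start of a round
    for _ in range(max_rounds):
        state = tuple(profile)
        if state in seen:
            return None  # deterministic dynamic revisits a state: a cycle
        seen.add(state)
        changed = False
        for i in range(n):  # fixed cyclic order: player 0, 1, ..., n - 1
            br = best_response(payoffs, i, profile)
            if br != profile[i]:
                profile[i] = br
                changed = True
        if not changed:
            return tuple(profile)  # nobody deviated in a full round
    return None


def convergence_frequency(n, m, trials=200, seed=0):
    """Fraction of random games in which the clockwork dynamic converges."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        game = random_game(n, m, rng)
        start = tuple(int(a) for a in rng.integers(0, m, size=n))
        if clockwork_best_response(game, start) is not None:
            hits += 1
    return hits / trials
```

Calling `convergence_frequency(n, m)` for increasing `n` and `m` gives a rough empirical counterpart to the asymptotic claim above. Replacing the fixed `for i in range(n)` loop with a randomly drawn player at each step would give a sketch of the random sequence variant, although detecting non-convergence then requires a different stopping rule, since the dynamic is no longer deterministic.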