Evolutionary dynamics with game transitions

Significance

Evolving populations are constantly subjected to changing environmental conditions. The environment can determine how the expression of traits affects the individuals possessing them. Just as important, however, is the fact that the expression of traits can also alter the environment. We model this phenomenon by introducing game transitions into classical models of evolutionary dynamics. Interacting individuals receive payoffs from the games that they play, and these games can change based on past actions. We find that game transitions can significantly reduce the critical benefit-to-cost threshold for cooperation to evolve in social dilemmas. This result improves our understanding of when cooperators can thrive in nature, even when classical results predict a high critical threshold.


Supporting Information Text
The supplementary information is structured as follows.
In Section 1, we study evolutionary dynamics with local game transitions. We derive an analytical condition for one strategy to be favored over the other. A further analysis gives the mathematical formula for the critical benefit-to-cost ratio for cooperation to evolve.
In Section 2, we study how the initial condition (initial fractions of various games played in the population) affects the evolutionary outcomes. We provide an effective approach to evaluate whether or not the evolutionary dynamics is sensitive to the initial condition.
In Section 3, we study evolutionary dynamics with global game transitions. We derive an analytical condition for one strategy to be favored over the other, as well as the critical benefit-to-cost ratio for cooperation to evolve. We prove that our rules also hold when players use stochastic strategies (cooperate or defect with a probability rather than unconditionally).
In Section 4, we study four representative examples, including state-independent game transitions (the game to be played is independent of games played in the past), strategy-independent game transitions (the game to be played is independent of players' actions in the past), game transitions between two states (including the example presented in the main text), and probabilistic game transitions between three states (transitions between different games with a probability). We show that whether probabilistic game transitions strengthen or weaken the favorable effect of game transitions on cooperation may depend on the differences among the games involved.

Evolutionary dynamics with local game transitions
We consider game transitions among n states, described by games 1, 2, . . . , n. The payoff structure of game i is

        A      B
  A  ( R_i    S_i )
  B  ( T_i    P_i )   [1]

where each value corresponds to a payoff derived by a player with a strategy in the column against a player with a strategy in the row. The game transition pattern is described by three matrices, i.e.,

  P^(s) = ( p^(s)_ij )_{n×n},  s ∈ {0, 1, 2},   [2]

where p^(s)_ij represents the probability that players play game j in the next time step, conditioned on their playing game i in the current time step with s A-players among them, where i, j ∈ {1, 2, . . . , n} and s ∈ {0, 1, 2}.
On graphs or social networks, each player occupies a node. If two players (or the nodes they occupy) are connected by an edge or a social tie, they play a one-shot game in each time step. The main idea of the theoretical analysis is to encode into each edge both the game played by the two connected players and their strategy profile. Let E^(i)_XY denote an edge whose two connected players take strategies X and Y (X, Y ∈ {A, B}), respectively, in game i ∈ {1, 2, . . . , n}. For example, in an edge E^(1)_AA, both players take strategy A and they play game 1. We then introduce the following variables to describe this evolving system: p_X, the frequency of X-players; p^(i)_XY, the frequency of edges E^(i)_XY; q^(i)_X|Y, the conditional probability to find an edge E^(i)_XY given that one node of this edge is occupied by a Y-player; p_XY, the frequency of edges that connect an X-player and a Y-player; and q_X|Y, the conditional probability to find an X-player given that the adjacent node is occupied by a Y-player.
Then we have the identities

  p_A + p_B = 1;   [3a]
  p^(i)_AB = p^(i)_BA;   [3b]
  p_XY = Σ_{i=1}^n p^(i)_XY;   [3c]
  q_X|Y = Σ_{i=1}^n q^(i)_X|Y;   [3d]
  q_A|X + q_B|X = 1;   [3e]
  p_XY = q_X|Y p_Y.   [3f]

Note that players' strategies and the games they play coevolve throughout the evolutionary process. From the perspective of network dynamics, we need to consider the change in the frequency of nodes occupied by A-players and the frequency of edges E^(i)_XY. Based on the above identities, we can use p_A and q^(i)_X|Y to describe the whole system. In the following, we study a random regular graph, where each node is linked to k other nodes.
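As a numerical sanity check on these bookkeeping variables, the pair quantities can be tallied on a small network. The sketch below (not part of the paper's derivation; the sizes N, k, and the number of games, as well as the ring topology, are hypothetical) assigns random strategies and edge games on a k-regular ring and verifies the identities q_A|X + q_B|X = 1, p_AB = p_BA, and (on a regular graph) Σ_Y p_AY = p_A.

```python
import random
from collections import Counter

# Sketch (hypothetical sizes): tally pair quantities on a k-regular ring
# with random strategies and edge games, then check the identities above.
random.seed(1)
N, k, n_games = 60, 4, 3

# k-regular ring: node v is linked to the k/2 nearest nodes on each side
edges = [(v, (v + d) % N) for v in range(N) for d in range(1, k // 2 + 1)]
strategy = {v: random.choice("AB") for v in range(N)}
game = {e: random.randrange(n_games) for e in edges}   # game i on each edge

# p^(i)_XY: frequency of ordered pairs (each edge counted in both directions)
pair = Counter()
for (u, v), i in game.items():
    pair[(strategy[u], strategy[v], i)] += 1
    pair[(strategy[v], strategy[u], i)] += 1
total = 2 * len(edges)

def p_XY(X, Y):            # p_XY = sum_i p^(i)_XY  (cf. Eq. 3c)
    return sum(pair[(X, Y, i)] for i in range(n_games)) / total

def q(X, Y):               # q_{X|Y} = p_XY / p_Y   (cf. Eq. 3f)
    return p_XY(X, Y) / (p_XY("A", Y) + p_XY("B", Y))

p_A = sum(1 for v in range(N) if strategy[v] == "A") / N

assert abs(q("A", "B") + q("B", "B") - 1) < 1e-12          # Eq. 3e
assert abs(p_XY("A", "B") - p_XY("B", "A")) < 1e-12        # p_AB = p_BA
assert abs(p_XY("A", "A") + p_XY("A", "B") - p_A) < 1e-12  # regular graph
```

On a regular graph, every node contributes exactly k ordered pairs, which is why Σ_Y p_AY recovers p_A exactly.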
A. Interactions. In each time step, each player interacts separately with every neighbor, and the games played in different interactions can be distinct. Each player derives an accumulated payoff, π, from all interactions, and this payoff is translated into reproductive fitness, f = 1 − δ + δπ, where δ ≥ 0 represents the intensity of selection (1). δ scales the contribution of the games played to one's fitness (reproductive rate). The assumption δ ≪ 1, termed weak selection, describes situations in which the game plays only a small role, representing just one of many factors that influence the overall reproductive rate. This assumption also allows us to derive analytical results and has been widely used in evolutionary biology (1, 2). We focus on the effects of weak selection (3, 4).

B. Death-birth updating.
Under death-birth updating, in each time step, a random player is selected to die; all neighbors then compete to reproduce and send an offspring to the vacant site (with probability proportional to fitness) (5). We can also interpret this update rule in a social setting: a random player i decides to update his or her strategy; subsequently, he or she adopts a neighbor's strategy with a probability proportional to the neighbor's fitness. Local transitions account for the fact that only the nearest neighbors compete for the vacancy. When environmental change depends on the players' own volition, these neighbors, compared with other players not involved in the competition, have a stronger incentive to modify the environment (games) in which they evolve. Therefore, under local game transitions, only games played by the nearest neighbors of the deceased can update. We first investigate the change in the frequency of A-players.
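As an illustration of this update rule, the following sketch performs a single death-birth step with local game transitions on a k-regular ring with donation games. All parameters, the ring topology, and the deterministic transition rule `next_game` are hypothetical stand-ins; the paper's transitions are the general probabilistic matrices P^(s).

```python
import random

# Sketch (hypothetical parameters): one death-birth step with local game
# transitions. Donation games: game g has benefit b[g], cost c; A = cooperate.
random.seed(2)
N, k, delta, c = 30, 4, 0.01, 1.0
b = [3.0, 6.0]   # hypothetical benefits of games 1 and 2

def next_game(g, n_cooperators):
    # illustrative deterministic rule: mutual cooperation -> game 2 (index 1),
    # anything else -> game 1 (index 0); the paper allows arbitrary P^(s)
    return 1 if n_cooperators == 2 else 0

nbrs = {v: [(v + d) % N for d in (-2, -1, 1, 2)] for v in range(N)}
strat = {v: random.choice("AB") for v in range(N)}
game = {frozenset((v, u)): 0 for v in range(N) for u in nbrs[v]}

def payoff(v):
    # accumulated donation-game payoff: an A-player pays c on each link and
    # receives b[g] from each cooperating neighbor
    pi = 0.0
    for u in nbrs[v]:
        g = game[frozenset((u, v))]
        pi += b[g] if strat[u] == "A" else 0.0
        pi -= c if strat[v] == "A" else 0.0
    return pi

# death-birth step: a random player dies, its neighbors compete for the site
# with probability proportional to fitness f = 1 - delta + delta * payoff
dead = random.randrange(N)
fitness = [1 - delta + delta * payoff(u) for u in nbrs[dead]]
winner = random.choices(nbrs[dead], weights=fitness)[0]
strat[dead] = strat[winner]

# local game transitions: only edges incident to the dead player and to its
# nearest neighbors may update
for v in [dead] + nbrs[dead]:
    for u in nbrs[v]:
        e = frozenset((u, v))
        game[e] = next_game(game[e], (strat[u] == "A") + (strat[v] == "A"))
```

Iterating this step and recording the frequency of A-players would yield a Monte Carlo estimate of the quantities analyzed below.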
B.1. Change in p_A: updating a B-player. A B-player is chosen to die with probability p_B. Let k^(i)_A|B denote the number of neighbors who adopt strategy A and play game i with the focal (dead) player. Analogously, k^(i)_B|B denotes the number of neighbors who adopt strategy B and play game i with the focal player. Therefore, Σ_{i=1}^n (k^(i)_A|B + k^(i)_B|B) = k. From these counts we obtain, in turn, the probability of such a neighborhood configuration, the fitness of a neighbor who adopts strategy A and plays game i with the focal player, the fitness of a neighbor who adopts strategy B and plays game i with the focal player, and the probabilities that one of the neighboring A-players or one of the neighboring B-players fills the vacancy. Therefore, p_A increases by 1/N with probability [9]

B.2. Change in p_A: updating an A-player. An A-player is chosen to die with probability p_A. Let k^(i)_A|A denote the number of neighbors who adopt strategy A and play game i with the focal player. Analogously, k^(i)_B|A denotes the number of neighbors who adopt strategy B and play game i with the focal player. The probability for such a neighborhood configuration is [10] The fitness of a neighbor who adopts strategy A and plays game i with the focal player, and that of a neighbor who adopts strategy B, follow analogously. The probability that one of the neighboring A-players fills the vacancy under such a neighborhood configuration is given by [13] The probability that one of the neighboring B-players fills the vacancy under such a neighborhood configuration is given by [14] Therefore, p_A decreases by 1/N with probability [15]

B.3. Change in p_A. Let us now suppose that one strategy-replacement event takes place in one unit of time. The time derivative of p_A is then given by Eq. 16. We proceed with the change in the frequency of each type of edge.
B.4. Change in p^(i)_AA. Note that when a random player l is chosen to die, the edges between (i) l and its nearest neighbors and (ii) l's nearest neighbors and next-nearest neighbors have a chance to update (see the description of local game transitions). We stress that the change in p^(i)_AA differs from that in p_A. p_A does not change when a neighboring A-player replaces a focal (dead) A-player or a neighboring B-player replaces a focal (dead) B-player. In the same cases, however, p^(i)_AA may change, since the games in these edges can switch, which changes the edge type.
We first consider the case in which a random B-player is chosen to die. We take the same neighborhood configuration as in Section B.1, i.e., k^(j)_A|B and k^(j)_B|B for j ∈ {1, . . . , n}. The change in p^(i)_AA results from two parts: the switching of edges connecting the focal B-player and its nearest neighbors, and the switching of edges connecting the nearest neighbors and the next-nearest neighbors. Under the given neighborhood configuration, the change in p^(i)_AA based on the former part is [18] Eq. 18 describes the switching of edges into E^(i)_AA, which occurs when (i) a neighboring A-player reproduces and its offspring replaces the dead B-player, i.e., BA → AA; and (ii) a neighboring A-player who plays game j with the dead player in the current time step plays game i in the next time step, i.e., (j) → (i).
The change in p^(i)_AA due to edges between the nearest and the next-nearest neighbors is [19] Eq. 19 indicates that, regardless of which neighbor replaces the focal B-player, the change in p^(i)_AA due to the edges between the nearest and next-nearest neighbors remains the same.
Next, we consider the case in which a random A-player is chosen to die. We take the same neighborhood configuration as in Section B.2, i.e., k^(j)_A|A and k^(j)_B|A. The change in p^(i)_AA due to edges between the focal A-player and its nearest neighbors is [20] and [21] Eq. 20 (resp. Eq. 21) captures the case in which a neighboring A-player (resp. B-player) successfully occupies the vacant site. The change in p^(i)_AA due to edges between the nearest and next-nearest neighbors is [22] Summing these contributions, the time derivative of p^(i)_AA is given by [23] where δ_ij = 1 if i = j and δ_ij = 0 otherwise.

B.5. Change in p^(i)_AB. When a B-player is selected to die and its neighborhood configuration is the same as that in Section B.1, the change in p^(i)_AB due to edges between the focal B-player and its nearest neighbors is [24] and [25] Eq. 24 (resp. Eq. 25) captures the case in which a neighboring A-player (resp. B-player) successfully occupies the vacant site.
The change in p^(i)_AB due to edges between the nearest and next-nearest neighbors is [26] When an A-player is selected to die and its neighborhood configuration is the same as that in Section B.2, the change in p^(i)_AB due to edges between the focal A-player and its nearest neighbors is [27] and [28] Eq. 27 (resp. Eq. 28) captures the case in which a neighboring A-player (resp. B-player) successfully occupies the vacant site. The change in p^(i)_AB due to edges between the nearest and next-nearest neighbors is [29] Analogously, we obtain the time derivative of p^(i)_AB, given by [30]

B.6. Change in p^(i)_BB. When a B-player is selected to die and its neighborhood configuration is the same as that in Section B.1, the change in p^(i)_BB due to edges between the focal B-player and its nearest neighbors is [31] and [32] Eq. 31 (resp. Eq. 32) captures the case in which a neighboring A-player (resp. B-player) successfully occupies the vacant site. The change in p^(i)_BB due to edges between the nearest and next-nearest neighbors is [33] When an A-player is selected to die and its neighborhood configuration is the same as that in Section B.2, the change in p^(i)_BB due to edges between the focal A-player and its nearest neighbors is [34] The change in p^(i)_BB due to edges between the nearest and next-nearest neighbors is [35] The time derivative of p^(i)_BB is then given by [36]

B.7. Different time scales. From Eq. 23, we obtain the time derivative of q_A|A (Eqs. 37 and 38). When the intensity of selection is weak (δ ≪ 1), q_A|A reaches its equilibrium much faster than p_A (see Eqs. 16 and 38). Thus, the dynamical system converges quickly onto the slow manifold on which the time derivative of q_A|A vanishes, so we have [39] From Eqs. 3a-3f and 39, we find that for all X, Y ∈ {A, B}, p_XY and q_X|Y are functions of p_A.
We define a function A(R^(s)) mapping a set of (n − 1) × (n − 1) matrices, given in Eq. 40, where α = (k − 2)(1 − p_A) and β = (k − 2)p_A. Then we use P^(s) in Eq. 2 to define two (n − 1) × (n − 1) matrices (Eq. 41), where s ∈ {0, 1, 2}. Let b denote a column vector with 3(n − 1) entries: the first n − 1 entries correspond to p^(i)_AA, the next n − 1 entries to p^(i)_AB, and the last n − 1 entries to p^(i)_BB. Let v denote the column vector (p^(1)_AA, . . . , p^(n−1)_AA, p^(1)_AB, . . . , p^(n−1)_AB, p^(1)_BB, . . . , p^(n−1)_BB)^T. Combining Eq. 39 with p^(n)_XY = p_XY − Σ_{i=1}^{n−1} p^(i)_XY, we can reduce the system of Eqs. 23, 30, and 36 to a linear system of the form dv/dt = Ā v + b. [42] For the linear system described by Eq. 42, the equilibrium points can be obtained by setting dv/dt = 0. If, for 0 < p_A < 1, all eigenvalues of Ā are negative numbers or complex numbers with negative real parts, the system is asymptotically stable and has a single equilibrium point, given by v* = −Ā^{−1} b. [43] Regardless of the initial state of v, the system ultimately approaches this equilibrium point. In other words, the initial fractions of the various games do not affect the evolutionary outcome. We note that none of Ā's eigenvalues can be positive, since a positive eigenvalue would cause some entries of v to grow above 1 or fall below 0 (6), which is impossible in the current system. However, Ā may have zero eigenvalues. In such cases, the system described by Eq. 42 has more than one equilibrium point, and the initial state of v determines which equilibrium point the system approaches. That is, the initial fractions of the various games influence the evolutionary outcome. In Section 2, we provide an approach to efficiently evaluate the dependence of the evolutionary outcome on the initial fractions of the various games.
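The stability claim can be illustrated numerically: for a linear system dv/dt = Ā v + b whose eigenvalues all have negative real parts, any initial state converges to v* = −Ā^{−1} b. The 2 × 2 matrix below is a hypothetical stand-in for Ā (the paper's Ā is 3(n − 1)-dimensional), integrated by forward Euler.

```python
# Sketch: convergence of dv/dt = A v + b to v* = -A^{-1} b for a stable,
# hypothetical 2x2 matrix A (trace < 0, det > 0 => negative eigenvalues).
A = [[-2.0, 1.0], [0.5, -3.0]]
b = [1.0, 2.0]

# v* = -A^{-1} b via the explicit 2x2 inverse
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
v_star = [-( A[1][1] * b[0] - A[0][1] * b[1]) / det,
          -(-A[1][0] * b[0] + A[0][0] * b[1]) / det]

# forward-Euler integration from the origin
v, dt = [0.0, 0.0], 0.01
for _ in range(5000):
    v = [v[i] + dt * (sum(A[i][j] * v[j] for j in range(2)) + b[i])
         for i in range(2)]

assert all(abs(v[i] - v_star[i]) < 1e-6 for i in range(2))
```

A zero eigenvalue would instead leave a line of equilibria, so the limit would depend on the initial state, mirroring the initial-condition sensitivity discussed above.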

B.8. Diffusion approximation. For given game transition matrices and initial fractions of the various games, by solving Eq. 42 we obtain p^(i)_AA, p^(i)_AB, and p^(i)_BB. Substituting these into Eqs. 17a-17d and combining with Eq. 39, we have [44a] for 1 ≤ i ≤ n − 1. We obtain I_Rn, I_Sn, I_Tn, and I_Pn by separately replacing the corresponding terms in Eqs. 44a-44d. We consider a one-dimensional diffusion process of the random variable p_A. Within a short time interval ∆t, we have [45b] The fixation probability φ_A(x) of A-players with initial frequency p_A(t = 0) = x satisfies the following differential equation [see Eq. (5.2.186) in Ref. (7) and the detailed derivation]: The solution to Eq. 46 is [see Eq. (5.2.189) in Ref. (7)] [48] In Eq. 48, the third equality holds when δ is sufficiently small.
B.9. Fixation probability. In a population of B-players, when a fraction x of B-players mutates to A-players, the fixation probability of these A-players is [49] Analogously, the fixation probability of a fraction x of B-players in a population of A-players is [50] The ratio of the fixation probabilities is then [51] For sufficiently small x, we have [52] Overall, for a sufficiently large population and x = 1/N, the condition for A-players to be favored over B-players is [53] Eq. 53 holds not only for death-birth updating, but also for imitation (see Section 1C for details) and pairwise-comparison (see Section 1D for details) updating. Note that I_Ri, I_Si, I_Ti, and I_Pi differ across updating rules.
We now turn to donation games. The payoff structure for game i is given by R_i = b_i − c, S_i = −c, T_i = b_i, and P_i = 0. [54] Substituting these payoff structures into Eq. 53 and using Eqs. 17a-17d, we obtain the condition for ρ_A > ρ_B under death-birth updating, given by [55] where [57] Using [58] where [59] and inserting Eqs. 44a and 44c into Eq. 59, we get the formula of ξ_i for death-birth updating. Letting k' = Σ_{i=2}^n ξ_i ∆b_1i / c and b = b_1, we obtain the rule b/c > k − k'.
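Numerically, the rule reads as follows; the values of ξ_2 and ∆b_12 below are hypothetical placeholders (not derived from Eq. 59), chosen only to show how a positive k' lowers the classical death-birth threshold b/c > k.

```python
# Hypothetical illustration of the rule b/c > k - k' for n = 2 games:
# k' = sum_{i=2}^{n} xi_i * Delta_b_{1i} / c, so k' = 0 recovers b/c > k.
k = 4                        # degree of the random regular graph
c = 1.0                      # donation-game cost
xi_2 = 0.3                   # hypothetical xi_2 (would come from Eq. 59)
delta_b_12 = 2.0             # hypothetical benefit gap between games 1 and 2
k_prime = xi_2 * delta_b_12 / c
threshold = k - k_prime      # critical benefit-to-cost ratio with transitions
assert threshold < k         # transitions lower the classical threshold
```

With these placeholder numbers the threshold drops from 4 to 3.4, illustrating the Significance statement that game transitions can reduce the critical benefit-to-cost ratio.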
C. Imitation updating. In each time step, a random player i is selected to evaluate its strategy. This player either retains its own strategy or imitates a neighbor's strategy, with probability proportional to fitness. Analyzing the evolutionary process as we do under death-birth updating, we have [60] where [61] We redefine the function A(R^(s)) to be [62] Then the system under imitation updating can be reduced to [63] All other variables, such as α, β, the two matrices derived from P^(s), b, and v, follow those defined for death-birth updating.
For donation games described by Eq. 54, we have the condition for ρ_A > ρ_B (Eq. 64), where [65] Solving Eq. 63 and inserting I_Ri and I_Ti into Eq. 65, we obtain the expression for ξ_i.

D. Pairwise-comparison updating.
In each generation, a random player i is selected to evaluate its strategy. This player randomly selects a neighbor j and compares payoffs. Player i then adopts j's strategy with probability [66] where π_i and π_j denote the payoffs of i and j, respectively; otherwise, player i retains its strategy. Analogously, we have [67] where [68] We redefine the function A(R^(s)) to be [69] Then the system under pairwise-comparison updating can be reduced to [70] All other variables, such as α, β, the two matrices derived from P^(s), b, and v, follow those defined for death-birth updating.
For donation games described by Eq. 54, we have the condition for ρ_A > ρ_B (Eq. 71), where [72] By solving Eq. 70 and inserting I_Ri and I_Ti into Eq. 72, we obtain the expression for ξ_i.

Approach to evaluate the sensitivity of evolutionary dynamics to the initial condition
Here, we consider the initial condition, which refers to the initial fractions of the various games played in the population. By calculating the eigenvalues of the matrix [A(P^(s)) − 2k^2 I]/(kN) in Eq. 42 and evaluating their signs, we can tell whether, under a given game transition pattern, the evolutionary outcome is sensitive to the initial condition under death-birth updating with local game transitions. Analogously, we can study the matrix [A(P^(s)) − 2k(k + 1)I]/(kN) in Eq. 63 under imitation updating and the matrix [A(P^(s)) − (8k − 4)I]/(kN) in Eq. 70 under pairwise-comparison updating.
In this section, we provide an alternative approach to determine the dependence of the evolutionary outcome on the initial condition. Based on the game transition matrices P^(2), P^(1), and P^(0) in Eq. 2, we define a Markov chain with state space E = {1, 2, . . . , 3n}. The probability transition matrix M for this Markov chain is given in Eq. 73; the entry in the ith row and jth column of M is the transition probability from state i to state j. If the defined random process has only one closed communicating class, the evolutionary outcome is independent of the initial condition, regardless of the update rule. However, if it has more than one such class, the evolutionary outcome is sensitive to the initial condition. For a random process defined by a state space E and a probability transition matrix M, we can examine its communicating-class structure as follows. Let M~ = (I + M)^{3n}. The sign of each entry of M~, say M~_ij (the entry in the ith row and jth column of M~), indicates the transition possibility (not probability) from state i to state j within 3n steps (that is, in at most 3n steps). M~_ij > 0 means that the system can transition from state i to state j in at most 3n steps. M~_ij = 0 means that the transition cannot happen within 3n steps; since the chain has only 3n states, this implies that the system, once in state i, can never transition to state j. If there exists some j (1 ≤ j ≤ 3n) such that all entries in the jth column of M~ are positive, every state can transition to state j, and thus state j lies in a closed communicating class. In such a situation, if there were another closed communicating class, any state in that second class would be unable to transition to state j, which leads to a contradiction. Therefore, a column of positive entries implies a single closed communicating class. Conversely, if there is only one closed communicating class, any state can transition to some state of that class, so there must exist a column of positive entries.
Analogously, we can prove that the absence of such a column implies the existence of more than one closed communicating class. We provide an example with two states (n = 2) for a better understanding of this approach. The game transition matrices are given in Eq. 74, and the resulting M~ is given in Eq. 76. There exist zero entries in every column of M~. Thus, there is more than one closed communicating class, and the initial fractions of the various games affect the evolutionary outcome. As a consistency check, we calculate the eigenvalues of [A(P^(s)) − 2k^2 I]/(kN) in Eq. 42, which are λ_1 = 0, λ_2 = −(k − 2)/(Nk), and λ_3 = −(2k − 2)/(Nk). The eigenvalue λ_1 = 0 confirms the sensitivity of the evolutionary outcome to the initial condition. In fact, Eq. 74 intuitively shows that each game remains fixed throughout the evolutionary process. According to prior studies of edge-dependent games, evolution proceeds "as if" all interactions were governed by an "effective" game (8, 9). This "effective" game corresponds to the average of the games played across all interactions, so its payoff structure depends on the fractions of the various games. Thus, the evolutionary outcome is sensitive to the initial condition, in line with the analysis based on the above approach. In Section 4C, we illustrate how to calculate ξ_i for such a system.
Next, we present an example with game transition matrices corresponding to the pattern used in Fig. 1 of the main text. The resulting M~ is given in Eq. 79. Except for the entries in the fifth column, all entries of M~ are positive. There is therefore only one closed communicating class, and the evolutionary outcome is insensitive to the initial condition. As a consistency check, we calculate the eigenvalues of [A(P^(s)) − 2k^2 I]/(kN) in Eq. 42, which are λ_1 = −2k/N, λ_2 = −2k/N, and λ_3 = −2k/N. The system has a unique equilibrium point, and the evolutionary outcome is independent of the initial condition. In Section 4C, we illustrate how to calculate ξ_i for this system.
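The column test is easy to mechanize. In the sketch below, the block structure of M is one plausible reading of Eq. 73 (rows and columns ordered as AA with games 1..n, then AB, then BB; only the zero/nonzero pattern matters for communicating classes), and the two example transition patterns are hypothetical stand-ins for the two examples above.

```python
def possibility_matrix(P2, P1, P0):
    # zero/nonzero pattern of M: strategy pairs follow AA->{AA,AB},
    # AB->{AA,AB,BB}, BB->{AB,BB}; games follow P^(2), P^(1), P^(0)
    n = len(P2)
    M = [[0] * (3 * n) for _ in range(3 * n)]
    blocks = {(0, 0): P2, (0, 1): P2,
              (1, 0): P1, (1, 1): P1, (1, 2): P1,
              (2, 1): P0, (2, 2): P0}
    for (r, s), P in blocks.items():
        for i in range(n):
            for j in range(n):
                M[r * n + i][s * n + j] = 1 if P[i][j] > 0 else 0
    return M

def single_closed_class(P2, P1, P0):
    M = possibility_matrix(P2, P1, P0)
    m = len(M)
    # boolean powers of (I + M): Mt[i][j] > 0 iff j is reachable from i
    Mt = [[1 if i == j else M[i][j] for j in range(m)] for i in range(m)]
    for _ in range(m):   # repeated boolean squaring reaches the closure
        Mt = [[1 if any(Mt[i][l] and Mt[l][j] for l in range(m)) else 0
               for j in range(m)] for i in range(m)]
    # a column of all-positive entries <=> a single closed communicating class
    return any(all(Mt[i][j] for i in range(m)) for j in range(m))

fixed = [[1, 0], [0, 1]]           # games never change: sensitive case
mixing = [[0.5, 0.5], [0.5, 0.5]]  # hypothetical well-mixed transitions
assert not single_closed_class(fixed, fixed, fixed)
assert single_closed_class(mixing, mixing, mixing)
```

When every game transition probability is positive, any edge can reach any state through AB within two steps, so the single-class test passes, matching the insensitive example.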
We now briefly explain why the closed communicating classes of this random process predict the sensitivity of the evolutionary outcome to the initial condition. The key question is whether the initially assigned game between two connected players constrains the games they play over the long-term evolutionary process. For example, the transition pattern illustrated in Eq. 74 dictates that if two individuals initially play game 1, they will play game 1 throughout the process, regardless of their strategic actions. The initial game thus determines the games played later, and therefore the initial condition affects the evolutionary outcome.
We rename the states in E as E^(1)_AA, . . . , E^(n)_AA, E^(1)_AB, . . . , E^(n)_AB, E^(1)_BB, . . . , E^(n)_BB, where the ith entry corresponds to the original state i. In the following, we show that M actually captures the state transitions of an edge throughout the process. As defined in Section 1, the state of an edge is given by E^(i)_XY, where X, Y ∈ {A, B} and i ∈ {1, 2, . . . , n}; X and Y are the strategies of the two connected players, and i is the game they play. The transition of an edge state can arise from two parts: the change in the players' strategies and the change in the game they play. Let AA denote both players taking strategy A, AB denote one player taking strategy A and the other taking strategy B, and BB denote both players taking strategy B. Note that in the current model, in each generation, only one player has the opportunity to update its strategy. Thus, the strategy transitions satisfy: (i) AA can remain AA or transition to AB but cannot transition to BB; (ii) AB can remain AB or transition to AA or BB; (iii) BB can remain BB or transition to AB but cannot transition to AA. That is, for any i and l, E^(i)_AA and E^(l)_BB cannot transition to each other in a single step, corresponding to the two null matrices in M (see the 0 blocks in Eq. 73). The transitions of the game are governed by P^(2), P^(1), and P^(0). For example, P^(2) determines whether or not an edge E^(i)_AA can transition to E^(j)_AA (i, j ∈ {1, 2, . . . , n}), corresponding to the terms involving P^(2) in M. Note that the realistic evolutionary process is much more complicated, and it is impossible to obtain the exact transition probability of an edge from one state to another; however, the matrix M does describe the possibility that an edge transitions from one state to another.

When the random process has only one closed communicating class, let c denote the set of all states lying in this class. Eqs. 16, 23, 30, and 36 show that for sufficiently small δ, the fractions of the various edges change much faster than the fractions of the two strategies. That is, although transitions between A-players and B-players are frequent (an A-player becoming a B-player, or vice versa), the fraction of A-players varies at a relatively low rate. In the evolutionary process, an edge transitions among the various states as the strategies of the connected individuals and the game they play change. The state transition possibilities of an edge are described by M. Eventually, the state of this edge enters the closed communicating class c and can never escape from c, regardless of its initial state. Thus, if the random process defined above has only one closed communicating class, the evolutionary outcome is independent of the initial condition.
When the random process has m (> 1) closed communicating classes, we denote them by c_1, c_2, . . . , c_m. We apply the two propositions below: (i) once an edge transitions into a state lying in a closed communicating class c_j, the games played by the two connected players are thereafter limited to those appearing in c_j; (ii) in every closed communicating class c_j, there exists some i satisfying E^(i)_AB ∈ c_j. Moreover, the sets of games appearing in different closed communicating classes are disjoint; otherwise, an edge state would lie in two different closed communicating classes, which leads to a contradiction.
Based on the propositions above, every closed communicating class includes at least one state of the form E^(i)_AB, say E^(i)_AB ∈ c_j. In addition, due to the strategy transitions, c_j also contains states of the forms E^(·)_AA and E^(·)_BB. Proposition (i) stresses that when an edge transitions into a state lying in a closed communicating class, say c_j1, the games to be played by the two connected players are limited by c_j1. If the edge instead transitions into a state in another closed communicating class, say c_j2, the two connected players can only play games limited by c_j2. In particular, the games limited by c_j1 and those limited by c_j2 are completely different. The initial condition affects which closed communicating class an edge transitions into. A representative example: for E^(i1)_BB ∈ c_j1 and E^(i2)_BB ∈ c_j2, when all players initially play game i1, the games to be played are limited by c_j1 throughout the process; if instead all players initially play game i2, c_j2 constrains the games to be played throughout the process.
We examine the above approach with 10^8 numerical examples. In every example, we generate three 4 × 4 random matrices with nonnegative entries and normalize them so that each row sums to 1. The three matrices are assigned to P^(2), P^(1), and P^(0). On the one hand, based on P^(2), P^(1), and P^(0), we calculate the eigenvalues of [A(P^(s)) − 2k^2 I]/(kN) in Eq. 42 and record whether or not there are zero eigenvalues. On the other hand, we calculate M~ and record whether or not there is some i such that all entries in the ith column of M~ are positive. In all examples, whenever every column of M~ contains a zero entry, there are zero eigenvalues; whenever there exists some i such that all entries in the ith column of M~ are positive, there is no zero eigenvalue. Thus, our approach predicts well the sensitivity of the evolutionary outcome to the initial condition. Furthermore, for P^(2), P^(1), and P^(0) under which the evolutionary outcome is sensitive to the initial condition, a slight perturbation of the game transition pattern (of P^(2), P^(1), and P^(0)) can turn the system into one insensitive to the initial condition. A simple way to achieve this is to add to each entry of P^(2), P^(1), and P^(0) an arbitrarily small number δ_1, δ_2, and δ_3, respectively, where δ_1, δ_2, and δ_3 are not necessarily identical.

Evolutionary dynamics with global game transitions
A. Global game transitions. In this section, we study evolutionary dynamics with global game transitions. In each time step, the games in all interactions have a chance to update. We proceed with the mathematical analysis as in Section 1, using the same variables and notation (Eqs. 2-3f). The change in p_A follows Eqs. 16-17d. We then investigate the change in the frequency of each type of edge. Assuming that a random B-player is selected to die, the change in p^(i)_AA is [81] Suppose instead that a random A-player is selected to die. Under the neighborhood configuration given in Section B.2, Eqs. 20 and 21 capture the change in p^(i)_AA due to edges between the focal A-player and its nearest neighbors. The change in p^(i)_AA due to the switching of all other edges is [82] From Eqs. 18, 20, 21, 81, and 82, we obtain the time derivative of p^(i)_AA, given by [83] Analogously, we obtain the time derivatives of p^(i)_AB and p^(i)_BB, given by [84] and [85] Analogous to Eqs. 37 and 38, a further analysis of Eq. 83 gives Eq. 39. We redefine the function A(R^(s)) as in Eq. 86, where µ = kN, and α and β follow those defined in Eq. 40. We can reduce the system of Eqs. 83-85 to [87] In each time step, only one of the N players has a chance to modify its strategy, whereas all games are likely to update. The evolutionary rate of the game in an interaction (or on an edge) is N/2 times as large as that of the interactants' strategies. Therefore, for a sufficiently large population size N, the fractions of the various games reach a stationary distribution much faster than the fractions of the various strategies. For games between two A-players, the stationary distribution is u^(2). Thus, in an interaction between two A-players, the expected payoff for each A-player is Σ_{i=1}^n u^(2)_i R_i. Analogously, the stationary distribution for games between an A-player and a B-player is u^(1).
In such an interaction, the expected payoff for the A-player is Σ_{i=1}^n u^(1)_i S_i, and that for the B-player is Σ_{i=1}^n u^(1)_i T_i. For games between two B-players, the stationary distribution is u^(0), and the expected payoff for each defector is Σ_{i=1}^n u^(0)_i P_i. The game transitions create a situation "as if" all players play an "effective" game, with payoff structure R- = Σ_i u^(2)_i R_i, S- = Σ_i u^(1)_i S_i, T- = Σ_i u^(1)_i T_i, and P- = Σ_i u^(0)_i P_i, [88] which holds for death-birth, imitation, pairwise-comparison, and birth-death updating. Under death-birth updating, replacing R_i, S_i, T_i, and P_i in Eq. 53 with R-, S-, T-, and P- and then inserting Eqs. 44a-44d, we reduce Eq. 53 to [89] For donation games with R_i = b_i − c, S_i = −c, T_i = b_i, and P_i = 0, Eq. 89 further reduces to Eq. 58 with [90] Similarly, under imitation updating, we can reduce Eq. 53 to [91] A further analysis of donation games leads to Eq. 64 with [92] Under both pairwise-comparison and birth-death updating, A-players are favored over B-players if and only if [93] A simplification for donation games gives Eq. 71 with [94]

In this section, we solve u^(s) = u^(s) P^(s) and use the limiting distribution u^(s) to approximate the evolutionary process. Eqs. 83-85 actually imply this idea. For weak selection (δ ≪ 1), p^(i)_AA, p^(i)_AB, and p^(i)_BB reach their equilibria much faster than p_A (see Eqs. 16 and 83-85). The dynamical system thus converges quickly onto the slow manifold on which their time derivatives vanish. On the right-hand side of Eq. 83, 1/N appears in the third and fourth terms. For a sufficiently large population size N (N ≫ 1), the factor 1/N makes these two terms negligible relative to the first and second terms. This inspires the idea of using u^(2)_i = p^(i)_AA / Σ_{j=1}^n p^(j)_AA, for which we have u^(2) P^(2) = u^(2). The analogous analysis of Eqs. 84 and 85 gives u^(1) P^(1) = u^(1) and u^(0) P^(0) = u^(0).
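The stationary distributions u^(s) and the resulting effective payoffs are straightforward to compute by power iteration. The 2 × 2 transition matrix and the payoff values below are hypothetical.

```python
def stationary(P, iters=2000):
    # power iteration for the left eigenvector u = u P (u a probability vector)
    n = len(P)
    u = [1.0 / n] * n
    for _ in range(iters):
        u = [sum(u[i] * P[i][j] for i in range(n)) for j in range(n)]
    return u

P2 = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical P^(2) for edges between A-players
u2 = stationary(P2)             # balance 0.1*u_1 = 0.2*u_2 gives (2/3, 1/3)
R = [3.0, 6.0]                  # hypothetical payoffs R_1, R_2
R_bar = sum(ui * ri for ui, ri in zip(u2, R))  # effective payoff (cf. Eq. 88)

assert abs(u2[0] - 2 / 3) < 1e-9
assert abs(R_bar - 4.0) < 1e-9
```

Repeating this for P^(1) and P^(0) yields the remaining effective payoffs of the "as if" game described above.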

C. Game transitions in a fraction of interactions.
With global game transitions, the games in all interactions have a chance to update in each time step; with local game transitions, only the games in a fraction of interactions do. Note that with local game transitions, the interactions allowing game transitions are spatially correlated: only games in interactions involving the deceased individual's neighbors are likely to update. In this section, we assume that in each time step a fraction p (0 < p < 1) of games is randomly selected to update. In other words, in each interaction, the game has a chance to update with probability p and no chance to update with probability 1 − p. Equivalently, in each interaction, with probability p the game transitions according to the probability matrix P^(s), and with probability 1 − p the game transitions to itself. Such a situation corresponds to game transitions based on a new probability matrix P~^(s) = (1 − p) I + p P^(s). Note that the solution to u^(s) = u^(s) P~^(s) is the same as that to u^(s) = u^(s) P^(s). The setting in which only a fraction of games transition thus leads to the same results as global transitions do.

D. Stochastic strategies. Up to this point, all players have been assumed to use pure strategies, i.e., a player cooperates (takes A) unconditionally or defects (adopts B) unconditionally in each time step. In the following, we further investigate the case in which players take stochastic strategies, i.e., choose to cooperate with some probability and to defect otherwise. Let s_p and s_q denote two stochastic strategies. Players taking s_p cooperate with probability p and defect with probability 1 − p; players taking s_q cooperate with probability q and defect with probability 1 − q. For two connected players, before either of them has a chance to update its strategy (i.e., s_p or s_q), their actions (i.e., cooperation or defection) and the games they play update many times.
Therefore, in a sufficiently large population, the fractions of the various interaction scenarios (each consisting of the two players' actions and the game they play) reach a stationary distribution much faster than the fractions of the various strategies.
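The timescale-separation argument can be made concrete with a toy computation. The sketch below is our own illustration, not the derivation in the text: it assumes two games, assumes each player re-draws its action independently every round according to its stochastic strategy (probabilities p and q), and uses the two-state pattern in which mutual cooperation leads to game 1 and any other action profile leads to game 2. It builds the Markov chain over interaction scenarios (game, action, action) and computes its stationary distribution:

```python
import numpy as np

def scenario_chain(p, q):
    """Transition matrix over scenarios (game i, action r1, action r2).

    Actions: 1 = cooperate, 0 = defect. Next game is 1 after mutual
    cooperation and 2 otherwise; next actions are drawn afresh.
    """
    states = [(i, r1, r2) for i in (1, 2) for r1 in (0, 1) for r2 in (0, 1)]
    T = np.zeros((len(states), len(states)))
    for a, (i, r1, r2) in enumerate(states):
        nxt = 1 if (r1, r2) == (1, 1) else 2  # deterministic game transition
        for b, (j, s1, s2) in enumerate(states):
            if j == nxt:
                T[a, b] = (p if s1 else 1 - p) * (q if s2 else 1 - q)
    return states, T

states, T = scenario_chain(p=0.8, q=0.3)
w, v = np.linalg.eig(T.T)                 # left eigenvector for eigenvalue 1
u = np.real(v[:, np.argmin(np.abs(w - 1.0))])
u = u / u.sum()
game1_mass = sum(u[k] for k, (i, _, _) in enumerate(states) if i == 1)
# game1_mass is close to p*q = 0.24: in the stationary distribution the
# actions decouple from the game, so game 1 is played exactly as often as
# mutual cooperation occurred in the previous round.
```

This stationary vector plays the role that u^(pq) plays in the analysis below, there computed for the general transition pattern Ω.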
We study the competition between s_p and s_q. By taking p = 1 (pure cooperators) and q = 0 (pure defectors), this model recovers the case of pure strategies. In the following, we calculate the stationary distribution of interaction scenarios between players taking s_p and players taking s_q. Let u^(pq)_{i,r1,r2}(t) denote the probability that at time t a player with s_p chooses action r_1, a player with s_q chooses action r_2, and they play game i, where i ∈ {1, . . . , n} and r_1, r_2 ∈ {0, 1} (0 represents defection and 1 represents cooperation). The evolution of these probabilities is given by Eqs. 95a-95d. For the game transition pattern Ω, there exists a stationary distribution, which we denote by u^(pq) = (u^(pq)_{i,r1,r2}); here u^(pq)_{i,r1,r2} indicates the stationary fraction of interactions in which the two players play game i, the one with strategy s_p chooses action r_1, and the other with strategy s_q chooses action r_2. We rewrite Eqs. 95a-95d in matrix form as Eq. 96. We can obtain the stationary distribution u^(pq) as the left eigenvector associated with the eigenvalue 1. In the interaction between a player with strategy s_p and a player with strategy s_q, the former's expected payoff is given by Eq. 98.

Under death-birth updating, the condition for strategy s_p to be favored over s_q (i.e., ρ_{s_p} > ρ_{s_q}) is Eq. 99. We say that a stochastic strategy is more cooperative if players with that strategy choose cooperation with a larger probability; that is, for p > q, s_p is more cooperative than s_q. For donation games described by Eq. 54, Eq. 99 reduces to Eq. 58 with the coefficients given in Eq. 100. Similarly, under imitation updating in donation games, natural selection favors s_p over s_q if Eq. 64 holds, with the coefficients in Eq. 101. For birth-death or pairwise-comparison updating in donation games, s_p is favored over s_q if Eq. 71 holds, with the coefficients in Eq. 102.

E. Intuition based on the "sigma rule".
Here we provide a few new insights into how game transitions affect the evolution of A-players. In the game between A-players and B-players governed by a single payoff matrix

Tarnita et al. have found that selection favors A-players over B-players if and only if

σR + S > T + σP,

which is termed the "sigma rule" (10). The coefficient σ captures how the spatial model and its associated update rule affect evolutionary dynamics, and it is independent of the payoffs. For an infinite random regular graph under death-birth updating, σ = (k + 1)/(k − 1). When all interactions are governed by a fixed donation game with donation cost c and benefit b_1, substituting R = b_1 − c, S = −c, T = b_1, and P = 0 into the sigma rule gives the condition for A-players to be favored over B-players. Intriguingly, Eq. 58 can be phrased in the form of a sigma rule with R = b_1 − c + (2/(k + 1)) Σ_{i=2}^n ξ_i Δb_{1i}, S = −c, T = b_1, and P = 0. With game transitions, evolution thus proceeds "as if" all interactions are governed by an effective game with this payoff matrix. Compared with the fixed donation game, mutual cooperation brings each player an extra payoff (2/(k + 1)) Σ_{i=2}^n ξ_i Δb_{1i} in this effective game. This extra payoff depends on two factors: the game transition pattern (described by ξ_i) and the variations in different games (described by Δb_{1i}).
For an infinite random regular graph under pairwise-comparison updating, σ = 1. Analogously, Eq. 71 can be phrased in the form of a sigma rule with R = b_1 − c + 2 Σ_{i=2}^n ξ_i Δb_{1i}, S = −c, T = b_1, and P = 0. With game transitions, evolution proceeds "as if" all interactions are governed by an effective game with the payoff structure given in Eq. 106.
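The sigma-rule intuition can be checked numerically. The sketch below is our own illustration: `extra` is a stand-in for the extra mutual-cooperation payoff (2/(k + 1)) Σ_{i=2}^n ξ_i Δb_{1i}, with arbitrary illustrative values, and the condition σR + S > T + σP is evaluated for the effective donation game under death-birth updating:

```python
def favors_cooperation(b1, c, k, extra=0.0):
    """Sigma rule sigma*R + S > T + sigma*P for the effective donation game
    under death-birth updating on an infinite random regular graph of degree k.
    `extra` is a hypothetical stand-in for the game-transition bonus term."""
    sigma = (k + 1) / (k - 1)
    R = b1 - c + extra  # effective reward for mutual cooperation
    S, T, P = -c, b1, 0.0
    return sigma * R + S > T + sigma * P

k = 4
# Without game transitions (extra = 0) the condition reduces to b1/c > k:
print(favors_cooperation(4.1, 1.0, k))             # True  (4.1 > k = 4)
print(favors_cooperation(3.9, 1.0, k))             # False (3.9 < k = 4)
# An extra mutual-cooperation payoff lowers the threshold below k:
print(favors_cooperation(3.0, 1.0, k, extra=1.0))  # True
```

Setting extra = 0 recovers the classical b/c > k rule; any positive bonus from game transitions strictly lowers the critical ratio.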

Representative examples
In Sections 1 and 3, we derive the general condition for one strategy to be favored over the other, which requires solving a set of equations. In this section, we study four representative interaction scenarios and provide explicit expressions.
A. Evolutionary dynamics with state-independent game transitions.
If the game to be played in the next time step is independent of the game played in the past, the game transition is state-independent; the defining condition is given in Eq. 107. For global game transitions, focusing on the game transition pattern Ω introduced in Section 3, we have Eq. 108. In particular, if p^(s)_m = 1/n for all m and s, the game transition is fully stochastic: in the next time step, every game occurs with equal probability. For both local and global game transitions, weak selection favors A over B if Eq. 109 holds. We find that the evolutionary process with stochastic and diverse games can be approximated by that of a static and unified game.
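State independence means every row of P^(s) equals the same probability vector, so that vector is itself the stationary distribution. A minimal numerical check (our own, with hypothetical next-game probabilities):

```python
import numpy as np

p_next = np.array([0.5, 0.3, 0.2])      # hypothetical next-game probabilities
P = np.tile(p_next, (3, 1))             # state-independent: identical rows
print(np.allclose(p_next @ P, p_next))  # True: p_next is stationary

# Fully stochastic case: p_m^(s) = 1/n for all m and s.
n = 3
P_uniform = np.full((n, n), 1.0 / n)
u = np.full(n, 1.0 / n)
print(np.allclose(u @ P_uniform, u))    # True: uniform distribution is stationary
```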

B. Evolutionary dynamics with strategy-independent game transitions.
If the game to be played in the next time step is independent of players' strategic actions in the past, the game transition is strategy-independent. That is, P^(2) = P^(1) = P^(0); let P^(2) = P^(1) = P^(0) = P. Here, we consider the game transition pattern Ω introduced in Section 3, which yields Eq. 110. Note that under pairwise-comparison or birth-death updating, ξ_i = 0 means that cooperation can never evolve, regardless of the benefit provided by a cooperative behavior in the donation game. In other words, if the game played is independent of strategic actions, game transitions cannot promote cooperation.
C. Evolutionary dynamics with game transitions between two states (n = 2).
Given the theoretical significance of the two-state case, we provide a systematic investigation of game transitions between two donation games. According to Eq. 58, under death-birth updating, the general rule for cooperation to be favored over defection is given by Eq. 111. For local game transitions, we first focus on game transition patterns under which the evolutionary outcome is insensitive to the initial condition and obtain the corresponding explicit conditions. We next consider a game transition pattern under which the evolutionary outcome relies on the initial condition. The game transition matrices are shown in Eq. 74, and the matrix M in Eq. 73 is given by Eq. 118. Switching a few row entries and column entries, we obtain the modified matrix M̃. Depending on the variations in different games, probabilistic game transitions can strengthen the promotive effects of game transitions on the evolution of cooperation in some cases, whereas they weaken them in others. This conclusion holds under other updating rules such as imitation and pairwise-comparison updating.
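For the two-state pattern of the main text (mutual cooperation leads to game 1, every other action profile leads to game 2), the stationary scenario distributions are immediate; this sketch (our own numerical check) verifies them with a generic left-eigenvector computation:

```python
import numpy as np

def stationary(P):
    """Stationary distribution of a row-stochastic matrix (left eigenvector)."""
    w, v = np.linalg.eig(P.T)
    u = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return u / u.sum()

# Deterministic transitions between game 1 and game 2:
# a CC pair always moves to game 1; CD/DC/DD pairs always move to game 2.
P_CC    = np.array([[1.0, 0.0], [1.0, 0.0]])
P_other = np.array([[0.0, 1.0], [0.0, 1.0]])
print(stationary(P_CC))     # [1. 0.]: two cooperators end up in game 1
print(stationary(P_other))  # [0. 1.]: all other pairs end up in game 2
```

This is why, in the two-state example, mutual cooperators keep enjoying the higher-benefit game while all other pairs are locked into the lower-benefit one.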
For global game transitions, the related parameters are ξ_2 = (k − 1)/2 and ξ_3 = 0. In this case, probabilistic transitions do not alter the effects of game transitions on the evolution of cooperation.

[Fig. S1 caption] We study the transition between two donation games: a cooperator pays a cost c to bring its opponent a benefit b1 in game 1 or b2 in game 2; defectors forgo this donation. b1 is larger than b2. Mutual cooperation leads to game 1 and other action profiles lead to game 2. We examine death-birth (A), imitation (B), pairwise-comparison (C), and birth-death (D) updating on random regular graphs. The crossing points of the dots with the horizontal lines mark the critical benefit-to-cost ratios for cooperation to be favored over defection, i.e., ρ_C > ρ_D, obtained by numerical simulations. The vertical lines give the analytical critical benefit-to-cost ratios. Under death-birth and imitation updating, game transitions reduce the critical benefit-to-cost ratio for ρ_C > ρ_D. Under pairwise-comparison and birth-death updating, game transitions make ρ_C > ρ_D possible at all. We take N = 500, k = 4, δ = 0.01, c = 1. Other parameters: b2 = b1 − 1 for death-birth and imitation updating, b2 = 4 for pairwise-comparison and birth-death updating. Each simulation runs until the population reaches fixation, and each point is averaged over 10^6 runs.

[Figure caption] We study the transition between two donation games: a cooperator pays a cost c to bring its opponent a benefit b1 in game 1 or b2 in game 2; defectors pay no costs and provide no benefits. b1 is larger than b2. Mutual cooperation allows for game 1 and other action profiles lead to game 2. We examine death-birth (A), imitation (B), pairwise-comparison (C), and birth-death (D) updating on random graphs (11) and scale-free networks (12, 13). The crossing points of the dots with the horizontal lines mark the critical benefit-to-cost ratios for cooperation to be favored over defection by numerical simulations. The vertical lines give the analytical critical benefit-to-cost ratios based on random regular graphs. The average degree of the random regular graph and the scale-free networks is 4. Other parameters are the same as those in Fig. S1. Game transitions reduce the critical benefit-to-cost ratio for the success of cooperators (ρ_C > ρ_D).

[Figure caption] We study the transition between two donation games: a cooperator pays a cost c to bring its opponent a benefit b1 in game 1 or b2 in game 2; defectors pay no costs and provide no benefits. b1 is larger than b2. Mutual cooperation allows for game 1 and other action profiles lead to game 2. We investigate death-birth updating with mutation on random regular graphs. With probability 1 − µ, the empty site is occupied by the neighbor's offspring; with probability µ, the empty site is occupied by a cooperator or a defector with equal probability. Here, the frequency of cooperative strategies f_C is used to measure the success of cooperators; cooperation is favored over defection if f_C > 1/2. We obtain each data point by averaging f_C over 100 independent runs. For each run, f_C is obtained by averaging the frequency of cooperative strategies over the last 2 × 10^7 time steps. We take N = 500, k = 4, b2 = b1 − 1, δ = 0.01, and c = 1. Other parameters: µ = 0.05 (A), µ = 0.1 (B), and µ = 0.4 (C). The crossing points of the dots with the horizontal lines mark the critical benefit-to-cost ratios for cooperation to be favored over defection (f_C > 1/2). Game transitions reduce the critical benefit-to-cost ratio for the success of cooperators.

[Figure caption] We study the transition between two donation games: a cooperator pays a cost c to bring its opponent a benefit b1 in game 1 or b2 in game 2; defectors pay no costs and provide no benefits. b1 is larger than b2. Mutual cooperation allows for game 1 and other action profiles lead to game 2. We examine death-birth (A), imitation (B), and pairwise-comparison (C) updating on random regular graphs, random graphs, and scale-free networks. The crossing points of the dots with the horizontal lines mark the critical benefit-to-cost ratios for cooperation to be favored over defection by numerical simulations. The vertical lines give the analytical critical benefit-to-cost ratios. Game transitions reduce the critical benefit-to-cost ratio for the success of cooperators (ρ_C > ρ_D). All parameters are the same as those in Fig. S1.

[Figure caption] We study the transition between two donation games: a cooperator pays a cost c to bring its opponent a benefit b1 in game 1 or b2 in game 2; defectors forgo the helping behavior. b1 is larger than b2. Mutual cooperation allows for game 1 and other action profiles lead to game 2. We examine death-birth (A) and pairwise-comparison (B) updating on random regular graphs. Under death-birth updating, a small difference ∆b = b1 − b2 between the two benefits greatly reduces the critical benefit-to-cost ratio (A). Under pairwise-comparison updating, the difference b1 − b2 between the games, rather than the individual values of b1 and b2, determines the success of cooperators (B). Apart from b1 and b2, all other parameters follow Fig. S1.