# An oscillating tragedy of the commons in replicator dynamics with game-environment feedback


Edited by Alan Hastings, University of California, Davis, CA, and approved September 27, 2016 (received for review March 11, 2016)

## Significance

Classical game theory addresses how individuals make decisions given suitable incentives, for example, whether to use a commons rapaciously or with restraint. However, classical game theory does not typically address the consequences of individual actions that reshape the environment over the long term. Here, we propose a unified approach to analyze and understand the coupled evolution of strategies and the environment. We revisit the originating tragedy of the commons example and evaluate how overuse of a commons resource changes incentives for future action. In doing so, we identify an oscillatory tragedy of the commons in which the system cycles between deplete and replete environments and cooperation and defection behavior, highlighting new challenges for control and influence of feedback-evolving games.

## Abstract

A tragedy of the commons occurs when individuals take actions to maximize their payoffs even as their combined payoff is less than the global maximum had the players coordinated. The originating example is that of overgrazing of common pasture lands. In game-theoretic treatments of this example, there is rarely consideration of how individual behavior subsequently modifies the commons and associated payoffs. Here, we generalize evolutionary game theory by proposing a class of replicator dynamics with feedback-evolving games in which environment-dependent payoffs and strategies coevolve. We initially apply our formulation to a system in which the payoffs favor unilateral defection and cooperation, given replete and depleted environments, respectively. Using this approach, we identify and characterize a class of dynamics: an oscillatory tragedy of the commons in which the system cycles between deplete and replete environmental states and cooperation and defection behavior states. We generalize the approach to consider outcomes given all possible rational choices of individual behavior in the depleted state when defection is favored in the replete state. In so doing, we find that incentivizing cooperation when others defect in the depleted state is necessary to avert the tragedy of the commons. In closing, we propose directions for the study of control and influence in games in which individual actions exert a substantive effect on the environmental state.

Game theory is based on the principle that individuals make rational decisions regarding their choice of actions given suitable incentives (1, 2). In practice, the incentives are represented as strategy-dependent payoffs. Evolutionary game theory extends game-theoretic principles to model dynamic changes in the frequency of strategists (3). Replicator dynamics is one commonly used framework for such models. In replicator dynamics, the frequencies of strategies change as a function of the social makeup of the community (4–6). For example, in a snowdrift game (also known as a hawk–dove game), individuals defect when cooperators are common but cooperate when cooperators are rare (2). As a result, cooperation is predicted to be maintained among a fraction of the community (4, 6). In contrast, in the prisoner’s dilemma (PD), individuals are incentivized to defect irrespective of the fraction of cooperators. This leads to domination by defectors (6, 7).

Here, we are interested in a different kind of evolutionary game in which individual action modifies both the social makeup and environmental context for subsequent actions. Strategy-dependent feedback occurs across scales from microbes to humans in public good games and in commons’ dilemmas (8–11). Among microbes, feedback may arise due to fixation of inorganic nutrients given depleted organic nutrient availability (12, 13), the production of extracellular nutrient-scavenging enzymes like siderophores (14–16) or enzymes like invertase that hydrolyze diffusible products (17), and the release of extracellular antibiotic compounds (18). The incentive for public good production changes as the production influences the environmental state. Such joint influence occurs in human systems, for example, when individuals decide to vaccinate or not (19–21). Decisions not to vaccinate have been linked most recently to outbreaks of otherwise-preventable childhood infectious diseases in northern California (22). These outbreaks modify the subsequent incentives for vaccination. Such coupled feedback also arises in public goods dilemmas involving water or other resource use (23). In a period of replete resources, there is less incentive for restraint (24). However, overuse in times of replete resource availability can lead to depletion of the resource and changes in incentives.

In this manuscript, we propose a unified approach to analyze and understand feedback-evolving games (Fig. 1). We term this approach “coevolutionary game theory,” denoting the coupled evolution of strategies and environment. The key conceptual innovation is to extend replicator dynamics (4) to include dynamical changes in the environment. In that sense, our approach is complementary to recent efforts to consider the evolvability of payoffs in a fixed environment (25). Here, changes in the environment modulate the payoffs. In so doing, we are able to address problems in which individual behavior constitutes a nonnegligible component of the system. As a case study, we revisit the originating tragedy of the commons example (24) and ask: what happens if overexploitation of a resource changes incentives for future action? As we show, the cumulative feedback of decisions can subsequently alter incentives leading to new dynamical phenomena and new challenges for control.

## Methods and Results

### The Context: Evolutionary Game Theory as Modeled via Replicator Dynamics.

Here, we introduce evolutionary game theory in the context of the PD as a means to motivate our coevolutionary game theory formalism. Consider a symmetric two player game with strategies *C* and *D*, denoting cooperation and defection, respectively. A standard instance of the payoff matrix is the PD in which the payoffs can be written as follows:

$$
A = \begin{bmatrix} 3 & 0 \\ 5 & 1 \end{bmatrix}. \tag{1}
$$

In this game, player *C* receives a payoff of 3 and 0 when playing against players *C* and *D*, respectively. Similarly, player *D* receives a payoff of 5 and 1 when playing against players *C* and *D*, respectively. These payoffs are commonly referred to as the reward for cooperation, $R = 3$; the sucker's payoff, $S = 0$; the temptation to cheat, $T = 5$; and the punishment for cheating, $P = 1$.

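In evolutionary game theory, such payoffs can be coupled to the changes in population or strategy frequencies, $x_1$ and $x_2$, the fractions of the population playing *C* and *D*, respectively. The coupling takes the form of replicator dynamics,

$$
\dot{x}_1 = x_1\left[r_1(\mathbf{x}, A) - \langle r \rangle(\mathbf{x}, A)\right], \qquad
\dot{x}_2 = x_2\left[r_2(\mathbf{x}, A) - \langle r \rangle(\mathbf{x}, A)\right],
$$

where $r_1$ and $r_2$ denote the fitnesses of *C* and *D* players,

$$
r_1(\mathbf{x}, A) = (A\mathbf{x})_1 = 3x_1, \qquad
r_2(\mathbf{x}, A) = (A\mathbf{x})_2 = 5x_1 + x_2,
$$

and the average fitness is as follows:

$$
\langle r \rangle(\mathbf{x}, A) = x_1 r_1(\mathbf{x}, A) + x_2 r_2(\mathbf{x}, A).
$$

Because $x_1 + x_2 = 1$, it suffices to track a single variable, $x \equiv x_1$:

$$
\dot{x} = x(1-x)\left[r_1(x, A) - r_2(x, A)\right] = -x(1-x)(1+x). \tag{7}
$$

The replicator dynamics for the PD in Eq. 7 has three fixed points, but only two in the domain $[0,1]$: $x^* = 0$ and $x^* = 1$. The cubic is negative for $0 < x < 1$, so $x^* = 0$ is stable and $x^* = 1$ is unstable. Hence, a population with a minority of *D* players will, over time, change to one with a minority of *C* players, and the elimination of *C* players altogether.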
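The convergence to all defectors can be checked numerically. The following is a minimal sketch, assuming the PD payoffs stated above ($R=3$, $S=0$, $T=5$, $P=1$); the function names, the forward-Euler scheme, the step size, and the initial condition are illustrative choices, not part of the original analysis.

```python
def pd_fitness_gap(x):
    """Return r_C - r_D for the PD payoffs R=3, S=0, T=5, P=1."""
    r_c = 3.0 * x + 0.0 * (1.0 - x)  # C earns 3 against C and 0 against D
    r_d = 5.0 * x + 1.0 * (1.0 - x)  # D earns 5 against C and 1 against D
    return r_c - r_d


def pd_replicator_rhs(x):
    """Right-hand side dx/dt = x (1 - x) (r_C - r_D)."""
    return x * (1.0 - x) * pd_fitness_gap(x)


def integrate_euler(x0, dt=0.01, steps=5000):
    """Forward-Euler integration; dt and steps are illustrative values."""
    x = x0
    for _ in range(steps):
        x += dt * pd_replicator_rhs(x)
    return x


# Start with 90% cooperators and integrate to t = 50.
final_x = integrate_euler(0.9)
print(f"cooperator fraction at t = 50: {final_x:.3e}")
```

Even with cooperators initially in the majority, the cooperator fraction decays toward zero, as expected from the sign of the cubic.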

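In general, replicator dynamics for symmetric two-player games with a fixed payoff matrix can be written as follows:

$$
\dot{x} = x(1-x)\left[r_1(x, A) - r_2(x, A)\right],
$$

where the convention is again that $x \equiv x_1$ and $x_2 = 1 - x$. For a payoff matrix

$$
A = \begin{bmatrix} R & S \\ T & P \end{bmatrix},
$$

the dynamics become

$$
\dot{x} = x(1-x)\left[(R - T)\,x + (S - P)(1 - x)\right].
$$

Again, the stability can be identified from the sign of the cubic. For example, when $R > T$ and $S > P$, the cubic is positive for $0 < x < 1$, so $x^* = 0$ is unstable and $x^* = 1$ is stable. In that case, a population with a minority of *C* players will, over time, change to one with a majority of *C* players, and eventually the elimination of *D* players.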
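The general two-strategy case can be sketched in a few lines. The code below assumes a payoff matrix with rows $(R, S)$ for *C* and $(T, P)$ for *D*, and uses $x$ for the cooperator fraction. The helper names and the snowdrift payoffs are hypothetical values chosen only to contrast with the PD.

```python
def fitness_gap(x, payoff):
    """r_C - r_D for a payoff matrix ((R, S), (T, P)) at cooperator fraction x."""
    (R, S), (T, P) = payoff
    return (R - T) * x + (S - P) * (1.0 - x)


def replicator_rhs(x, payoff):
    """dx/dt = x (1 - x) (r_C - r_D)."""
    return x * (1.0 - x) * fitness_gap(x, payoff)


def interior_rest_point(payoff):
    """Mixed rest point in (0, 1) where r_C = r_D, or None if there is none."""
    (R, S), (T, P) = payoff
    denom = (S - P) - (R - T)
    if denom == 0:
        return None
    x_star = (S - P) / denom
    return x_star if 0.0 < x_star < 1.0 else None


PD = ((3.0, 0.0), (5.0, 1.0))          # payoffs from the text
SNOWDRIFT = ((3.0, 1.0), (5.0, 0.0))   # hypothetical: T > R > S > P

print(interior_rest_point(PD))         # no mixed rest point in (0, 1)
print(interior_rest_point(SNOWDRIFT))  # cooperators and defectors coexist
```

For the hypothetical snowdrift payoffs, the fitness gap changes sign at the mixed rest point, which is why cooperation is maintained among a fraction of the population, whereas the PD has no such point.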

### A Model of Replicator Dynamics with Feedback-Evolving Games.

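We consider a modified version of the standard replicator dynamics in which the payoff matrix depends on an environmental state, *n*, that is itself modified by the strategists:

$$
\epsilon\,\dot{x} = x(1-x)\left[r_1(x, A(n)) - r_2(x, A(n))\right], \qquad
\dot{n} = n(1-n)\,f(x), \tag{11}
$$

where $f(x)$ denotes the feedback of strategists on the environment. The factor $n(1-n)$ in Eq. **11** ensures that the environmental state is confined to the domain $[0,1]$, and *ε* denotes the relative speed by which individual actions modify the environmental state. What distinguishes the model is that the payoff matrix $A(n)$ depends on the environmental state *n*. The environmental state changes as a result of the actions of strategists, such that the sign of $f(x)$ determines whether *n* will decrease or increase, corresponding to environmental degradation or enhancement when $f(x) < 0$ or $f(x) > 0$, respectively. The rate of environmental dynamics is set by *ε*, such that when $0 < \epsilon \ll 1$, the environment changes slowly relative to the strategy frequencies.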

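Initially, we evaluate this class of feedback-evolving games via the use of the following environment-dependent payoff matrix:

$$
A(n) = (1-n)\begin{bmatrix} T & P \\ R & S \end{bmatrix} + n\begin{bmatrix} R & S \\ T & P \end{bmatrix}.
$$

We retain the PD payoff ordering of the prior section, $T > R$ and $P > S$, so that mutual defection is the Nash equilibrium of the replete-state game, $A(1)$.

The payoff matrix $A(n)$ is a linear combination of a depleted-state game, $A(0)$, in which cooperation is the dominant strategy, and a replete-state game, $A(1)$, in which defection is the dominant strategy. The environmental feedback is

$$
f(x) = \theta x - (1 - x),
$$

in which $\theta > 0$ is the ratio of the rate at which cooperators enhance the environment to the rate at which defectors degrade it.

The payoff-dependent fitnesses are as follows:

$$
r_1(x, A(n)) = \left[(1-n)T + nR\right]x + \left[(1-n)P + nS\right](1-x),
$$

$$
r_2(x, A(n)) = \left[(1-n)R + nT\right]x + \left[(1-n)S + nP\right](1-x),
$$

given $x_1 = x$ and $x_2 = 1 - x$. Substituting into Eq. **11** and writing $\delta_{TR} = T - R$ and $\delta_{PS} = P - S$ yields

$$
\epsilon\,\dot{x} = x(1-x)\left[\delta_{PS} + (\delta_{TR} - \delta_{PS})\,x\right](1 - 2n), \qquad
\dot{n} = n(1-n)\left[-1 + (1+\theta)\,x\right].
$$

There are five fixed points of this model of replicator dynamics with feedback-evolving games. Of these, four represent “boundary” fixed points, that is, (*i*) $(x^*, n^*) = (0, 0)$, defectors in a depleted environment; (*ii*) $(0, 1)$, defectors in a replete environment; (*iii*) $(1, 0)$, cooperators in a depleted environment; and (*iv*) $(1, 1)$, cooperators in a replete environment. The fifth is an interior fixed point, $(x^*, n^*) = \left(1/(1+\theta),\, 1/2\right)$. In *SI Appendix A*, we prove that all of the boundary fixed points are unstable and the interior fixed point is neutrally stable. Furthermore, we show that the system has a constant of motion. As a consequence, the global dynamics correspond to closed periodic orbits in the interior of the domain for any initial condition in which $0 < x < 1$ and $0 < n < 1$, other than the interior fixed point itself.

The orientation of orbits in the phase plane defined by *x* (horizontal axis) and *n* (vertical axis) is counterclockwise (Fig. 2). Cooperators increase while the environment is depleted ($n < 1/2$), the environment improves once cooperators exceed $x^* = 1/(1+\theta)$, defectors increase while the environment is replete ($n > 1/2$), and the environment degrades once cooperators fall below $x^*$.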
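The closed orbits can be illustrated numerically. The sketch below is written under stated assumptions rather than taken from the authors: it assumes the symmetric model $\epsilon\dot{x} = x(1-x)[\delta_{PS} + (\delta_{TR}-\delta_{PS})x](1-2n)$, $\dot{n} = n(1-n)[-1+(1+\theta)x]$ with the PD payoffs $T=5$, $R=3$, $P=1$, $S=0$, and illustrative values $\theta = 2$ and $\epsilon = 0.1$. For these payoffs, separating variables gives a conserved quantity $H(x,n)$, derived here for this sketch, and a fixed-step Runge-Kutta integration should keep $H$ nearly constant.

```python
import math

T, R, P, S = 5.0, 3.0, 1.0, 0.0   # PD payoffs from the text
THETA, EPS = 2.0, 0.1             # illustrative feedback strength and speed


def rhs(x, n):
    """Assumed symmetric feedback-evolving game (see lead-in)."""
    gap = (1.0 - 2.0 * n) * ((T - R) * x + (P - S) * (1.0 - x))
    return x * (1.0 - x) * gap / EPS, n * (1.0 - n) * (THETA * x - (1.0 - x))


def rk4_step(x, n, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1n = rhs(x, n)
    k2x, k2n = rhs(x + 0.5 * dt * k1x, n + 0.5 * dt * k1n)
    k3x, k3n = rhs(x + 0.5 * dt * k2x, n + 0.5 * dt * k2n)
    k4x, k4n = rhs(x + dt * k3x, n + dt * k3n)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            n + dt * (k1n + 2 * k2n + 2 * k3n + k4n) / 6.0)


def invariant(x, n):
    """Conserved quantity for these payoffs, where the x-bracket equals 1 + x."""
    return (math.log(n * (1.0 - n))
            + EPS * (math.log(x) + 0.5 * THETA * math.log(1.0 - x)
                     - (1.0 + 0.5 * THETA) * math.log(1.0 + x)))


x, n = 0.5, 0.3
h0 = invariant(x, n)
drift, xs, ns = 0.0, [x], [n]
for _ in range(20000):            # integrate to t = 20 with dt = 0.001
    x, n = rk4_step(x, n, 0.001)
    drift = max(drift, abs(invariant(x, n) - h0))
    xs.append(x)
    ns.append(n)

print(f"max drift of H: {drift:.2e}")
print(f"x range: [{min(xs):.3f}, {max(xs):.3f}], n range: [{min(ns):.3f}, {max(ns):.3f}]")
```

The trajectory repeatedly passes on both sides of the interior fixed point $(1/3, 1/2)$ while $H$ stays constant to within integration error, consistent with neutrally stable closed orbits.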

### Generalized Conditions for an Oscillating Tragedy of the Commons.

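Here, we generalize our analysis by considering a model of replicator dynamics with feedback-evolving games with asymmetric payoffs:

$$
\epsilon\,\dot{x} = x(1-x)\left[r_1(x, A(n)) - r_2(x, A(n))\right], \qquad
\dot{n} = n(1-n)\left[-1 + (1+\theta)\,x\right], \tag{17}
$$

where $\theta > 0$ is the ratio of the rate at which cooperators enhance the environment to the rate at which defectors degrade it, and

$$
A(n) = (1-n)\begin{bmatrix} R_0 & S_0 \\ T_0 & P_0 \end{bmatrix} + n\begin{bmatrix} R_1 & S_1 \\ T_1 & P_1 \end{bmatrix}.
$$

We assume $R_1 < T_1$ and $S_1 < P_1$, so that defection is dominant in the replete state, and $R_0 > T_0$ and $S_0 > P_0$, so that cooperation is dominant in the depleted state.

As before, this system has five fixed points, of which four correspond to unstable fixed points on the boundary. In *SI Appendix B*, we prove that the system has an unstable interior fixed point when

$$
\frac{\delta_{PS_1}}{\delta_{TR_1}} > \frac{\delta_{SP_0}}{\delta_{RT_0}}, \tag{19}
$$

where $\delta_{PS_1} = P_1 - S_1$, $\delta_{TR_1} = T_1 - R_1$, $\delta_{SP_0} = S_0 - P_0$, and $\delta_{RT_0} = R_0 - T_0$ are all positive.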
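The link between the payoff condition and the stability of the interior fixed point can be sketched directly. The code assumes the generalized model with $A(n) = (1-n)A_0 + nA_1$, rows $(R, S)$ and $(T, P)$, and environmental feedback $-1 + (1+\theta)x$. The condition tested is $\delta_{PS_1}/\delta_{TR_1} > \delta_{SP_0}/\delta_{RT_0}$, which follows from the sign of the Jacobian trace at the interior fixed point under these assumptions. All payoff values and function names are hypothetical.

```python
def payoff_gaps(A0, A1):
    """Return (d_RT0, d_SP0, d_TR1, d_PS1); all positive under the assumptions."""
    (R0, S0), (T0, P0) = A0
    (R1, S1), (T1, P1) = A1
    return R0 - T0, S0 - P0, T1 - R1, P1 - S1


def interior_fixed_point(A0, A1, theta):
    """Interior rest point: x* = 1/(1+theta), n* where r_C = r_D."""
    d_rt0, d_sp0, d_tr1, d_ps1 = payoff_gaps(A0, A1)
    x = 1.0 / (1.0 + theta)
    g0 = d_sp0 + (d_rt0 - d_sp0) * x   # advantage of C at n = 0
    g1 = d_ps1 + (d_tr1 - d_ps1) * x   # advantage of D at n = 1
    return x, g0 / (g0 + g1)


def gap_slope(A0, A1, theta):
    """d(r_C - r_D)/dx at the interior point; its sign is the trace sign."""
    d_rt0, d_sp0, d_tr1, d_ps1 = payoff_gaps(A0, A1)
    _, n = interior_fixed_point(A0, A1, theta)
    return (1.0 - n) * (d_rt0 - d_sp0) - n * (d_tr1 - d_ps1)


def satisfies_condition(A0, A1):
    """True when d_PS1 / d_TR1 > d_SP0 / d_RT0."""
    d_rt0, d_sp0, d_tr1, d_ps1 = payoff_gaps(A0, A1)
    return d_ps1 / d_tr1 > d_sp0 / d_rt0


# Two hypothetical payoff sets, one on each side of the condition.
A0_CYCLE, A1_CYCLE = ((3.5, 1.0), (2.0, 0.75)), ((4.0, 1.0), (4.5, 1.25))
A0_DAMPED, A1_DAMPED = ((3.0, 1.5), (2.5, 0.5)), ((3.0, 0.0), (5.0, 1.0))

for name, A0, A1 in [("cycle", A0_CYCLE, A1_CYCLE), ("damped", A0_DAMPED, A1_DAMPED)]:
    print(name, satisfies_condition(A0, A1), interior_fixed_point(A0, A1, 2.0),
          gap_slope(A0, A1, 2.0))
```

In this sketch the interior point is unstable exactly when the slope is positive, and the sign of the slope agrees with the payoff condition for both example payoff sets.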

Here, the qualitative outcomes depend on both the sign and magnitude of differences between payoffs, rather than the signs alone. Numerical simulations exhibit oscillations when Eq. **19** is satisfied (see Fig. 3, *Top*, for an example). The numerical simulations also indicate that the oscillations grow in magnitude, possibly approaching the boundary. This observation raises a question: do the asymptotic dynamics converge to a limit cycle in the interior or to a heteroclinic cycle on the boundary?
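The growth or decay of oscillations near the interior fixed point can be reproduced with a short simulation. This is a minimal sketch, assuming the generalized model with $A(n) = (1-n)A_0 + nA_1$ and feedback $\theta x - (1-x)$. The payoff values, $\theta = 2$, $\epsilon = 0.1$, the initial offset, and the integration horizon are hypothetical choices. The horizon is kept short on purpose, because fixed-step integration in floating point loses resolution once trajectories approach the boundary.

```python
import math


def make_rhs(A0, A1, theta, eps):
    """Build the right-hand side of the assumed generalized model."""
    (R0, S0), (T0, P0) = A0
    (R1, S1), (T1, P1) = A1

    def rhs(x, n):
        gap = (((1 - n) * (R0 - T0) + n * (R1 - T1)) * x
               + ((1 - n) * (S0 - P0) + n * (S1 - P1)) * (1 - x))
        return x * (1 - x) * gap / eps, n * (1 - n) * (theta * x - (1 - x))

    return rhs


def rk4(rhs, x, n, dt, steps):
    """Fixed-step fourth-order Runge-Kutta integration."""
    for _ in range(steps):
        k1x, k1n = rhs(x, n)
        k2x, k2n = rhs(x + 0.5 * dt * k1x, n + 0.5 * dt * k1n)
        k3x, k3n = rhs(x + 0.5 * dt * k2x, n + 0.5 * dt * k2n)
        k4x, k4n = rhs(x + dt * k3x, n + dt * k3n)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        n += dt * (k1n + 2 * k2n + 2 * k3n + k4n) / 6.0
    return x, n


def growth_ratio(A0, A1, theta=2.0, eps=0.1, offset=0.02, t_end=6.0, dt=0.001):
    """Distance from the interior fixed point at t_end divided by the initial offset."""
    (R0, S0), (T0, P0) = A0
    (R1, S1), (T1, P1) = A1
    x_star = 1.0 / (1.0 + theta)
    g0 = (S0 - P0) + ((R0 - T0) - (S0 - P0)) * x_star
    g1 = (P1 - S1) + ((T1 - R1) - (P1 - S1)) * x_star
    n_star = g0 / (g0 + g1)
    rhs = make_rhs(A0, A1, theta, eps)
    x, n = rk4(rhs, x_star + offset, n_star, dt, int(round(t_end / dt)))
    return math.hypot(x - x_star, n - n_star) / offset


ratio_growing = growth_ratio(((3.5, 1.0), (2.0, 0.75)), ((4.0, 1.0), (4.5, 1.25)))
ratio_decaying = growth_ratio(((3.0, 1.5), (2.5, 0.5)), ((3.0, 0.0), (5.0, 1.0)))
print(f"growing case: {ratio_growing:.2f}, decaying case: {ratio_decaying:.4f}")
```

The first payoff set satisfies the inequality on the payoff differences and the perturbation grows, whereas the second violates it and the perturbation shrinks.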

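In *SI Appendix C*, we prove that the oscillations converge to an asymptotically stable “heteroclinic cycle” and not to a limit cycle. We do so by leveraging conditions for the emergence of heteroclinic cycles in replicator dynamics (26, 27). The heteroclinic cycle includes the four fixed points on the boundary in this order, $(x, n) = (0,0) \to (1,0) \to (1,1) \to (0,1) \to (0,0)$: cooperators invade a depleted environment, the environment is restored, defectors invade the replete environment, and the environment is depleted again.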

The finding of an asymptotically stable heteroclinic cycle implies that dynamics initialized at an interior point and not located at a fixed point will approach the boundary. Over time, the dynamics will spend an ever-increasing amount of time near a fixed point before “hopping” to the subsequent fixed point in the cycle. Near these fixed points, the nonlinear dynamics are governed by linearized dynamics and two characteristic eigenvalues, one associated with an attractive “pull” toward the fixed point and one associated with a repelling “push” away from the fixed point. Hence, Eq. **19** can be interpreted as denoting the relative strength of the product of the stable (pull) vs. unstable (push) eigenvalues around the cycle (see *SI Appendix C* for details). A stable heteroclinic cycle emerges when the pull toward the fixed points is stronger than the push away from the fixed points in a cycle. We also find that dynamics converge to an interior fixed point when Eq. **19** is not satisfied (see Fig. 3, *Bottom*, for an example). In this event, the tragedy of the commons is averted and the system does not inevitably reach the depleted state.
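The push and pull comparison can be made concrete by linearizing at the four corners. The sketch below assumes the generalized model $\epsilon\dot{x} = x(1-x)[r_1 - r_2]$, $\dot{n} = n(1-n)[\theta x - (1-x)]$ with $A(n) = (1-n)A_0 + nA_1$. The corner eigenvalues listed in the code follow from that assumed form, and the payoff values are hypothetical.

```python
def corner_eigenvalues(A0, A1, theta, eps):
    """(stable, unstable) eigenvalue at each corner (x, n) of the assumed model."""
    (R0, S0), (T0, P0) = A0
    (R1, S1), (T1, P1) = A1
    return {
        (0, 0): (-1.0, (S0 - P0) / eps),    # n decays; cooperators invade
        (1, 0): (-(R0 - T0) / eps, theta),  # defectors excluded; n grows
        (1, 1): (-theta, (T1 - R1) / eps),  # n saturates; defectors invade
        (0, 1): (-(P1 - S1) / eps, 1.0),    # cooperators excluded; n declines
    }


def pull_over_push(A0, A1, theta, eps):
    """Product of |stable| eigenvalues divided by product of unstable ones."""
    ratio = 1.0
    for stable, unstable in corner_eigenvalues(A0, A1, theta, eps).values():
        ratio *= -stable / unstable
    return ratio


A0, A1 = ((3.5, 1.0), (2.0, 0.75)), ((4.0, 1.0), (4.5, 1.25))
for eps in (0.05, 0.1, 1.0):
    print(eps, pull_over_push(A0, A1, theta=2.0, eps=eps))
```

Under these assumptions, *ε* and *θ* cancel in the ratio, which reduces to $(\delta_{RT_0}\,\delta_{PS_1})/(\delta_{SP_0}\,\delta_{TR_1})$. A ratio above 1 is equivalent to the payoff inequality for a stable heteroclinic cycle and does not depend on the speed of the feedback.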

### Fast–Slow Oscillatory Dynamics in Feedback-Evolving Games.

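To provide further intuition, we leverage the fact that when $0 < \epsilon \ll 1$, *x* is the “fast” variable and *n* is the “slow” variable (28). Later, we will show that insights gained in the limit case hold irrespective of the relative rate change of environment and strategies. Consider a rescaling of time such that $\tau = t/\epsilon$, in which case we can rewrite Eq. **17** as follows:

$$
\frac{dx}{d\tau} = x(1-x)\left[r_1(x, A(n)) - r_2(x, A(n))\right], \qquad
\frac{dn}{d\tau} = \epsilon\, n(1-n)\, f(x),
$$

where the derivatives are taken with respect to the fast time *τ* and $f(x)$ is the environmental feedback term of Eq. **17**. For $0 < \epsilon \ll 1$, this system has *x* as the fast variable and *n* as the slow variable. In this limit, the dynamics of *x* are much faster than that of *n*, that is, by a factor of order $1/\epsilon$. The critical manifold of the fast subsystem, on which $dx/d\tau = 0$, consists of (*i*) the line $x = 0$, (*ii*) the line $x = 1$, and (*iii*) the set of points $(x, n)$ in the interior at which cooperators and defectors have equal fitness, $r_1 = r_2$. We assume that *n* parameterizes the dynamics of *x* far from the critical manifold.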

As an example, consider the following payoff matrix:

Given the payoff matrix in Eq. **21**, the one-dimensional fast subsystem can be written as follows:

In this case, the interior critical manifold satisfies

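System dynamics can be understood in terms of a series of fast and slow changes. Consider initializing the system at values $(x_0, n_0)$ away from the critical manifold. The system first moves rapidly in *x*, parameterized by the value $n_0$, toward one of the boundary branches of the critical manifold. Along the branch $x = 1$, the environment slowly improves until that branch loses stability, at which point cooperators are rapidly replaced by defectors. Along the branch $x = 0$, the environment slowly degrades until that branch loses stability in turn, at which point there is a rapid increase in *x*, completing the cycle. The resulting dynamics will appear as relaxation oscillations with slow changes in environment alternating with rapid changes in the fraction of cooperators. The dynamics overlaid with the critical manifolds for this system are shown in Fig. 3, *Top*. We find that the system dynamics will asymptotically approach a heteroclinic cycle when the payoffs satisfy Eq. **19** (*SI Appendix C* and *SI Appendix D*).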
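The geometry behind the relaxation oscillations can be sketched by computing the interior critical manifold and the stability of the boundary branches of the fast subsystem. The code assumes $A(n) = (1-n)A_0 + nA_1$ with rows $(R, S)$ and $(T, P)$. The payoff values and function names are hypothetical and serve only as an illustration.

```python
A0 = ((3.5, 1.0), (2.0, 0.75))   # hypothetical depleted-state payoffs
A1 = ((4.0, 1.0), (4.5, 1.25))   # hypothetical replete-state payoffs


def fitness_gap(x, n):
    """r_C - r_D under A(n) = (1 - n) A0 + n A1."""
    (R0, S0), (T0, P0) = A0
    (R1, S1), (T1, P1) = A1
    return (((1 - n) * (R0 - T0) + n * (R1 - T1)) * x
            + ((1 - n) * (S0 - P0) + n * (S1 - P1)) * (1 - x))


def interior_critical_manifold(x):
    """Environmental state n at which r_C = r_D for a given x."""
    g0 = fitness_gap(x, 0.0)     # advantage of C in the depleted state
    g1 = -fitness_gap(x, 1.0)    # advantage of D in the replete state
    return g0 / (g0 + g1)


def branch_is_attracting(x_edge, n):
    """Stability of the fast subsystem at x = 0 or x = 1 for frozen n."""
    gap = fitness_gap(float(x_edge), n)
    return gap < 0 if x_edge == 0 else gap > 0


for x in (0.0, 0.5, 1.0):
    print(x, interior_critical_manifold(x))
print(branch_is_attracting(0, 0.6), branch_is_attracting(1, 0.6))
```

For these payoffs, both boundary branches attract for frozen *n* between 0.5 and 0.75, and the interior manifold between them repels. This is the configuration in which slow drift along one branch ends in a fast jump to the other.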

The key to the emergence of relaxation oscillations is that the interior critical manifold is a repeller. This is not universally the case. A counterexample is when Eq. **19** is not satisfied, for example:

In this example, the overall dynamics converge to an asymptotically stable interior fixed point. The overall dynamics are again characterized by a mix of fast and slow changes; however, they spiral in to the interior fixed point rather than away from it. The dynamics overlaid with the critical manifolds for this system are shown in Fig. 3, *Bottom*. We also find that the qualitative outcomes do not depend on the speed of the feedback (see Fig. 4 and *SI Appendix B* and *SI Appendix C* for proof). Here, the invariance arises because of the separability of the dynamics so that the stability of the system is unaffected by the speed. This *ε*-invariance of qualitative outcomes is not universally the case for fast–slow systems (28).

### Generalized Conditions for Mitigating the Tragedy of the Commons Given Feedback-Evolving Games.

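The previous section identified conditions under which the tragedy of the commons is averted. In particular, the system converges to an intermediate environmental state when the cumulative strength of unstable eigenvalues around the cycle exceeds that of the stable eigenvalues (that is, when Eq. **19** is not satisfied). This condition requires that mutual cooperation is a unique Nash equilibrium in a depleted state. Here, we ask: are there any other conditions in which a tragedy of the commons could be averted? To do so, we continue to fix the payoff structure of the replete-state game, in which defection is favored, and vary the payoffs of the depleted-state game.

There are four cases to consider corresponding to different combinations among the relative values of the depleted-state payoffs: whether cooperation or defection is the better response to a cooperator, and whether cooperation or defection is the better response to a defector. For example, when defection is the better response in both instances, defectors have the higher fitness for all values of *n*. The population will converge to all defectors, and the environment will converge to the depleted state.

In *SI Appendix D*, we find the fixed points and local stability for all values of payoffs of the depleted-state game.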

We summarize all possible dynamics in terms of a phase plane in Fig. 5. The key point is that the tragedy of the commons can be averted when there is feedback between strategy and the environment. Convergence to an intermediate environmental state depends on the magnitude of payoffs in depleted environments as well as the relative strength at which cooperators enhance the environment. In this model, incentivizing the payoff to cooperate when others defect in a bad environment can help avert the tragedy of the commons.
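The four depleted-state cases can be enumerated with a small helper. This is a sketch under stated assumptions: the depleted-state payoff matrix has rows $(R_0, S_0)$ for *C* and $(T_0, P_0)$ for *D*, ties between payoffs are ignored, and the game labels are standard terminology rather than the authors' own. The second return value encodes only the necessary condition stated in the text, namely that cooperation must pay when others defect in the depleted state, and is not a guarantee that the tragedy is averted.

```python
def classify_depleted_game(A0):
    """Label the depleted-state game and report whether aversion is possible.

    A0 = ((R0, S0), (T0, P0)). Returns (label, aversion_possible), where
    aversion_possible is True only if cooperation is the better response
    to a defector (S0 > P0), a necessary but not sufficient condition.
    """
    (R0, S0), (T0, P0) = A0
    coop_best_vs_c = R0 > T0   # cooperate is the better reply to a cooperator
    coop_best_vs_d = S0 > P0   # cooperate is the better reply to a defector
    if coop_best_vs_c and coop_best_vs_d:
        label = "cooperation dominant"
    elif coop_best_vs_c:
        label = "coordination"
    elif coop_best_vs_d:
        label = "anti-coordination"
    else:
        label = "defection dominant"
    return label, coop_best_vs_d


# Hypothetical payoff matrices, one per case.
EXAMPLES = {
    "cooperation dominant": ((3.5, 1.0), (2.0, 0.75)),
    "coordination": ((3.5, 0.5), (2.0, 0.75)),
    "anti-coordination": ((1.5, 1.0), (2.0, 0.75)),
    "defection dominant": ((3.0, 0.0), (5.0, 1.0)),
}

for expected, A0 in EXAMPLES.items():
    print(expected, classify_depleted_game(A0))
```

Whether the system actually converges to an intermediate environmental state in the two cases flagged as possible still depends on the payoff magnitudes and on the strength of environmental enhancement by cooperators.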

## Discussion

We proposed a coevolutionary game theory that incorporates the feedback between game and environment and between environment and game. In so doing, we extended replicator dynamics to include feedback-evolving games. This extension is facilitated by assuming the state-dependent payoff matrix can be represented as a linear combination of two different payoff matrices. Motivated by the study of the tragedy of the commons in evolutionary biology (29), we demonstrated how alternative dynamics can arise when cooperators dominate in deplete environments and when defectors dominate in replete environments. In essence, cooperators improve the environment, leading to a change in incentives that shifts the optimal strategy toward defection. Repeated defections degrade the environment, which reincentivizes the emergence of cooperators. In this way, there is the potential for a sustained cycle in strategy and environmental state, that is, an oscillating tragedy of the commons (Figs. 3 and 4). Whether or not the cycle dies out or is persistent depends on the magnitude of payoffs. We also identified conditions under which a tragedy of the commons can be averted (Fig. 5).

Our proposed replicator model with feedback-evolving games considers the consequences of repetition in which the repeated actions of the game influence the environment in which the game is played. Thus, the model is complementary to long-standing interest in a different kind of repeated game, most famously the iterated PD (7, 30–35). In such iterated games, strategies that include cooperation emerge, even if cooperation is otherwise a losing strategy in single-stage or one-shot versions of the game. Here, individuals do not play against another repeatedly or, posed alternatively, do not “recall” playing against another repeatedly. Instead, a feedback-evolving game changes with time as a direct result of the accumulated actions of the populations.

The feedback-evolving game analyzed here is closest in intent to a prior analysis of coupled strategy and environmental change in the context of durable public goods games (36). That model assumed that fitness differences between producers and nonproducers had no frequency dependence and the environmental dynamics had at most a single fixed point. Unlike the present case, the model in ref. 36 did not exhibit persistent oscillations. Here, the long-term dynamics depend on the magnitudes of payoffs, which reflect the costs, benefits, and frequencies of alternate strategies, as well as the strength of feedback. For example, bacteria that produce a costly public good, that is, cooperators, may have a selective advantage in a depleted environment when public goods are scarce. Cooperating bacteria can restore the availability of the public good, for example, fixed nitrogen or excreted enzymes, thereby favoring defectors that benefit from but do not produce the public good. The emergence of defectors can, with time, degrade the environment.

We have shown (Figs. 2–4) that repeated oscillations of strategies and environmental state can arise when cooperation is favored in the depleted state. We have also classified the behavior of the model given all possible payoff matrices in the depleted state (Fig. 5 and *SI Appendix D*). In all other instances, we find that global dynamics converge to a fixed point. This fixed point can be in the interior, that is, corresponding to averting the tragedy of the commons. Averting the tragedy of the commons is possible, although not guaranteed, so long as cooperation is favored when others defect in the depleted state. The conditions for averting the tragedy of the commons in this model depend on the strength but not the speed of coupling. Alternative forms of feedback between strategy and environment (37, 38) as well as nonlinear combinations of payoff matrices may lead to novel dynamics. Density-dependent interactions may also lead to novel effects of social behavior on total population densities, not just their frequencies (39).

Thus far, we have assumed that the environment can recover from a nearly deplete state. The rate of renewal was assumed to be proportional to the cooperator fraction. In that sense, our work also points to new opportunities for control—whether for renewable or finite resources. Is it more effective to influence the strategists, the state, and/or the feedback between strategists and state? Analysis of feedback-evolving games could also have implications for theories of human population growth (40), ecological niche construction (41), and the evolution of strategies in public good games (25). The extension of the current model to microbial and human social systems may deepen understanding of the short- and long-term consequences of individual actions in a changing and changeable environment (42). We are hopeful that recognition and analysis of the feedback between game and environment can help to more effectively manage and restore endangered commons.

## Acknowledgments

We thank Michael Cortez, Jeff Shamma, and two anonymous reviewers for their comments on the manuscript, particularly the suggestion of one reviewer to investigate the asymptotic nature of oscillations in this model. This work was supported by Army Research Office Grant W911NF-14-1-0402 (to J.S.W.). J.S.W. thanks Joel Cohen for feedback on an early draft of the manuscript and the organizers of the 2014 National Academies Keck Futures Initiative workshop on “Collective Behaviors: From Cells to Societies,” where work on this project began.

## Footnotes

^{1}To whom correspondence should be addressed. Email: jsweitz@gatech.edu.

Author contributions: J.S.W. designed research; J.S.W., C.E., K.P., S.P.B., and W.C.R. performed research; J.S.W., C.E., and K.P. contributed new reagents/analytic tools; C.E., K.P., S.P.B., and W.C.R. contributed to writing the paper; and J.S.W. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1604096113/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

1.
2.
3. Maynard Smith J
4. Hofbauer J, Sigmund K
5. Nowak MA, Sigmund K
6.
7. Axelrod R, Hamilton WD
8. Frank S
9.
10.
11. Levin SA
12.
13. Barea J-M, Pozo MJ, Azcón R, Azcón-Aguilar C
14. West SA, Buckling A
15. Kümmerli R, Brown SP
16.
17.
18.
19. Bauch CT, Galvani AP, Earn DJD
20. Bauch CT, Earn DJD
21. Galvani AP, Reluga TC, Chapman GB
22. Lieu TA, Ray GT, Klein NP, Chung C, Kulldorff M
23. Ostrom E
24. Hardin G
25. Stewart AJ, Plotkin JB
26. Hofbauer J
27.
28. Berglund N, Gentz B
29.
30. Axelrod R
31.
32. Press WH, Dyson FJ
33. Stewart AJ, Plotkin JB
34. Hilbe C, Nowak MA, Sigmund K
35. Hilbe C, Wu B, Traulsen A, Nowak MA
36.
37. Antonioni A, Martinez-Vaquero LA, Mathis N, Stella LPM
38.
39.
40. Cohen JE
41. Odling-Smee FJ, Laland KN, Feldman MW
42. Levin SO


## Article Classifications

- Biological Sciences
- Evolution