# Evolutionary consequences of behavioral diversity

Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved September 23, 2016 (received for review June 6, 2016)

## Significance

Access to a diversity of behavioral choices makes social dynamics rich and difficult to analyze. Individuals are rarely constrained to a binary choice between “cooperate” and “defect,” as many theoretical treatments assume. Here we use game theory to ask what social behaviors will emerge in populations as the number of behavioral choices grows. We show that simple strategies, where players do not vary their behavior much at all, can nonetheless be successful, and that access to a broader range of behavioral choices can cause a population to evolve toward lower levels of cooperation. Finally, we show that access to greater choice in rock–paper–scissors games inevitably leads to behavioral diversity, with players using strategies that make use of all possible choices.

## Abstract

Iterated games provide a framework to describe social interactions among groups of individuals. This body of work has focused primarily on individuals who face a simple binary choice, such as “cooperate” or “defect.” Real individuals, however, can exhibit behavioral diversity, varying their input to a social interaction both qualitatively and quantitatively. Here we explore how access to a greater diversity of behavioral choices impacts the evolution of social dynamics in populations. We show that, in public goods games, some simple strategies that choose between only two possible actions can resist invasion by all multichoice invaders, even while engaging in relatively little punishment. More generally, access to a larger repertoire of behavioral choices results in a more “rugged” fitness landscape, with populations able to stabilize cooperation at multiple levels of investment. As a result, increased behavioral choice facilitates cooperation when returns on investments are low, but it hinders cooperation when returns on investments are high. Finally, we analyze iterated rock–paper–scissors games, the nontransitive payoff structure of which means that unilateral control is difficult to achieve. Despite this, we find that a large proportion of multichoice strategies can invade and resist invasion by single-choice strategies, so that even well-mixed populations will tend to evolve and maintain behavioral diversity.

Diversity in social behaviors, in humans as well as across all domains of life, presents a daunting challenge to researchers who work to explain and predict individual social interactions or their evolution in populations. Iterated games provide a framework to approach this task, but determining the outcome of such games under even moderately complex, realistic assumptions is exceedingly difficult. Such assumptions include memory of past interactions (1–7); signaling of intentions, indirect reciprocity, or identity (9–16); and a heterogeneous network of interactions (17–25).

Developing models that capture complex and diverse social behaviors is an important step toward quantitative, falsifiable predictions about a host of problems, such as the emergence and stability of cooperation, policing, and social institutions in human populations; and the de novo evolution of social hierarchies in natural populations (7, 9, 10, 26–29). Recent work has expanded the reach of game-theoretic models to describe ever more sophisticated forms of social interactions (3, 30–39). This work has begun to unravel the evolutionary and behavioral dynamics that determine the long-term stability of cooperation in a group. It has allowed researchers to explore the role of memory in social dynamics (40–44), and it has shown that, even with multiple players (33, 38) and arbitrary action spaces (36), an individual can often unilaterally influence the outcome of social interactions across a broad range of contexts.

Here we study the evolutionary dynamics of social interactions under the quite general setting of all “memory-1” strategies—that is, strategies that specify the choice a player makes in each round of a repeated game depending on the choices made in the preceding round. We study the evolutionary dynamics of memory-1 strategies in a population of players with access to multiple behavioral choices, including games where unilateral control through so-called zero-determinant (ZD) strategies (30) is impossible.

Many game-theoretic studies of social behavior, although by no means all (36, 45, 46), constrain players to a binary behavioral choice such as “cooperate” or “defect” (47, 48). Other studies, particularly those looking at social evolution, constrain players to a single type of behavioral strategy, but allow for a continuum of behavioral choices—e.g., the option to contribute an arbitrary amount of effort to an obligately cooperative interaction (45, 46). In general, and especially in the case of human interactions, individuals have access to both a wide variety of behavioral choices, and to a complex decision-making process among these choices. Here we bridge this gap and study how the diversity of behavioral choices impacts the evolution of decision making in a replicating population, focusing on the prospects for cooperation and for the maintenance of behavioral diversity.

We develop a framework for analyzing iterated two-player games in which players can access an arbitrary number of behavioral choices and use an arbitrary memory-1 strategy for choosing among them. We apply this framework to study the effect of a large behavioral repertoire on the evolution of cooperation in public goods games. We show that increasing the number of investment levels available to a player can either facilitate or hinder the evolution of cooperation in a population, depending on the ratio of individual costs to public benefits in the game. We apply the same framework to study games with nontransitive payoff structures, under which no hierarchical ordering of payoffs is possible—e.g., the game of rock–paper–scissors in which scissors cuts paper, and paper covers rock, but rock crushes scissors. We show that nontransitive payoff structures generally preclude unilateral control through ZD strategies, but that nonetheless there exist memory-1 strategies that ensure the maintenance of behavioral diversity, in which players make use of all of the choices available to them.

## Methods and Results

Players in an iterated game repeatedly choose from a fixed set of possible actions. Depending on the choice she makes, and the choices her opponents make, a player receives a certain payoff each round. The process by which a player determines her choice each round is called her strategy. A strategy may in general take into account a wide variety of information, such as the environment, memory of prior interactions between players, an opponent’s identity, or his social signals (1–6, 11, 13–16, 20–25). Here we restrict our analysis to two-player, simultaneous iterated games in which a player chooses from among *d* possible actions each round using a memory-1 strategy, which takes into account only the immediately preceding interaction between her and her opponent. We consider games that are discounted at rate δ, where *SI Appendix*; refs. 3, 30, 35, 38).

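A memory-1 strategy is specified by choosing $d^3$ probabilities for playing action *i*, denoted $p^{i}_{jk}$, given that the player made choice *j* and her opponent made choice *k* in the preceding round. The strategy must also specify *d* probabilities $p^{i}_{0}$ for playing action *i* in the first round of play. Each probability can be chosen independently, save for the constraint that the sum across actions must equal one, $\sum_{i} p^{i}_{jk} = 1$ and $\sum_{i} p^{i}_{0} = 1$.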
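As a concrete illustration of this definition, the following minimal Python sketch stores a *d*-choice memory-1 strategy as nested lists and plays two such strategies against each other. The function names `random_memory1_strategy` and `play`, and the layout `p[j][k][i]`, are illustrative assumptions and are not taken from the paper. Normalizing uniform weights is only a convenient way to obtain valid probabilities; it is not claimed to be the sampling scheme the authors used.

```python
import random


def random_memory1_strategy(d, rng):
    """Draw a d-choice memory-1 strategy (illustrative layout, an assumption).

    Returns (p0, p), where p0[i] is the first-round probability of action i
    and p[j][k][i] is the probability of action i after the focal player
    chose j and her opponent chose k in the preceding round.
    """
    def random_distribution():
        weights = [rng.random() for _ in range(d)]
        total = sum(weights)
        return [w / total for w in weights]  # probabilities sum to one

    p0 = random_distribution()
    p = [[random_distribution() for _k in range(d)] for _j in range(d)]
    return p0, p


def play(strategy_x, strategy_y, rounds, rng):
    """Simulate an iterated game; returns the list of (x_action, y_action)."""
    p0_x, p_x = strategy_x
    p0_y, p_y = strategy_y
    actions = list(range(len(p0_x)))
    x = rng.choices(actions, weights=p0_x)[0]
    y = rng.choices(actions, weights=p0_y)[0]
    history = [(x, y)]
    for _ in range(rounds - 1):
        # Each player conditions on (own last action, opponent's last action).
        next_x = rng.choices(actions, weights=p_x[x][y])[0]
        next_y = rng.choices(actions, weights=p_y[y][x])[0]
        x, y = next_x, next_y
        history.append((x, y))
    return history
```

For example, `play(random_memory1_strategy(3, rng), random_memory1_strategy(3, rng), 100, rng)` with `rng = random.Random(0)` yields 100 rounds of joint choices for a three-choice game.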

We study two qualitatively different behavioral choices that players can make: different sizes of contributions and different types of contributions to social interactions (Fig. 1). If players can vary the size of the contribution they make to a social interaction, this means that they alter the degree of their participation but not the qualitative nature of the interaction. For example, in a public goods game, a player may choose to contribute an amount *C* to the public good, or

Here we study both kinds of behavioral choice, differences in size and type, and their effects on the evolution of strategies in a population. We analyze well-mixed, finite populations of *N* players reproducing according to a copying process or pairwise comparison rule (8), in which a player *X* copies her opponent *Y*’s strategy with probability *X* from her social interactions with each of the *Y* in a population otherwise composed of strategy *X*, we have the average payoffs **2**].
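The copying process described here can be sketched in a few lines of Python. The sketch assumes the Fermi form commonly used for pairwise comparison, in which *X* copies *Y* with probability 1/(1 + exp[σ(S_x − S_y)]) for a selection strength σ, and it assumes the usual averaging of payoffs when a single mutant *Y* sits in a population of *N* − 1 residents *X*. Both forms, the symbol σ, and all function names are assumptions made for illustration; the exact expressions should be checked against the paper.

```python
import math


def copy_probability(payoff_x, payoff_y, sigma):
    """Probability that X adopts Y's strategy (Fermi form, assumed here)."""
    exponent = sigma * (payoff_x - payoff_y)
    exponent = max(-700.0, min(700.0, exponent))  # avoid overflow in exp
    return 1.0 / (1.0 + math.exp(exponent))


def payoffs_with_one_mutant(s_xx, s_xy, s_yx, n):
    """Average payoffs when one mutant Y meets n - 1 residents X.

    s_ab is the long-term payoff to strategy a when playing against b.
    A resident meets n - 2 other residents and the single mutant; the
    mutant meets residents only.
    """
    resident = ((n - 2) * s_xx + s_xy) / (n - 1)
    mutant = s_yx
    return resident, mutant


def resists_selective_invasion(s_xx, s_xy, s_yx, n):
    """True if the rare mutant earns strictly less than an average resident."""
    resident, mutant = payoffs_with_one_mutant(s_xx, s_xy, s_yx, n)
    return mutant < resident
```

Under these assumptions a mutant with a lower average payoff than the residents is copied with probability below one half, which is the sense in which the resident resists selective invasion.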

### The Outcome of an Iterated *d*-Choice Game

To analyze social evolution in multichoice iterated games we must first calculate the expected long-term payoff *X* facing an arbitrary opponent *Y*. To do this, we will generalize an approach used for two-choice two-player games, in which a player’s memory-1 strategy *d*-choice two-player game, the probability that a focal player chooses action *i*, given that she played action *j* and her opponent action *k* in the preceding round, is denoted *SI Appendix*), the probabilities *i* within her memory (which is 1 or 0 for a memory-1 strategy); a baseline rate of playing action *i*, denoted

where δ is the rate of discounting, *i* in the first round, and *SI Appendix*). Note there are *d* − 1 such equations, one for each behavioral choice
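The long-term payoffs discussed in this section can also be obtained numerically, without any change of coordinates, by propagating the distribution over joint outcomes from one round to the next. The plain-Python sketch below does this. It assumes that δ (with 0 < δ < 1) acts as the probability that another round is played, so that round *t* carries weight (1 − δ)δ^t; this normalization, the function names, and the layout `p[j][k][i]` are assumptions made for illustration and are not the authors' implementation.

```python
def pure_strategy(d, action):
    """Memory-1 strategy that always plays `action` (helper for examples)."""
    one_hot = [1.0 if i == action else 0.0 for i in range(d)]
    return list(one_hot), [[list(one_hot) for _k in range(d)] for _j in range(d)]


def discounted_payoffs(strategy_x, strategy_y, payoff, delta, tol=1e-12):
    """Normalized expected discounted payoffs (score_x, score_y).

    payoff[a][b] is the one-round payoff to a player choosing a against b.
    A strategy is (p0, p) with p[j][k][i] the probability of action i after
    the player chose j and her opponent chose k. Requires 0 < delta < 1.
    """
    p0_x, p_x = strategy_x
    p0_y, p_y = strategy_y
    d = len(p0_x)
    # dist[a][b]: probability that X plays a and Y plays b in the current round.
    dist = [[p0_x[a] * p0_y[b] for b in range(d)] for a in range(d)]
    score_x = score_y = 0.0
    weight = 1.0 - delta
    while weight > tol:  # truncate once the remaining weight is negligible
        for a in range(d):
            for b in range(d):
                score_x += weight * dist[a][b] * payoff[a][b]
                score_y += weight * dist[a][b] * payoff[b][a]
        nxt = [[0.0] * d for _ in range(d)]
        for j in range(d):
            for k in range(d):
                mass = dist[j][k]
                if mass == 0.0:
                    continue
                for a in range(d):
                    for b in range(d):
                        # Y conditions on (her own last action k, X's last action j).
                        nxt[a][b] += mass * p_x[j][k][a] * p_y[k][j][b]
        dist = nxt
        weight *= delta
    return score_x, score_y
```

The loop costs on the order of *d*⁴ operations per round, which is adequate for small *d*; solving the corresponding linear system would be the natural alternative for larger games.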

### Choosing How Much to Contribute to a Public Good

We will use the relationship between two players’ scores (Eq. **1**) to analyze the evolution and stability of cooperative behaviors in multichoice public goods games, played in a finite population. In the two-player public goods game each player chooses an investment level, *C*, which produces a corresponding amount of public benefit that is then shared equally between both players, regardless of their investment choices. In general, if a player invests

A wide variety of evolutionary robust memory-1 strategies exist for two-choice public goods games. The character and evolvability of these strategies have been explored in detail (3, 35, 40, 42, 60–62). But the assumption of only two investment levels, that is, of two behavioral choices, is unrealistic for many applications. Even if a player adopts such a two-choice strategy, there is in general no reason for her opponent to do the same. Thus, we begin our analysis by asking whether a two-choice memory-1 strategy that stabilizes investment at the maximum level when resident in a population (and is therefore considered a “cooperative” two-choice strategy) can resist invasion by players who are allowed to make arbitrary investment choices.

For simplicity, we will focus here on a linear relationship between costs and benefits of investment in the public good, so that *SI Appendix*.
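To make the payoff structure concrete, here is a small Python sketch of a two-player public goods game with linear benefits. It assumes that an investment *C* yields a public benefit *rC* that is split equally between the two players, so a player investing `c_own` against `c_other` receives `r * (c_own + c_other) / 2 - c_own`. That payoff form, the evenly spaced investment levels, and all function names are illustrative assumptions rather than the paper's exact specification.

```python
def investment_levels(d, c_max):
    """d evenly spaced investment levels from 0 to c_max (spacing is assumed)."""
    if d < 2:
        raise ValueError("need at least two investment levels")
    return [c_max * i / (d - 1) for i in range(d)]


def public_goods_payoff(c_own, c_other, r):
    """One-round payoff: equal share of the linear benefit minus own cost."""
    return r * (c_own + c_other) / 2.0 - c_own


def payoff_matrix(levels, r):
    """matrix[a][b]: payoff to a player investing levels[a] against levels[b]."""
    return [[public_goods_payoff(a, b, r) for b in levels] for a in levels]
```

Under this assumed form, each unit invested returns only *r*/2 to the investor, so for 1 < *r* < 2 mutual full investment pays more than mutual zero investment, yet each player is individually tempted to invest less.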

For linear benefits, a two-choice strategy is related to our alternate coordinate system according to *i* corresponds to an opponent who invests *SI Appendix*, section 3). Here we choose the boundary conditions **1** we obtain the following relationship between two players’ long-term payoffs

When player *Y* is constrained to the same two choices as player *X*, then this relationship reduces to the relationship for a two-player, two-choice game discussed in refs. 30, 31, 35, 42. However, we will consider the more general case when player *Y* has access to different, and possibly more, investment choices than player *X*. In general, a strategy *X* resident in a population of size *N* can resist selective invasion by a mutant *Y* iff

where _{N} condition (47), which defines the evolutionary stability of a resident strategy in terms of its ability to resist both invasion and replacement by a mutant. In the large space of memory-1 strategies we study here, no two-choice resident is strictly ESS_{N} (35), because any strategy can be invaded and replaced neutrally. Thus, we look for strategies that can resist selective invasion by any rare mutant, which we call evolutionary robustness (42). A cooperative two-choice strategy by definition has

Using the relationships above we can derive conditions for a two-choice cooperative strategy to be universally robust to invasion; that is, robust against all invaders *Y*, who can make an arbitrary number of different investment choices, including values above *SI Appendix*). This in turn allows us to derive the following necessary and sufficient condition for the existence of a two-choice strategy that is universally robust:

If (and only if) **3** is satisfied, then there exists a two-choice strategy that enforces cooperation at some level

The inequality in **3** offers insight into the degree of punishment that a resident cooperative strategy must be prepared to wield, to remain robust against all invaders (Fig. 2). Setting **3**, larger ratios of public benefit to individual cost *r* and larger population sizes *N* mean that smaller reductions in public investment are sufficient for universal robustness of the resident cooperator. And as Fig. 2 shows, for a wide range of parameters a population can enjoy robust cooperation using a two-choice strategy with only moderate threat of punishment, e.g.,

We can also investigate whether strategies that stabilize behavior at the lower investment level, *SI Appendix*). We find that, indeed, such strategies can also be robust, but such strategies are never of the “extortion” type (30), which is perhaps unsurprising given that extortion strategies are unstable even when invaders are limited to only two choices (39).

### Perception of Novel Actions

In our analysis so far we have considered players that use a strategy composed of probabilities

For a resident strategy that stabilizes investment at the higher level, *SI Appendix*). Indeed, if we make the natural threshold choice

which is precisely the same as **3** (with

We have verified the condition above by numerical simulation (*SI Appendix*, Fig. S1), and we find that not only do simple, universally robust strategies of this type exist, but when they exist they are typically very common.

### Evolutionary Consequences of Multiple Investment Choices

We now turn our attention to the implications of these results for an evolving population of players who can make *d* is large, players have more options for investment, between the fixed minimum value zero and fixed maximum value

Because all two-choice strategies form a subset of *d*-choice strategies, an evolving population of *d*-choice players has access to, at minimum, all evolutionary robust two-choice strategies. Thus, unlike in the two-choice case, where there are only three qualitatively distinct types of evolutionary robust strategies (35), a *d*-choice population may result in many different classes of evolutionary robust outcomes, most of which are suboptimal in the sense that they produce less public good than the global maximum,

We can place a lower bound on how many such suboptimal, but evolutionary robust, outcomes are possible when players have **2**. Thus, when there is no discounting (

Thus, the number of suboptimal evolutionary robust outcomes grows at least quadratically with the number of investment levels available to individuals.

Fig. 2 can now be reinterpreted as showing the proportion of pairs of investment levels that can produce a robust, suboptimal two-choice strategy for a population of

We have seen that increasing the number of available choices to players, between a fixed minimum and maximum investment level, has the potential to produce suboptimal but evolutionary robust outcomes. To test how the number of available choices impacts evolutionary dynamics in a population, we ran evolutionary simulations under weak mutation (42), with mutants drawn uniformly from all *d*-choice memory-1 strategies. We compared the mean payoffs received by populations constrained to **4**), the population that has
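Simulations under weak mutation typically treat the population as monomorphic and ask, for each new mutant, whether it fixes or is lost. A hedged sketch of the fixation probability that such a simulation would need is given below. It assumes a pairwise comparison process with Fermi copying probabilities and selection strength σ, for which the ratio of backward to forward transition rates with *i* mutants is exp[−σ(π_Y(i) − π_X(i))]. The function name, the symbol σ, and the payoff averaging are assumptions for illustration, not the authors' simulation code.

```python
import math


def fixation_probability(s_yy, s_yx, s_xy, s_xx, n, sigma):
    """Probability that a single mutant Y takes over n - 1 residents X.

    s_ab is the long-term payoff to strategy a against strategy b.
    Uses rho = 1 / (1 + sum_{k=1}^{n-1} prod_{i=1}^{k} exp(-sigma * (pi_Y(i) - pi_X(i)))),
    the standard expression for a birth-death chain with Fermi copying
    (assumed here).
    """
    total = 1.0
    log_product = 0.0
    for i in range(1, n):
        # Average payoffs when i mutants and n - i residents are present.
        payoff_y = ((i - 1) * s_yy + (n - i) * s_yx) / (n - 1)
        payoff_x = (i * s_xy + (n - i - 1) * s_xx) / (n - 1)
        log_product += -sigma * (payoff_y - payoff_x)
        total += math.exp(min(log_product, 700.0))  # clamp to avoid overflow
    return 1.0 / total
```

A neutral mutant fixes with probability 1/*N* under this expression, which gives a natural baseline: mutants that fix more often than 1/*N* are favored by selection and those that fix less often are disfavored.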

### Nontransitive Payoff Structures

So far we have focused on multiple options for investment and its impact on the evolution of cooperative behaviors in public goods games. But the coordinate system we have introduced for studying multichoice iterated games, and the resulting relationship between two players’ scores (Eq. **1**), applies generally, and so it can be applied to study many other questions in evolutionary game theory. Among the most interesting questions occur with only

Games with nontransitive payoff structures, such as rock–paper–scissors, describe social dynamics without any strict hierarchy of behaviors. Individuals can invest in qualitatively different types of behavior, which dominate in some social interactions but lose out in others. Such nontransitive interactions have been observed in a range of biological systems, from communities of *Escherichia coli* strains (50), to mating competition among male side-blotched lizards *Uta stansburiana* (54). Rock–paper–scissors interactions are well known in ecology as having important consequences for the maintenance of biodiversity: in well-mixed populations playing the one-shot game, diversity is often lost; whereas, in spatially distributed populations, multiple strategies can be stably maintained (55, 56). Here we analyze the equivalent problem for the maintenance of diversity in evolving populations of players who engage in iterated nontransitive interactions.

We will assess the potential for maintaining behavioral diversity in a population playing an iterated rock–paper–scissors game—that is, we look for strategies that can resist invasion by players who use a single behavioral choice (1 = rock, 2 = paper, or 3 = scissors). We assume that, in any given interaction, a fixed benefit *B* is at stake, and players invest a cost

We first consider the case of a completely symmetric game of rock–paper–scissors, with *SI Appendix*, section 4. From this coordinate system we see immediately that there exists no viable ZD strategy, with the sole exception of the singular “repeat” strategy (30). Despite the absence of ZD strategies, we can still analyze the outcome of iterated rock–paper–scissors games using this coordinate system.

### Maintaining Behavioral Diversity in a Game of Rock–Paper–Scissors

The symmetric, iterated rock–paper–scissors game is simple to analyze, because payoff is conserved, meaning that the sum of two interacting players’ payoffs is constant, *X*. It might seem unlikely, then, that behavioral diversity offers any advantage in this situation. After all, a player who uses a strategy that employs rock, paper, and scissors produces no higher mean fitness at the population level than a player who always uses rock. To determine whether this intuition is correct, and nontransitive payoffs lead inevitably to a loss of behavioral diversity, we evaluated the conditions for a strategy to resist selective invasion by a player who always uses the same move. Such strategies do indeed exist, and satisfy the following inequality:

As one might hope, strategies that tend to switch to the move that would have won in the preceding round—corresponding to larger values of **5** also provides a more valuable insight, as it allows us to calculate the overall robustness of memory-1 strategies to the loss of behavioral diversity. To do this we calculate the probability that a randomly drawn memory-1 strategy satisfies **5**, which reveals that fully 50% of such strategies maintain behavioral diversity in the completely symmetric rock–paper–scissors game (Fig. 4). Furthermore, due to symmetry, the condition for a new strategy to invade a resident is simply *SI Appendix*). And so if a resident can resist invasion against a particular invader, it can also invade a population in which that invader is resident. Thus, 50% of strategies can successfully invade in a population that lacks behavioral diversity—so that behavioral diversity is both highly evolvable and easy to maintain in the iterated rock–paper–scissors game, even in a well-mixed population—in sharp contrast to the one-shot game.
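The resistance of a behaviorally diverse strategy to single-move invaders can be explored numerically with the sketch below. It rests on several simplifying assumptions that are not taken from the paper: costs are equal and dropped, the winner takes the whole benefit *B* and ties split it, there is no discounting, and the resident's first move is uniform. Because the payoff is then conserved (the two players' payoffs sum to *B*) and a resident playing itself earns *B*/2 on average by symmetry, the usual single-mutant invasion condition reduces to the invader earning less than *B*/2. Against an invader who always plays the same move, the resident's behavior is a three-state Markov chain over her own last move, and the invader's payoff is approximated by a finite time average. All function names are hypothetical.

```python
ROCK, PAPER, SCISSORS = 0, 1, 2


def rps_payoff(own, other, benefit=1.0):
    """Share of the benefit: the winner takes all, a tie splits it."""
    if own == other:
        return benefit / 2.0
    return benefit if (own - other) % 3 == 1 else 0.0


def invader_payoff(p, invader_move, rounds=2000, benefit=1.0):
    """Time-averaged payoff of an invader who always plays `invader_move`.

    p[j][k][i] is the resident's probability of playing i after she played j
    and her opponent played k.
    """
    dist = [1.0 / 3.0] * 3  # resident's first move, assumed uniform
    total = 0.0
    for _ in range(rounds):
        total += sum(dist[a] * rps_payoff(invader_move, a, benefit)
                     for a in range(3))
        dist = [sum(dist[j] * p[j][invader_move][i] for j in range(3))
                for i in range(3)]
    return total / rounds


def resists_pure_invaders(p, benefit=1.0):
    """True if every single-move invader earns less than half the benefit."""
    return all(invader_payoff(p, m, benefit=benefit) < benefit / 2.0
               for m in range(3))


def win_switch_strategy():
    """Play the move that would have beaten the opponent's last move."""
    return [[[1.0 if i == (k + 1) % 3 else 0.0 for i in range(3)]
             for k in range(3)] for _j in range(3)]


def always(move):
    """Single-choice strategy that ignores the history."""
    return [[[1.0 if i == move else 0.0 for i in range(3)]
             for _k in range(3)] for _j in range(3)]
```

In this simplified setting, `resists_pure_invaders(win_switch_strategy())` holds while `resists_pure_invaders(always(ROCK))` does not, in line with the observation that switching toward the previously winning move helps a strategy resist invasion. Applying `resists_pure_invaders` to many randomly drawn strategies gives a rough estimate of how common such resistance is, although the result depends on how strategies are sampled and is not claimed to reproduce the figure reported here.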

We can also assess the robustness of behavioral diversity when the symmetry of the game is broken, so that *A* we numerically calculate the overall robustness of randomly drawn strategies as a function of the costs *B* and

## Discussion

We have studied how the repertoire of behavioral options influences the prospects for cooperation, and the maintenance of behavioral diversity, in evolving populations. Our analysis has relied on the theory of iterated games and, in particular, on a coordinate system we developed to describe strategies for multichoice games and their effects on long-term payoffs. In the context of public goods games, we have shown that simple strategies that use only two levels of investment can nonetheless stabilize cooperative behavior against arbitrarily diverse mutant invaders, provided the simple strategy has sufficient opportunity to punish defectors. More generally, a greater diversity of investment options can either facilitate or hinder the evolution of cooperation, depending on the ratio of public benefit produced to an individual’s investment cost. We have applied the same analytical framework to study more complicated multichoice iterated games with nontransitive payoffs, such as the rock–paper–scissors game. In this case, behaviorally diverse strategies that use multiple actions are often evolutionary robust, even in a well-mixed population, and they can likewise invade populations that lack diverse behaviors. Overall, the view emerges that simple behavioral interactions are sometimes surprisingly robust against diverse alternatives, and yet, in many circumstances, diverse behavior serves the mutual benefit of a population and is a likely outcome of evolution.

Our results on the impact of multiple behavioral choices should be compared with those of McAvoy and Hauert (36), who studied ZD strategies in two-player games with arbitrary action spaces. They established that ZD strategies exist even in this general setting. They focused especially on extortion strategies, whereby one player unilaterally sets the ratio of scores against her opponent. McAvoy and Hauert found, remarkably, that extortion strategies exist with support on only two actions, even against an opponent who can choose from an uncountable number of actions. Our results form an intriguing contrast to those of McAvoy and Hauert. Instead of studying ZD strategies and extortion in the classical context of two players, we have studied all memory-1 strategies and the prospects for robust cooperation in a population of

We have analyzed the entire space of memory-1 strategies for iterated multichoice games. Our ability to do so rests on a key mathematical result: the outcome of iterated games can be easily understood when players’ strategies, even those of startling complexity (3, 33, 38), are viewed in the right coordinate system. This coordinate system was suggested by the discovery of ZD strategies and developed fully by Akin (31) and others (3, 33, 35–37). The purview of our analysis can be put in context by comparison with the yet wider space of long-memory strategies, on the one hand, and the smaller space of ZD strategies, on the other hand. As discussed here and elsewhere, strategies that are evolutionary robust against the full space of memory-1 strategies are also robust against all longer-memory strategies (30, 38) (*SI Appendix*), making this a natural strategy space to consider from an evolutionary perspective. Nonetheless, memory can have an important impact on the relative success of different types of robust strategies, by making them more or less evolvable (3), or by allowing qualitatively different types of decision making via tagging or kin recognition (39, 63). Conversely, it is important to consider the full space of memory-1 strategies in the context of multichoice games because, as we have shown, such games may contain no ZD strategies at all, as in the case of iterated rock–paper–scissors.

It is unsurprising, perhaps, that games with nontransitive payoffs do not generally admit the opportunity for one player to exert unilateral control over the game’s outcome via ZD strategies. After all, a player cannot successfully extort an opponent whose behavior is so diverse that it cannot be pinned down. However, our analysis also offers perspective on the problem of diversity maintenance in evolving populations. One-shot rock–paper–scissors games have long been studied in the context of evolutionary ecology as a system that cannot easily maintain diversity without spatial structure or other exogenous population heterogeneity (50, 54–59). Here, by contrast, we have shown that behaviorally diverse strategies in the iterated game can easily emerge and resist invasion by behaviorally depauperate mutants, an observation that is relevant to behavioral interactions within a single population and also to interactions between species.

Overall we have seen that, as players gain access to more behavioral choices, either due to environmental shifts or evolutionary innovation, the dynamics of social evolution can be profoundly altered. This view is reflected by empirical studies, which have found that greater behavioral choice, via factors such as the ability to communicate or signal to others, has a significant impact on the level of cooperation in a group (9–15). Moving forward, we must connect the insights drawn from complex behavioral and evolutionary models of the type described here to empirical studies, where we can now seek quantitative predictions for the dynamics of group behavior in real populations.

## Acknowledgments

We thank two anonymous referees for constructive suggestions. A.J.S. gratefully acknowledges funding from the Royal Society (UF140346); T.L.P. from the Centre National de la Recherche Scientifique; and J.B.P. from the David & Lucile Packard Foundation, the U.S. Army Research Office (W911NF-12-1-9552), and the Defense Advanced Research Projects Agency NGS2 program (Grant D17AC00005).

## Footnotes

¹To whom correspondence may be addressed. Email: jplotkin@sas.upenn.edu or alex.stewart@ucl.ac.uk.

Author contributions: A.J.S., T.L.P., and J.B.P. designed research, performed research, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1608990113/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

- (1) Hauert CHS
- (2) Milinski M, Wedekind C
- (7) Rand DG, Dreber A, Ellingsen T, Fudenberg D, Nowak MA
- (9) Hauser OP, Rand DG, Peysakhovich A, Nowak MA
- (11) Nowak MA
- (13) Hauert C, Traulsen A, Brandt H, Nowak MA, Sigmund K
- (16) Bergstrom CT, Számadó S, Lachmann M
- (17) Rand DG, Arbesman S, Christakis NA
- (26) Ostrom E
- (27) Gavrilets S
- (28) Raihani NJ, McAuliffe K
- (30) Press WH, Dyson FJ
- (31) Akin E
- (33) Hilbe C, Wu B, Traulsen A, Nowak MA
- (34) Stewart AJ, Plotkin JB
- (35) Stewart AJ, Plotkin JB
- (36) McAvoy A, Hauert C
- (40) Hilbe C, Nowak MA, Sigmund K
- (42) Stewart AJ, Plotkin JB
- (46) Doebeli M, Hauert C, Killingback T
- (47) Nowak MA
- (48) Sigmund K
- (51) Allen B, Gore J, Nowak MA
- (52) Cordero OX, Ventouras LA, DeLong EF, Polz MF
- (57) Szolnoki A, et al.
- (62) Axelrod R, Hamilton WD