# Computational complexity of ecological and evolutionary spatial dynamics

See allHide authors and affiliations

Edited by Christos Papadimitriou, University of California, Berkeley, CA, and approved November 10, 2015 (received for review June 10, 2015)

## Significance

An important question in evolution is: how does population structure affect the outcome of the evolutionary process? The theory of evolution in structured population has provided an impressive range of results, but an understanding of the computational complexity of even simple questions was missing. We prove that some fundamental problems in ecology and evolution can be precisely characterized by well-established computational complexity classes. This implies that the problems cannot be answered by simple equations. For example, there cannot be simple formulas for the fixation probability of a mutant given frequency-dependent selection in a structured population. We also show that, for example, calculating the molecular clock of neutral evolution in structured populations admit efficient algorithmic solutions.

## Abstract

There are deep, yet largely unexplored, connections between computer science and biology. Both disciplines examine how information proliferates in time and space. Central results in computer science describe the complexity of algorithms that solve certain classes of problems. An algorithm is deemed efficient if it can solve a problem in polynomial time, which means the running time of the algorithm is a polynomial function of the length of the input. There are classes of harder problems for which the fastest possible algorithm requires exponential time. Another criterion is the space requirement of the algorithm. There is a crucial distinction between algorithms that can find a solution, verify a solution, or list several distinct solutions in given time and space. The complexity hierarchy that is generated in this way is the foundation of theoretical computer science. Precise complexity results can be notoriously difficult. The famous question whether polynomial time equals nondeterministic polynomial time (i.e., P = NP) is one of the hardest open problems in computer science and all of mathematics. Here, we consider simple processes of ecological and evolutionary spatial dynamics. The basic question is: What is the probability that a new invader (or a new mutant) will take over a resident population? We derive precise complexity results for a variety of scenarios. We therefore show that some fundamental questions in this area cannot be answered by simple equations (assuming that P is not equal to NP).

Evolution occurs in populations of reproducing individuals. Mutation generates distinct types. Selection favors some types over others. The mathematical formalism of evolution describes how populations change in their genetic (or phenotypic) composition over time. Deterministic models of evolution are based on differential equations. They assume infinitely large population size and ignore demographic and other stochasticity. The more precise descriptions of evolutionary dynamics, however, use stochastic processes, which take into account the intrinsic randomness of when and where individuals reproduce and how many of their offspring survive. They also describe populations of finite size.

A well-known stochastic process of evolution was formulated by Moran in 1958 (1). In any one-time step, a random individual is chosen proportional to fitness for reproduction and a random individual is chosen for death. The offspring of the first individual is added to the population. The total population size remains constant and is given by *N*. The original process was formulated for constant fitness, which means the fitness value of individuals does not depend on the relative abundance of various types in the population; it is a fixed number. The crucial question is: What is the probability that a newly introduced mutant will generate a lineage that takes over the entire population? This quantity is called the fixation probability. For the original Moran process, there is a simple formula. If the resident has fitness 1 and the mutant has fitness *r*, then the fixation probability of the mutant is given by

The Moran process assumes that the biological population is well mixed. The offspring of any one individual can replace any other individual. If there is a spatial or social population structure, then such is not the case. The question arises as to which population structures affect the outcome of evolution, for example, by modifying the fixation probability. The classic studies of Maruyama (2) and Slatkin (3) showed that symmetrical population structures, such as regular lattices, do not change the fixation probability compared with the well-mixed population. The effect of population structure on evolution is also at the heart of the famous Wright–Fisher debate (4⇓–6).

In 2005, evolutionary graph theory was introduced as a tool to study how generalized population structures affect evolutionary outcome (7), and it has been studied in many other works (8⇓⇓⇓–12). The individuals occupy the vertices of the graph. The links (edges) determine who interacts with whom for receiving payoff and for reproduction. There can be a single graph for game dynamical interaction and evolutionary replacement, or the interaction and replacement graphs can be distinct (13). Often, the graph is held constant during evolutionary updating, but it is also possible to study dynamically changing graphs (14⇓⇓⇓⇓⇓⇓⇓–22). The original Moran process is recovered as a special case given by the complete graph. It turns out that all isothermal graphs, where all vertices have the same rate of updating (the same “temperature”), have the same fixation probability as the well-mixed population (7). All symmetrical population structures lead to isothermal graphs, but the converse is not true.

Many evolutionary contests, however, are not fought out with constant fitness. Instead, the fitness of individual types depends on the frequency (i.e., relative abundance) of types in the population. A well-known approach to frequency-dependent selection is evolutionary game theory (23⇓⇓⇓–27). Here, the fitnesses of individuals are linear functions of the frequencies. The coefficients of the linear function are the elements of the payoff matrix. Again, constant selection is a special case of frequency-dependent selection; for constant selection, all entries in a row of the payoff matrix are the same and this property holds for all rows. Evolutionary game theory has traditionally been investigated with deterministic equations describing infinitely large populations (23⇓⇓–26) and, more recently, has moved to finite population size and stochastic processes (27⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓–38).

Evolutionary games have a long history of being studied on spatial lattices (28, 39⇓⇓–42) and, more recently, on graphs (7, 27, 43, 44). The crucial quantity that needs to be calculated to evaluate natural selection is the fixation probability, *ρ*, of a newly introduced mutant that arises at a random position on the graph. If *N* is the population size, then natural selection favors the fixation of the new mutant, because a neutral mutant would have a fixation probability of

Spatial structure plays an important role in many cases. For example, spatial structure can affect the rate of neutral evolution (45), and there are results that describe which spatial structures do or do not affect the outcome of constant selection (46⇓–48). Some population structures can be amplifiers or suppressors of constant selection (7, 49, 50), meaning that they modify the intensity of selective differences. Finally, for evolutionary games, spatial structure can favor evolution of cooperation (28, 51).

The study of spatial dynamics also has a long tradition in ecology (52⇓⇓⇓–56). Here, the typical setting is that different species compete for ecological niches. Many evolutionary models are formally equivalent to ecological ones, especially if we consider only selection and not mutation. Then we can interpret the different types of evolutionary games as different species. Again, a crucial question is: What is the probability that a newly introduced species can get established or take over an ecological niche?

This paper is structured as follows. First, we give an intuitive account of the foundation of theoretical computer science. We describe classes of problems that can be solved by algorithms in certain time and space constraints. Subsequently, we present two simple problems of evolutionary dynamics in spatial settings. The first problem is motivated by a very simple ecological dynamic; the second problem is the general setting of evolutionary games on graphs. In both cases, the basic question is to calculate the takeover probability (or fixation probability) of a new type. That is, we introduce a new type in a random position in the population, and we ask what is the complexity of an algorithm that can characterize the probability that the new type takes over the population (becomes fixed). Unexpectedly, we are able to prove exact complexity results (Table 1).

The class PTIME (denoted as P) consists of problems whose solutions can be computed by an algorithm that uses polynomial time. Formally, an algorithm uses polynomial time if the running time of the algorithm grows as a polynomial function of the size of the input. In computer science, P represents the class of problems that can be solved efficiently.

The class nondeterministic polynomial time (denoted as NP) consists of problems for which solutions exist that are of polynomial length, and given a candidate for a solution of polynomial length, whether the candidate is indeed a solution can be checked in polynomial time. Therefore, an NP algorithm can verify a solution in polynomial time.

To proceed further, we need the notion of “reduction” between classes of problems. A reduction, from a given problem

A given problem is NP-hard if for every problem in NP, there is a polynomial reduction to the given problem. A problem is NP-complete if it is both NP-hard and there is an NP algorithm for the problem.

For example, consider a Boolean formula over variables, and the question of whether there exists an assignment to the variables such that the formula is true. A polynomial candidate solution is an assignment of truth values to variables, and given a candidate assignment, the formula can be evaluated in polynomial time. This question is the famous satisfiability (SAT) problem in computer science. The SAT problem is NP-complete.

The class P is contained in NP, and a major long-standing open question in computer science is whether P = NP. A polynomial-time algorithm for an NP-complete (or an NP-hard) problem would imply that P = NP, resolving the long-standing open problem.

The class sharp (*i*) every possible candidate for a solution is of polynomial length, and (*ii*) given a candidate for a solution, it can be checked in polynomial time whether the candidate is a solution. For example, given a Boolean formula, the problem of whether there are at least *k* distinct satisfying assignments to the formula is a

The class NP is contained in *k* distinct solutions (and the special case of

The class PSPACE consists of problems that can be solved with polynomial space. Note that a polynomial space algorithm can reuse space and can, in general, require exponential time. Every *P* = PSPACE, and a polynomial-time algorithm for a PSPACE-complete (or PSPACE-hard) problem would imply P = NP =

We have mentioned that the major questions about the equality of the complexity classes are open problems, but the widely believed conjecture is that P is strictly contained in NP, NP is strictly contained in

## Results

The first problem is motivated by ecological dynamics. There is an ecosystem occupied by resident species. The spatial structure of the ecosystem is given by a graph. An invading species is introduced (an illustration is provided in Fig. 2). We assume the invading species has a competitive advantage in the sense that once a position is occupied by the invading species, the resident cannot get it back. The invading species, however, has a density constraint: If the number of invaders around a focal invader is above a threshold, *h*, then the invader in the focal vertex cannot colonize another vertex.

We are interested in the probability that the invader starting from a random initial position will take over the entire ecosystem (and therefore drive the resident to extinction). There are two types of questions. The “qualitative question” is whether the takeover probability is greater than 0. The “quantitative question” is concerned with computing the takeover probability subject to a small error. Fig. 2 gives a pictorial illustration. We prove the following results. The qualitative question is NP-complete (*SI Appendix*, Theorem 4). The quantitative question is *SI Appendix*, Theorem 8).

The second problem is concerned with evolutionary games in structured populations. There are two types, *A* and *B*, whose reproductive rates depend on local interactions. We consider the setting of games on graphs. Each vertex is occupied by one individual, which is either *A* or *B*. Interactions occur pairwise with all neighbors. The payoff matrix is given by

We are interested in the probability that a single *A* individual starting in a random position on the graph generates a lineage that will take over the entire population; this probability is generally called fixation probability. As before, there are two types of questions. The qualitative question is whether the fixation probability is positive. The quantitative question is concerned with computing the fixation probability subject to a small error. We prove the following results. The qualitative question is NP-hard and in PSPACE. The quantitative question is *SI Appendix*, Theorems 4, 8, and 15.

Note that the first problem can also be obtained as a special case of the second problem. In the payoff matrix (1), we can set, for example, *B* never reproduces and type *A* reproduces until half of its neighbors are also of type *A*. This parameter choice leads to the same qualitative behavior and the same complexity bounds as described in the first problem.

A generalization of games on graphs is the setting where the interaction graph and the replacement graph are distinct (13). Thus, each individual interacts with all of its neighbors on the interaction graph to receive payoff. Subsequently, an individual is chosen for reproduction proportional to its fecundity. The offspring is placed randomly among all neighbors of the focal individual on the replacement graph. In this case, both the qualitative and quantitative questions become PSPACE-complete (*SI Appendix*, Theorem 15).

We also consider a variation of the second problem. In particular, we change the mapping from payoff to fecundity. We now assume that fecundity is an exponential function of payoff, and refer to it as exponential fitness (an illustration is provided in Fig. 4). Therefore, the fecundity of an individual is always positive (even if its payoff sum is negative). In this setting, the qualitative question can be decided in polynomial time. The reason is that the fixation probability is positive if the graph is connected. Thus, to answer the qualitative question, the algorithm only needs to check whether the graph is connected; this problem is in P. However, the quantitative question has the same complexity as the previous problem (*SI Appendix*, Theorems 16 and 17).

A very special case of games on graphs is constant selection. Type *A* has constant fecundity *a* and type *B* has constant fecundity *b* independent of any interactions (i.e., fecundity is independent of the population structure). The qualitative question concerning the fixation probability of *A* is in P. The quantitative question is in PSPACE, but any nontrivial lower bound is an open question.

Finally, although we establish computational hardness for several problems, we also show that two classic problems can be solved in polynomial time (*SI Appendix*, section 7). First, we consider the molecular clock, which is the rate at which neutral mutations accumulate over time. The molecular clock is affected by population structure (13). We show that the molecular clock can be computed in polynomial time because the problem reduces to solving a set of linear equalities, which can be achieved in polynomial time using Gaussian elimination. Second, we consider evolutionary games in a well-mixed population structure, where the underlying structure is the complete graph (38). We show that the exact fixation probability can be computed in polynomial time. In this case, the problem can be reduced to computing absorption probabilities in Markov chains, where each state represents the number of mutants. Hence, the Markov chain is linear in the number of vertices of the graphs, and because absorption probabilities in Markov chains can be computed in polynomial time (by solving a set of linear equalities), we obtain the desired result.

## Methods: Proof Ideas

We now present the key intuition and main ideas of our results. The most interesting and technically insightful results are the lower bounds (i.e., the hardness proofs), and we present the key ideas only for them.

### NP-Hardness of the Qualitative Ecological Problem.

One of the most classic NP-complete problems is the 3SAT-problem, which is the SAT problem where every clause has exactly three literals (a literal is a Boolean variable *x* or a negation of a variable

# P-Hardness of the Quantitative Ecological Problem.

A *k*. A key idea in our construction is that in a full binary tree, if the root becomes a mutant and every vertex can reproduce exactly once, then the set of mutants will eventually consist of a path from the root to a leaf, chosen uniformly at random. Our construction is then as follows: We have a start vertex where the mutant arises, which reproduces to turn each of the vertices on the left side of the bipartite graph to mutants. Each of the vertices on the left side is the root of a full binary tree, where the leaves correspond to the right side of the bipartite graph. We show that the fixation process corresponds to a perfect matching (defined from the path in the full binary trees), and given an approximation of the fixation probability, the exact number of perfect matchings of the bipartite graph can be computed.

### PSPACE-Hardness for the Game on Evolutionary Graph Problem.

Our PSPACE-hardness proof shows that the evolutionary process can solve the following concurrent-if problem, which we show is PSPACE-hard. The concurrent-if problem consists of a set of Boolean variables

## Discussion

In summary, we have established computational complexity results for some fundamental problems of ecological and evolutionary dynamics in structured populations. Our main results are summarized in Table 1. We now discuss the significance of our findings.

### Interdisciplinary Connection.

Although both computer science and biology examine the proliferation of information in time and space, the deep connection between them has been largely unexplored. Our work provides precise computational complexity results for several well-studied problems in biology and can be viewed as a step to establish a connection between the two disciplines.

### Well-Studied Open Problem.

The problems we have considered are basic aspects of well-studied questions for ecological and evolutionary dynamics in structured populations (7, 28, 37, 42, 50, 51). Several reviews have been written on this topic (8, 11, 43, 57). We first discuss the significance of an algorithmic approach in evolutionary graph theory. An efficient algorithm, which considers all (even worst-case) graphs for evolutionary processes, is important for the following reasons.

First, it has been shown that some population structures (called amplifiers) can increase the effect of natural selection (7, 51), but amplifiers are rare and constructing them is difficult (7, 11, 50, 51, 58, 59). If there were an efficient algorithmic approach that worked for all graphs, then one could design candidates for amplifiers and efficiently check their fixation probabilities. Because there exists no algorithmic approach, research has to focus on special classes of graphs to identify simple formulas, such as calculating the fixation probabilities on star-like graphs (58).

Second, it is known that some population structures and evolutionary dynamics promote evolution of cooperation but that others do not (51). An important open problem is to characterize the set of graphs that promote cooperation. An efficient algorithmic approach would be useful to check candidate structures. Because no efficient algorithm exists, one has to study special cases, for example, by considering nearly regular graphs (51).

Thus, a general algorithmic approach is a very important problem for the well-studied question concerning the effect of population structures on evolutionary dynamics. An algorithmic approach has been studied for important special cases, such as for complete graphs (60), and NP-hardness was stated for the quantitative problem (7). The review by Shakarian et al. (57) identifies the complexity of computing fixation probabilities on evolutionary graphs as an important open question in the area. In that review, two open problems (2.1 and 2.2) are identified that ask for the complexity of computing the exact fixation probabilities for graphs and for games on graphs. Our results not only present answers to those crucial questions but also show that both the approximation problem and the qualitative question are computationally hard. The most interesting aspects of our results are the lower bounds, which show that there exists no efficient algorithm in most cases, under the widely believed conjecture that P is different from NP. A simple equation-based solution would give an efficient algorithm; thus, our result formally shows that for evaluating the fixation probability in spatial settings, there does not exist a simple equation-based solution in general. Our results are significant for the following reasons: (*i*) They establish the computational complexity for fundamental problems of ecological and evolutionary dynamics in structured populations (e.g., considered in 7, 8, 11, 28, 37, 42, 43, 50, 51, 57), and (*ii*) they significantly improve the complexity result of Lieberman et al. (7) and solve the computational complexity questions of the area as identified in the review by Shakarian et al. (57).

### Methodological Insight.

Our proof ideas also reveal some important points. We show how evolutionary processes in structured populations can mimic aspects of computation. This insight could be useful for future research on understanding the computational complexities of other stochastic processes on population structures.

## Footnotes

- ↵
^{1}To whom correspondence should be addressed. Email: ribsen{at}ist.ac.at.

Author contributions: R.I.-J., K.C., and M.A.N. designed research, performed research, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1511366112/-/DCSupplemental.

## References

- ↵
- ↵
- ↵
- ↵.
- Wright S

- ↵.
- Wright S

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- Skyrms B,
- Pemantle R

- ↵
- ↵
- ↵.
- Antal T,
- Ohtsuki H,
- Wakeley J,
- Taylor PD,
- Nowak MA

- ↵.
- Tarnita CE,
- Antal T,
- Ohtsuki H,
- Nowak MA

- ↵
- ↵
- ↵.
- Rand DG,
- Arbesman S,
- Christakis NA

- ↵
- ↵.
- Smith JM

- ↵.
- Hofbauer J,
- Sigmund K

- ↵.
- Hofbauer J,
- Sigmund K

- ↵.
- Cressman R

- ↵.
- Broom M,
- Rychtář J

*Chapman & Hall/CRC Mathematical and Computational Biology*(Chapman & Hall/CRC, Boca Raton, FL) - ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵.
- van Veelen M,
- García J,
- Rand DG,
- Nowak MA

- ↵
- ↵.
- Killingback T,
- Doebeli M

- ↵
- ↵
- ↵
- ↵.
- Nowak MA,
- Tarnita CE,
- Antal T

- ↵
- ↵
- ↵
- ↵.
- Maruyama T

- ↵
- ↵.
- Nowak MA,
- Michor F,
- Iwasa Y

- ↵.
- Nowak MA

- ↵
- ↵
- ↵.
- Levin SA,
- Paine RT

- ↵
- ↵.
- Tilman D,
- Kareiva PM

- ↵.
- Dieckmann U,
- Law R,
- Metz JAJ

- ↵
- ↵.
- Adlam B,
- Chatterjee K,
- Nowak MA

*Proc R Soc Lond A Math Phys Sci*471(2181):20150114 - ↵
- ↵

## Citation Manager Formats

## Article Classifications

- Biological Sciences
- Biophysics and Computational Biology