## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# Exact solution for a metapopulation version of Schelling’s model

Contributed by Richard Durrett, August 4, 2014 (sent for review July 23, 2014)

## Significance

More than 40 y ago, Schelling introduced one of the first agent-based models in the social sciences. The model showed that even if people only have a mild preference for living with neighbors of the same color, complete segregation will occur. This model has been much discussed by social scientists and analyzed by physicists using analogies with spin-1 Ising models and other systems. Here, we study the metapopulation version of the model, which mimics the division of a city into neighborhoods, and we present the first analysis to our knowledge that gives detailed information about the structure of equilibria and explicit formulas for their densities.

## Abstract

In 1971, Schelling introduced a model in which families move if they have too many neighbors of the opposite type. In this paper, we will consider a metapopulation version of the model in which a city is divided into *N* neighborhoods, each of which has *L* houses. There are *ρNL* red families and *ρNL* blue families for some *ρ* < 1/2. Families are happy if there are ≤*ρ*_{c}*L* families of the opposite type in their neighborhood and unhappy otherwise. Each family moves to each vacant house at rates that depend on their happiness at their current location and that of their destination. Our main result is that if neighborhoods are large, then there are critical values *ρ*_{b} < *ρ*_{d} < *ρ*_{c}, so that for *ρ* < *ρ*_{b}, the two types are distributed randomly in equilibrium. When ρ > *ρ*_{b}, a new segregated equilibrium appears; for *ρ*_{b} < *ρ* < *ρ*_{d}, there is bistability, but when *ρ* increases past *ρ*_{d} the random state is no longer stable. When *ρ*_{c} is small enough, the random state will again be the stationary distribution when *ρ* is close to 1/2. If so, this is preceded by a region of bistability.

In 1971, Schelling (1) introduced one of the first agent-based models in the social sciences. Families of two types inhabit cells in a finite square, with 25–30% of the squares vacant. Each family has a neighborhood that consists of a 5 × 5 square centered at their location. Schelling used a number of different rules for picking the next family to move, but the most sensible seems to be that we pick a family at random on each step. If the fraction of neighbors of the opposite type is too large, then they move to the closest location that satisfies their constraints. Schelling simulated this and many other variants of this model (using dice and checkers) to argue that if people have a preference for living with those of their own color, the movements of individual families invariably led to complete segregation (2).

As Clark and Fossett (3) explain “The Schelling model was mostly of theoretical interest and was rarely cited until a significant debate about the extent and explanations of residential segregation in US urban areas was engaged in the 1980s and 1990s. To that point, most social scientists offered an explanation that invoked housing discrimination, principally by whites.” At this point Schelling’s article has been cited more than 800 times. For a sampling of results from the social sciences literature, see Fossett’s lengthy survey (4) or other more recent treatments (5⇓–7). About 10 y ago, physicists discovered this model and analyzed a number of variants of it using techniques of statistical mechanics (8⇓⇓⇓⇓⇓–14). However, to our knowledge, the only rigorous work is ref. 15, which studies the 1D model in which the threshold for happiness is *ρ*_{c} = 0.5 and two unhappy families within distance *w* swap places at rate 1.

Here, we will consider a metapopulation version of Schelling’s model in which there are *N* neighborhoods that have *L* houses, but we ignore spatial structure within the neighborhoods and their physical locations. We do this to make the model analytically tractable, but these assumptions are reasonable from a modeling point of view. Many cities in the United States are divided into neighborhoods that have their own identities. In Durham, these neighborhoods have names like Duke Park, Trinity Park, Watts-Hillendale, Duke Forest, Hope Valley, Colony Park, etc. They are often separated by busy roads and have identities that are reinforced by e-mail newsgroups that allow people to easily communicate with everyone in their neighborhood. Because of this, it is the overall composition of the neighborhood that is important not just the people who live next door. In addition, when a family decides to move they can easily relocate anywhere in the city.

Families, which we suppose are indivisible units, come in two types that we call red and blue. There are *ρNL* of each type, leaving (1 − 2*ρ*)*NL* empty houses. This formulation was inspired by Grauwin et al. (16), who studied segregation in a model with one type of individual whose happiness is given by a piecewise linear unimodal function of the density of occupied sites in their neighborhood. To define the rules of movement, we introduce the threshold level *ρ*_{c} such that a neighborhood is happy for a certain type of agent if the fraction of agents of the opposite type is ≤*ρ*_{c}. For each family and empty house, movements occur at rates that depend on the state of the source and destination houses:

where *q*, *r* < 1, and *ϵ* are small, e.g., 0.1 or smaller. Because there are *O*(*NL*) vacant houses, dividing the rates by *NL* makes each family moves at a rate *O*(1). Because *ϵ* is small, happy families are very reluctant to move to a neighborhood in which they would be unhappy, whereas unhappy families move at rate 1 to neighborhoods that will make them happy. As we will see later, the equilibrium distribution does not depend on the values of the rates *q* and *r* for transitions that do not change a family’s happiness. We do not have an intuitive explanation for this result.

## Convergence to a Deterministic Limit

To describe the dynamics more precisely, let *n*_{i,j}(*t*), be the number of neighborhoods with *i* red and *j* blue families for (*i*, *j*) ∈ Ω = {(*i*, *j*): *i*, *j* ≥ 0, *i* + *j* ≤ *L*}. The configuration of the system at time *t* can be fully described by the numbers *L* fixed and let *N* → ∞, then

Motivated by individual-based models in finance, Remenik (17) proved a general result that takes care of our example. To describe the limit, we need some notation. Let λ(*a*_{1}, *b*_{1}; *a*_{2}, *b*_{2}) be *N* times the total rate of movement from one (*a*_{1}, *b*_{1}) neighborhood to one (*a*_{2}, *b*_{2}) neighborhood. Let *N* times the rate at which a movement from one ω_{1} = (*a*_{1}, *b*_{1}) neighborhood to one *ω*_{2} = (*a*_{2}, *b*_{2}) neighborhood turns the pair ω_{1}, ω_{2} into *SI Text*.

**Theorem 1.** *As N* → ∞ *the* *converge in probability to the solution* *of the ordinary differential equation:**ω*′ because its value is determined by *ω*_{1} and *ω*_{2}. The first term comes from the fact that a migration from (*i*, *j*) → *ω* or *ω* → (*i*, *j*) destroys an (*i*, *j*) neighborhood, whereas the second reflects the fact that a migration *ω* → *ω*′ may create an (*i*, *j*) neighborhood at the source or at the destination.

## Special Case *L* = 2

To illustrate the use of Theorem 1, we consider the case *L* = 2, and let ℓ_{c} = [*ρ*_{c}*L*] be the largest number of neighbors of the opposite type that allows a family to be happy. Here, [*x*] is the largest integer ≤*x*. When *L* = 2, a neighborhood with both types of families must be (1, 1), so the situation in which ℓ_{c} ≥ 1 is trivial because there are never any unhappy families. In the case *L* = 2 and ℓ_{c} = 0, it is easy to find the equilibrium because there is detailed balance, i.e., the rate of each transition is exactly balanced by the one in the opposite direction.*SI Text*), we find that this holds if and only if*r* has nothing to do with the fixed point, but if you look at the first two equations you see that the *r* appears on both sides. The parameter *q* does not appear either, but when *L* = 2, it is for the trivial reason that transitions (1, 1)(1, 0) → (1, 0)(1, 1), which occur at rate *q*, do not change the state of the system.

Using now the fact that the equilibrium must preserve the red and blue densities, we can solve for *x* and *y* to conclude that*SI Text* that it is the only fixed point. Because the formulas, which result from solving a quadratic equation, are somewhat complicated, Fig. 1 shows how the equilibrium probabilities *ν*_{i,j} vary as a function of *ρ*.

Unfortunately, when *L* ≥ 3, there is no stationary distribution that satisfies detailed balance. One can, of course, solve for the stationary distribution numerically. Fig. 2 shows the limit behavior of the system with *L* = 20 and *ρ*_{c} = 0.3, i.e., ℓ_{c} = 6 for initial densities *ρ* = 0.1, 0.2, 0.25, and 0.35. In the first two cases, most of the families are happy. In the third situation, the threshold is ℓ_{c} = 6, whereas the average number of reds and blues per neighborhood is five, but because fluctuations in the makeup of neighborhoods can lead to unhappiness, there is a tendency toward segregation. In the fourth case, segregation is almost complete, with most neighborhoods having 0 or 1 of the minority type.

## Outline of Our Solution

Finding the stationary distribution requires solving roughly *L*^{2}/2 equations. To be precise, there are 231 equations when *L* = 20 and 5,151 when *L* = 100. In this section, we will adopt a different approach: we concentrate on the evolution of neighborhood 1 and consider the other *N* − 1 neighborhoods to be its environment, which can be summarized by the following four parameters: (*i*) the average number of happy red and blue families per neighborhood, *ii*) the average number of vacant sites happy for red or blue,

The first in our solution is to identify the stationary distribution of the neighborhood–environment chain that are self-consistent. That is, if we calculate the expected values of *H* and 1 for *U*, we have

If we let *Tri*(*p*_{R}, *p*_{B}) be the trinomial distribution*Q*_{k,}_{ℓ}, the detailed balance condition is satisfied by *Tri*(*p*_{R}, *p*_{B}) where*α*_{k,}_{ℓ} and *β*_{k,}_{ℓ} are given in *SI Text*.

Unfortunately, there is no stationary distribution that satisfies detailed balance on the entire state space. To verify this, we note that if the stationary distribution π and the jump rates *q* satisfy detailed balance π(*x*)*q*(*x*, *y*) = π(*y*)*q*(*y*, *x*), then around any closed path *x*_{0}, *x*_{1},…*x*_{n} = *x*_{0} with *q*(*x*_{i−1}, *x*_{i}) > 0 for 1 ≤ *i* ≤ *n*, we must have

**Assumption 1.** *Stationary distributions are symmetric under interchange of red and blue.*

The answer given in the next section is a one-parameter family of stationary distributions indexed by *a* ∈ [0, 1/2]. There, and in the next two steps, we have a pair of results: one for *ρ* < *ρ*_{c} and one for *ρ* > *ρ*_{c}.

In the second step, we investigate the flow of probability between quadrants when transitions between the quadrants are restored. The key idea is that the measures in each quadrant are trinomial, so the probabilities will decay exponentially away from the mean (*p*_{R}*L*, *p*_{B}*L*). This observation implies that the flow between quadrants occurs at rate exp(−*cL*), which is much smaller than the time, *O*(1), it takes the probability distributions to reach equilibrium. In words, the process comes to equilibrium on a fast time scale, whereas the parameters change on a much slower one. We will prove this separation of time scales in a version of the paper for a mathematical audience. Here, we will only give the answers that result under

**Assumption 2.** *The process is always in one of self-consistent stationary distributions, but the value of a changes over time.*

The third step is to use the stability results to show that the only possible stable equilibria are at the following end points: *a* = 0, which represents a random distribution; and *a* = 1/2, which represents a segregated state.

We will call these measures *μ*_{r} and *μ*_{s} when *ρ* < *ρ*_{c} and *ρ* > *ρ*_{c}. Theorems 4A and 4B describe when they are stable fixed points.

## Self-Consistent Stationary Distributions

The results given here are proved in *SI Text*. The formulas, which again come from solving a quadratic equation, are not simple but they are explicit and easily evaluated.

**Theorem 2A.** *Suppose ρ* < *ρ*_{c}*. For a* ∈ (0, 1/2] *let**where**ρ*_{1}(0, *ρ*) = lim_{a→0} *ρ*_{1}(*a*, *ρ*) = *ρ*/[1 − *ρ*(1 − *ϵ*)] and for *a* ∈ [0, 1/2] let

*A symmetric distribution μ is self-consistent if and only if it has the form above with parameters* *ρ*_{1} *>* *ρ*_{c}*,* *ρ*_{2} *=* *ϵρ*_{1} *<* *ρ*_{c}*,* *and* *ρ*_{0} *=* *ρ*_{1}*/[1 + (1 −* *ϵ**)ρ _{1}] <*

*ρ*

_{c}

*.*

To clarify the last sentence: the definition of *ρ*_{1} does not guarantee that the three conditions are satisfied for all values of *a* ∈ [0, 1/2], so the inequalities are additional conditions. As shown in *SI Text*, *a* → *ρ*_{1}(*a*, *ρ*) is increasing, so the range of possible values of *ρ*_{1} for a fixed value of *ρ* is

The possible self-consistent stationary distributions are similar in the second case but the formulas are different.

**Theorem 2B.** *Suppose ρ* ≥ *ρ*_{c}*. For a* ∈ (0, 1/2] *let**where**a* ∈ [0, 1/2] let*A symmetric distribution* *is self-consistent if and only if it has the form above with parameters* *and*

This time *a* → *ρ*_{1}(*a*, *ρ*) is decreasing, so the range of possible values of *ρ*_{1} for a fixed value of *ρ* is*ρ*_{1} in **4** has become the lower bound. See Fig. 3 for a picture.

## Stability Calculations

The results in this section are proved in *SI Text*. Using large deviations for the trinomial distribution, which in this case is just calculating probabilities using Stirling’s formula, we conclude the following:

**Theorem 3A.** *Suppose ρ* < *ρ*_{c} *and recall μ*_{a} *has no mass on Q*_{1,1}. *The flow into Q*_{0,0} *from Q*_{0,1} *and Q*_{1,0} *is larger than the flow out if and only if***Theorem 3B.** *Suppose ρ* ≥ *ρ*_{c} *and recall* *has no mass on Q*_{0,0}. *The flow out of Q*_{1,1} *to Q*_{0,1} *and Q*_{1,0} *is larger than the flow in if and only if*

## Phase Transition

Combining Theorems 2A and 3A, we can determine the behavior of the process for *ρ* < *ρ*_{c}. The set of possible values for *ρ*_{1}(*a*, *ρ*) for a fixed *ρ* is the interval [*ρ*_{1}(0, *ρ*), *ρ*_{1}(1/2, *ρ*)] given in **4**. Because 0 ≤ *ρ* ≤ *ρ*_{c}, we are looking for a solution to*x*_{0} ∈ [0, 2*ρ*_{c}/(1 + *ϵ*)). In *SI Text*, we show that *x*_{0} exists and is unique. Here, we will concentrate on what happens in the example *ρ*_{c} = 0.2 and *ϵ* = 0.1. When *ρ*_{c} = 0.2, the interval is [0, 0.4/1.1], and we have *x*_{0} = 0.2183.

Let *ρ*_{b} be chosen so that *x*_{0} = *ρ*_{1}(1/2, *ρ*_{b}) and *ρ*_{d} be chosen so that *x*_{0} = *ρ*_{1}(0, *ρ*_{d}). See Fig. 3 for a picture. When a solution *x*_{0} exists in the desired interval*ρ*_{b} = 0.1201 and *ρ*_{d} = 0.1825.

**Theorem 4A.** *The stable stationary distributions for ρ* < *ρ*_{c} *are**ρ* < *ρ*_{b}, *ρ*_{0} > *ρ*_{1}(1/2, *ρ*), the flow into *Q*_{0,0} is always larger than the flow out, so *μ*_{r} is the stationary distribution. When *ρ*_{b} < *ρ* < *ρ*_{d}, there will be an *a*_{c} ∈ (0, 1/2) so that *ρ*_{1}(*a*_{c}, *ρ*) = *ρ*_{0}. The flow into *Q*_{0,0} is larger than the flow out when *a* < *a*_{c} and the *a* in the mixture will decrease, whereas for *a* > *a*_{c}, the flow out of *Q*_{0,0} will be larger than the flow in and *a* will increase. Thus, we have stable fixed points at 0 and 1/2. When *ρ*_{d} < *ρ* < *ρ*_{c}, *ρ* < *ρ*_{1} (0, *ρ*), the flow out of *Q*_{0,0} is always larger than the flow in, and the segregated fixed point with *a* = 1/2 is the stationary distribution.

Using Theorems 2B and 3B, we can determine the behavior of the process for *ρ* ≥ *ρ*_{c}. The set of possible values for *ρ* is the interval **6**. Because *ρ*_{c} ≤ *ρ* ≤ 0.5, we are looking for a solution to*SI Text*, we show that there is a solution in the desired interval if and only if **8** at the right end point 1/(1 + *ϵ*). When *ϵ* = 0.1, the condition is *ρ*_{c} < 0.25964.

When *ρ*_{c} = 0.2 and *ϵ* = 0.1, this interval is [0.4/1.1, 1/1.1], and

**Theorem 4B.** *The stable stationary distributions for ρ* ≥ *ρ*_{c} *are**ρ*_{c} when *ϵ* = 0.1.

## Discussion

Here, we considered a metapopulation version of Schelling’s model, which we believe is a better model for studying the dynamics of segregation in a city than a nearest neighborhood interaction on the 2D square lattice. Due to the simple structure of the model, we are able to describe the phase transition in great detail. For *ρ* < *ρ*_{b}, a random distribution of families *μ*_{r} = Tri(*ρ*, *ρ*) is the unique stationary distribution. As *ρ* increases there is a discontinuous phase transition to a segregated state, *μ*_{s} at *ρ*_{d} preceded by an interval (*ρ*_{b}, *ρ*_{d}), in which both *μ*_{r} and *μ*_{s} are stable. Surprisingly the phase transition occurs to a segregated state occurs at a value *ρ*_{d} < *ρ*_{c}, i.e., at a point where in a random distribution most families are happy. This shift in behavior occurs because random fluctuations create segregated neighborhoods, which, as our analysis shows, are more stable than the random ones.

If *ρ*_{c} is small enough, then as *ρ* nears 1/2, there is another discontinuous transition at *ρ*_{c} = 0.2 and *ϵ* = 0.1, the fraction of vacant houses at

The results in this paper have been derived under two assumptions: (*i*) stationary distributions are invariant under interchange of red and blue and (*ii*) the process is always in one of a one-parameter family of self-consistent stationary distributions indexed by *a* ∈ [0,1/2], but the value of *a* changes over time. We are confident that *ii* can be rigorously proved. Removing *i* will be more difficult, because when symmetry is dropped, there is a two-parameter family of self-consistent distributions. A more interesting problem, which is important for applications to real cities, is to allow the initial densities of reds and blues and their threshold for happiness to differ. Although our solution is not yet complete, we believe it is an important first step in obtaining a detailed understanding of the equilibrium behavior of Schelling’s model in a situation that is relevant for applications.

## Acknowledgments

We thank David Aldous, Nicholas Lanchier, Simon Levin, and Thomas Liggett for helpful comments. R.D. and Y.Z. were partially supported by Grants DMS 10-05470 and DMS13-05997 from the probability program at the National Science Foundation.

## Footnotes

- ↵
^{1}To whom correspondence should be addressed. Email: rtd{at}math.duke.edu.

Author contributions: R.D. and Y.Z. performed research and wrote the paper.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1414915111/-/DCSupplemental.

## References

- ↵
- ↵
- Schelling TC

- ↵
- Clark WAV,
- Fossett M

- ↵
- ↵
- ↵
- Kandler A,
- Perreault C,
- Steele J

*Adv Complex Syst*15:1203001. - ↵
- Hatna E,
- Benenson I

- ↵
- Vinkovic D,
- Kirman A

- ↵
- ↵
- Singh A,
- Vainchtein D,
- Weiss H

- ↵
- Dall’Asta L,
- Castellano C,
- Marsili M

*J Stat Mechanics: Theory and Experiment*7:L07002. - ↵
- ↵
- Rogers T,
- McKane AJ

*J Stat Mechanics: Theory and Experiment*7:P07006. - ↵
- ↵
- Brandt C,
- Immorlica N,
- Kamath G,
- Kleinberg R

- ↵
- Grauwin S,
- Bertin E,
- Lemoy R,
- Jensen P

- ↵

## Citation Manager Formats

## Sign up for Article Alerts

## Article Classifications

- Physical Sciences
- Applied Mathematics

- Social Sciences
- Social Sciences