# Spatial embedding of structural similarity in the cerebral cortex

Edited by Robert Desimone, Massachusetts Institute of Technology, Cambridge, MA, and approved October 15, 2014 (received for review July 24, 2014)

## Significance

The cerebral cortex can be divided into a number of distinct areas on the basis of anatomy and function. Understanding the complex pattern of connections among these areas is essential to uncovering how the brain performs its distributed computations. We report a systematic relation between the connectivity and functional similarity of cortical areas in the monkey, human, and mouse cortex. Motivated by observations that the cortical areal network is densely connected and that connections have a strong dependence on wiring length, we introduce a spatially embedded, generative model of the areal network that accounts for many observed features of cortical connectivity.

## Abstract

Recent anatomical tracing studies have yielded substantial amounts of data on the areal connectivity underlying distributed processing in cortex, yet the fundamental principles that govern the large-scale organization of cortex remain unknown. Here we show that functional similarity between areas as defined by the pattern of shared inputs or outputs is a key to understanding the areal network of cortex. In particular, we report a systematic relation in the monkey, human, and mouse cortex between the occurrence of connections from one area to another and their similarity distance. This characteristic relation is rooted in the wiring distance dependence of connections in the brain. We introduce a weighted, spatially embedded random network model that robustly gives rise to this structure, as well as many other spatial and topological properties observed in cortex. These include features that were not accounted for in any previous model, such as the wide range of interareal connection weights. Connections in the model emerge from an underlying distribution of spatially embedded axons, thereby integrating the two scales of cortical connectivity—individual axons and interareal pathways—into a common geometric framework. These results provide insights into the origin of large-scale connectivity in cortex and have important implications for theories of cortical organization.

The cerebral cortex can be divided into a number of distinct areas according to anatomy and function. Understanding the complex pattern of connections among these areas is necessary for elucidating the structural basis of distributed processing in the brain, yet the principles that govern this large-scale organization are not fully understood. In particular, although cortical organization has been characterized extensively within the framework of “complex networks” (1, 2), there are few generative models that explain how the observed features of areal connectivity may arise (3–5).

In contrast to the large and sparsely connected architecture of many networks (including the neuronal network of the brain), cortical areal networks are relatively small and densely connected: the mouse neocortex consists of roughly 40 areas per hemisphere with ∼50% of the possible connections present (6, 7), and the macaque neocortex consists of roughly 100 areas per hemisphere (8) with ∼60% of the connections present (9). In such networks, the properties that define conventional complex networks—the degree distribution (number of areas connected to an area), average path length (smallest number of connected steps between a pair of areas), and clustering (density of connections among areas connected to the same area)—are not very informative. For instance, in a network with overall connection density

Fundamentally, existing complex network models provide an incomplete description of the cortical areal network because they neglect the underlying spatial structure that shapes connectional topology (4, 5, 13, 14). Recent data indicate that interareal pathways consist of spatially heterogeneous axonal projections whose distribution exhibits two striking properties: first, the number of axons that project from one area to another varies over several orders of magnitude across cortex (7, 15), and second, connections between different cortical areas correspond to only a small fraction of all axons, most of which project to within the same area and contribute to the local circuit (9, 16). Thus, a proper description of the cortical network requires integrating interareal pathways and the axons that compose them into a common framework encompassing both global and local structure.

To address these challenges, we present an analysis of cortical connectivity that reveals a key organizing principle common to three mammalian species with widely different brain sizes, namely monkeys, humans, and mice. Our starting point was that connectional similarity of pairs of areas, defined by their shared inputs or outputs (“connectional fingerprints”), reflects the functional organization of cortex (17–19). A major finding of this work is a systematic relation between the occurrence of connections from one area to another and their similarity distance as defined by the amount of shared connections. This relation, which is observed in all datasets we examined, has its basis in the dependence of connections on wiring distance (4). We developed a simple, spatially embedded random network model that gives rise to the structures revealed by our analysis. Notably, to our knowledge, it is the first generative model that relates binary features to the underlying weighted structure through a well-defined geometric coarse-graining.

The model we propose reproduces numerous spatial and topological properties of the macaque cortex. These include graph-theoretic measures such as degree sequence, path length, clustering coefficient, and motif distribution, as well as features that were not accounted for in any previous model, such as the wide range of interareal connection weights. Because our model cortex is embedded in a spatial continuum, moreover, it can be partitioned into regions with an arbitrary spatial resolution. Such an approach makes it possible to study the same axonal network using alternative methods of parcellation, an important issue for modeling data obtained from diffusion-based tractography at varying spatial scales (20). Taken together, our results provide a promising direction for the investigation of the structure and origin of areal connectivity in cortex.

## Results

### Similarity Distance.

We first investigated the binary connection structure of the macaque cortex, the proper understanding of which is critical to developing a theory of its weighted connectivity. The binarized intrahemispheric connection matrix *C*, where $C_{ts} = 1$ if there is a connection from source area *s* to target area *t* and 0 otherwise (Fig. 1*A*), was obtained from recent retrograde tracer injections in 29 representative target areas and labeling in all 91 source areas (9). The connectivity is dense: 62% of the possible interareal connections are present, which is a much higher proportion than in previously available data [e.g., from CoCoMac (17, 21)] as a result of the consistency of the cortical parcellation, comprehensive hemispheric examination, and optimized tract-tracing methods (22).

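We used “cosine similarity distance” to quantify the amount of shared outputs between areas *s* and *s′*:

$$d_{ss'} = 1 - \frac{\sum_t C_{ts}\, C_{ts'}}{\sqrt{\sum_t C_{ts}^2}\,\sqrt{\sum_t C_{ts'}^2}}, \tag{1}$$

where $C_{ts} = 1$ if source area *s* projects to target area *t* and 0 otherwise, and the sum over *t* runs through the target areas excluding *s* and *s′*. With this definition, two areas with identical outputs are separated by a distance of 0, and two areas with no shared outputs by a distance of 1.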

Because the connection matrix in Fig. 1*A* reflects injections into a subset of all cortical areas, it is a priori unclear how reliable the resulting similarity distances are. To ascertain this, we computed the analogously defined input similarity distances, whose “true” values can be determined from Fig. 1*A*, but only between pairs of injected areas. A comparison of these values with those obtained by sampling (*SI Appendix*, Fig. S1) suggests that output similarity distances computed from the sampled target areas are representative of the true distances we would obtain if the full connection matrix were available. Moreover, cosine similarity distance is only one of many possible measures of shared connections, or “structural equivalence” (23, 24), and our results are qualitatively the same for other normalized measures (*SI Appendix*).
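As an illustration, the cosine similarity distance of Eq. **1** can be computed directly from a binary connection matrix. The following is a minimal sketch, not the authors' code. It assumes that the matrix is stored with target areas as rows and source areas as columns, and that a hypothetical array `target_ids` gives, for each row, the column index of the same area, so that rows corresponding to *s* and *s′* can be excluded from the sum.

```python
import numpy as np

def output_similarity_distance(C, target_ids):
    """Cosine similarity distance between the output patterns of all source pairs.

    C          : (n_targets, n_sources) binary array, C[t, s] = 1 if s projects to t
                 (storage convention assumed for this sketch).
    target_ids : for each row t, the source (column) index of that same area.
    Returns an (n_sources, n_sources) symmetric array; NaN where a pattern is empty.
    """
    C = np.asarray(C, dtype=float)
    target_ids = np.asarray(target_ids)
    n_sources = C.shape[1]
    D = np.full((n_sources, n_sources), np.nan)
    for s in range(n_sources):
        for s2 in range(s + 1, n_sources):
            # Exclude the target rows that correspond to s and s2 themselves.
            keep = (target_ids != s) & (target_ids != s2)
            u, v = C[keep, s], C[keep, s2]
            norm = np.linalg.norm(u) * np.linalg.norm(v)
            if norm > 0:
                D[s, s2] = D[s2, s] = 1.0 - (u @ v) / norm
    np.fill_diagonal(D, 0.0)
    return D

# Toy example: 3 injected target areas (areas 0, 1, 2) and 4 source areas.
C_toy = np.array([[0, 1, 1, 1],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1]])
D_toy = output_similarity_distance(C_toy, target_ids=[0, 1, 2])
```

In the toy matrix, areas 0 and 1 share their only remaining target and therefore have distance 0, whereas areas 0 and 3 share one of two targets and have distance $1 - 1/\sqrt{2}$.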

### Structural Similarity and the Functional Organization of Cortex.

As a measure of the amount of shared outputs, the similarity distance between two areas indicates their functional similarity. This is revealed in a classical multidimensional scaling (MDS) analysis of “similarity space,” which places areas in a Euclidean space most compatible with the interareal distance relations given by Eq. **1** (Fig. 1*B*). Although we must be cautious in drawing detailed conclusions from such visualizations, they capture the broad functional layout of cortex. For instance, in this map, many sensory and motor areas are grouped by function and located along the periphery, with primary sensorimotor areas at the edge and areas involved in “higher-level” processing closer to the center. The most highly connected (and interconnected) areas are placed at the origin of the MDS map due to the requirement that they be close to many areas. Interestingly, the center of the map also comprises areas from physically distant parts of cortex, showing that the relation between similarity distance and anatomical distance is not trivial [their correlation is significant (Fig. 3*A*, *Inset*)].
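For readers unfamiliar with classical MDS, the embedding step can be sketched in a few lines. This is the standard textbook procedure (double centering of squared distances followed by an eigendecomposition), shown here under the assumption that a complete, symmetric distance matrix without missing entries is available; it is not taken from the authors' analysis code.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS of a symmetric distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)     # guard against small negative values
    return eigvecs[:, order] * np.sqrt(lam)      # (n, n_components) coordinates

# Sanity check on points whose distances are exactly Euclidean in 2D.
rng = np.random.default_rng(0)
pts = rng.normal(size=(6, 2))
D_true = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
X = classical_mds(D_true, n_components=2)
D_embedded = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
```

Similarity distances are generally not exactly Euclidean, so a two-dimensional map only approximates them, which is one reason for the caution expressed above.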

The definition in Eq. **1** can also be applied to the weighted connection matrix from which the binarized matrix was derived. The result (*SI Appendix*, Fig. S2) is a more clustered map with fewer areas in the center, which can be understood from the fact that the connection weights vary over five orders of magnitude across cortex (15). Thus, the similarity distances and resulting MDS map are dominated by the strongest connections, many of them between nearby areas (19).

We applied the same analysis to a recently published connection matrix of the mouse cortex (6). The known functional modules of the mouse cortex are again grouped in the MDS map of similarity space, and suggest certain homologies with the macaque cortex (Fig. 1*C*; see *SI Appendix*, Fig. S3 for all modules). *SI Appendix*, Fig. S4 presents the corresponding map of human cortical connectivity obtained from diffusion spectrum imaging (DSI) (25), for which we can additionally incorporate interhemispheric connections.

### Similarity Distance and Connectivity.

It was previously observed that similarity distances are correlated with the presence of connections (26), which is intuitively understood from the fact that areas separated by small similarity distances tend to be functionally similar, and functionally similar areas are likely to be connected. Indeed, we found that a systematic relation exists between the occurrence of a connection from one cortical area to another and the similarity distance between them.

We established this in two different ways (Fig. 2*A*, *Left*): first, by binning the similarity distances and computing for each bin the fraction of all possible connections between pairs of areas that are present in the data, and second, by applying binomial regression with a probability of connection—a modified logistic function—that depends on the similarity distance (*SI Appendix*). This relation is a natural generalization of clustering, which quantifies the connectedness of a set of areas having one shared connection (12) but is uninformative in a highly connected network. The results are in strong contrast to an uncorrelated model such as the Erdős–Rényi (ER) random network (red), in which every connection is present with an independent probability that is trivially a constant function of the similarity distance. Furthermore, similarity distances in the macaque cortex have a smooth and broad distribution (*Inset*). This is again in contrast to an ER network, for which the distribution can be calculated theoretically (*SI Appendix*) and is narrowly peaked around the mean because every area looks equally similar to every other area (*Inset*, red).
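The binning procedure, and the contrast with an ER network, can be sketched as follows. This is an illustrative simplification: the function names are hypothetical, the network is taken to be a full square matrix, and the cosine distance is computed without excluding the rows of the two areas being compared. The regression with a modified logistic function is not reproduced here because its form is given only in *SI Appendix*.

```python
import numpy as np

def cosine_distance_matrix(C):
    """Pairwise cosine distance between columns of C (no self-exclusion, for brevity)."""
    C = np.asarray(C, dtype=float)
    norms = np.linalg.norm(C, axis=0)
    return 1.0 - (C.T @ C) / np.outer(norms, norms)

def fraction_connected_by_bin(d, connected, edges):
    """Fraction of area pairs that are connected within each half-open bin [lo, hi)."""
    d = np.asarray(d, dtype=float)
    connected = np.asarray(connected, dtype=float)
    idx = np.digitize(d, edges) - 1
    frac = np.full(len(edges) - 1, np.nan)       # NaN marks empty bins
    for b in range(len(edges) - 1):
        in_bin = idx == b
        if in_bin.any():
            frac[b] = connected[in_bin].mean()
    return frac

def er_example(n=91, p=0.62, seed=0):
    """Similarity distances and connections for an ER network of density p."""
    rng = np.random.default_rng(seed)
    C = (rng.random((n, n)) < p).astype(float)   # C[t, s] = 1 if s projects to t
    np.fill_diagonal(C, 0.0)
    D = cosine_distance_matrix(C)
    off = ~np.eye(n, dtype=bool)
    return D[off], C.T[off]                      # distance of (s, s') and connection s -> s'

d_er, c_er = er_example()
frac_er = fraction_connected_by_bin(d_er, c_er, np.linspace(0.0, 1.0, 21))
```

For two independent binary patterns of density *p*, the cosine similarity is approximately *p*, so the ER distances concentrate near 1 − *p*, and the fraction of connected pairs shows no systematic dependence on distance.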

An interesting feature of the connection matrix of Fig. 1*A*, specifically the edge-complete subnetwork for which both inputs and outputs are known, is that many areal pairs are not reciprocally connected (9). If we assume that connections between two areas with similarity distance *d* occur with independent probability $p(d)$, then the relation in Fig. 2*A* (*Left*) can be used to predict the occurrence of unidirectional [probability $2p(d)\,(1 - p(d))$] and bidirectional [probability $p(d)^2$] connections (Fig. 2*A*, *Right*). Moreover, although we have focused on output similarity distances to include all 91 areas in the analysis, Fig. 2*A* also holds if input similarity distances are used instead (Fig. 2*B*).
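The independence assumption leads to a simple calculation. If each direction of a pair is present independently with probability *p*, the pair is unconnected with probability (1 − *p*)², connected in exactly one direction with probability 2*p*(1 − *p*), and reciprocally connected with probability *p*². A minimal sketch (the function name is hypothetical):

```python
import numpy as np

def reciprocity_probabilities(p):
    """Pair-level outcome probabilities when each direction is present independently."""
    p = np.asarray(p, dtype=float)
    return {
        "none": (1.0 - p) ** 2,
        "unidirectional": 2.0 * p * (1.0 - p),
        "bidirectional": p ** 2,
    }

# Example: connection probabilities at three similarity distances (illustrative values).
probs = reciprocity_probabilities([0.9, 0.5, 0.1])
```

The three probabilities sum to 1 for every value of *p*, and the unidirectional fraction is largest at *p* = 0.5.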

These results are not unique to the macaque cortex. In data from human DSI (25), the relation between connectivity and similarity distance is qualitatively similar (Fig. 2*C*). Although the distributions of similarity distances differ slightly (*Inset*), the discrepancy can be largely accounted for by first, symmetrizing the edge-complete subnetwork of the macaque connection matrix to account for the absence of directional information in DSI, and second, removing its weakest connections to match the overall density of connections in the human data. Here the weakest connections were defined as those with the smallest number of neurons labeled in the retrograde tracer study of the macaque cortex. In contrast, removing randomly chosen connections leads to a profile similar to that of the ER network in Fig. 2*A* (red curves).
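The two matching steps described above, symmetrizing a weighted matrix and removing its weakest connections until a target density is reached, can be sketched as follows. The symmetrization rule is not specified in the text; taking the larger of the two directed weights is an assumption made here for illustration, and the function names are hypothetical.

```python
import numpy as np

def symmetrize(W):
    """Undirected version of W (ASSUMPTION: keep the larger of the two directed weights)."""
    W = np.asarray(W, dtype=float)
    return np.maximum(W, W.T)

def density(W):
    """Fraction of possible off-diagonal connections that are present."""
    W = np.asarray(W)
    n = W.shape[0]
    off = ~np.eye(n, dtype=bool)
    return np.count_nonzero(W[off]) / (n * (n - 1))

def prune_weakest(W, target_density):
    """Remove the weakest connections until roughly target_density remains.

    Ties at the threshold weight are all kept, so the result can be slightly denser.
    """
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    n = W.shape[0]
    n_keep = int(round(target_density * n * (n - 1)))
    weights = np.sort(W[W > 0])[::-1]            # existing weights, strongest first
    if n_keep >= weights.size:
        return W
    if n_keep == 0:
        return np.zeros_like(W)
    threshold = weights[n_keep - 1]
    return np.where(W >= threshold, W, 0.0)

W_toy = np.array([[0, 5, 1],
                  [2, 0, 3],
                  [4, 0, 0]])
W_pruned = prune_weakest(W_toy, target_density=0.5)
```

For a symmetrized matrix, each undirected connection appears twice, so the same density definition applies unchanged.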

The corresponding analysis for a connection matrix derived from the CoCoMac database of macaque connectivity (17, 21) likewise shows that discrepancies are attributable to the absence of weak, long-distance connections (Fig. 2*D*) (19, 22). In this case, the distribution of similarity distances is narrowly peaked, unlike the distributions in the macaque (Fig. 2*A*, *Inset*), human (Fig. 2*C*, *Inset*), and mouse (Fig. 2*E*, *Inset*) data. Comparable relations between the proportion of connections and similarity distance also are found in the mouse cortex (Fig. 2*E*); here the availability of the full, directed connection matrix confirms that inputs and outputs have similar statistics (see also *SI Appendix*, Fig. S5), which is somewhat surprising given the presence of many nonreciprocally connected pairs of areas.

The physical origin of the relation between connectivity and similarity distance may be partly inferred by noting that the occurrence of connections has, qualitatively, the same dependence on interareal, center-to-center wiring distance (Fig. 3*A*) as it does on similarity distance (Fig. 3*C*). This is somewhat obscured in the data in part due to the difficulty of estimating wiring distances through the white matter (as opposed to Euclidean distances between the centers), particularly for spatially distant areas. However, the simultaneous presence of both relations—which is also the case in other types of spatial networks (*SI Appendix*, Fig. S6)—suggests that the dependence on similarity distance is intuitively related to the dependence on wiring distance (Fig. 3*A*, *Inset*): cortical areas tend to form connections with spatially nearby areas, so two spatially nearby areas are likely to form connections with the same areas including each other.

### Generative Model.

Motivated by the results of our data analysis, we developed a simple, spatially embedded model of weighted cortical networks. Here we describe the minimal version of this model, whose specification requires only six parameters.

Our starting point was the observation that the overall distribution of interareal wiring distances is well described by the distribution of distances in a spheroid (i.e., a rugby ball) of fixed radii (Fig. 3*B*). We therefore approximated this distribution in the model by randomly placing the “centers” of the *N* areas at positions $\mathbf{x}_i$ ($i = 1, \ldots, N$) in the spheroid (Fig. 3*F*, *Left Inset*). Cortical area *i* is then defined as all points in the spheroid closer to $\mathbf{x}_i$ than to any other center, i.e., the Voronoi cell of $\mathbf{x}_i$.

For each axon generated as described below, its source and target areas are determined by the axon’s initial and terminal coordinates (Fig. 3*F*). The initial positions of axons are sampled uniformly in the spheroid. Inspired by the growth of real axons that are guided by a variety of attractive and repulsive molecular cues (29, 30), we assume that each area exerts a distance-dependent attractive “force,” and the trajectory of each axon is determined by the sum of these forces. The effect of the areas on an axon originating at position $\mathbf{x}$ is given by Eq. **2**, which sets the axon’s direction, while the length of the axon, $\ell$, is drawn from an exponential distribution (Fig. 3*F*, *Right Inset*). The axon terminates at the position reached by traveling a distance $\ell$ from $\mathbf{x}$ along this direction. A consequence of Eq. **2** (specifically, the exponent β) and the exponential distribution of axon lengths is that most axons terminate in the same area from which they originate, capturing the predominantly local character of axonal projections observed in the macaque data (15, 16). The process is not strongly affected by small, “noisy” rotations of the axon direction.

We emphasize that the generation of axons as described here is not dynamic: both the direction and length of an axon are calculated once, and in particular, the forces that determine the direction of axon growth are evaluated only at the initial position of the axon. For the simple cortical manifold used here, this effective mechanism was sufficient to reproduce many features of the data while retaining the advantage of being computationally fast. We expect that a dynamical growth model will be necessary in future studies that incorporate a more realistic cortical geometry.
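The generative procedure can be sketched in code as follows. This is an illustration under several labeled assumptions rather than the authors' implementation: the attractive force of each area is taken to point toward its center with magnitude proportional to distance to the power −β (the exact form of Eq. **2** is not reproduced here), axons that would end outside the spheroid are simply discarded, no constraint on the spacing of area centers is imposed, and all parameter values are hypothetical.

```python
import numpy as np

def sample_in_spheroid(n, a, c, rng):
    """Uniform points in the spheroid (x/a)^2 + (y/a)^2 + (z/c)^2 <= 1 (rejection sampling)."""
    radii = np.array([a, a, c])
    pts = np.empty((0, 3))
    while len(pts) < n:
        cand = rng.uniform(-1.0, 1.0, size=(2 * n, 3))
        cand = cand[(cand ** 2).sum(axis=1) <= 1.0] * radii   # unit ball, then rescale
        pts = np.vstack([pts, cand])
    return pts[:n]

def inside(points, a, c):
    return ((points / np.array([a, a, c])) ** 2).sum(axis=1) <= 1.0

def nearest_area(points, centers):
    """Voronoi assignment: index of the closest area center for each point."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def generate_axons(centers, n_axons, a, c, beta, mean_length, rng):
    """Start and end points of axons; direction and length are computed once per axon."""
    starts = sample_in_spheroid(n_axons, a, c, rng)
    diff = centers[None, :, :] - starts[:, None, :]           # (axons, areas, 3)
    dist = np.maximum(np.linalg.norm(diff, axis=2, keepdims=True), 1e-9)
    # ASSUMED force law: unit vector toward each center, scaled by dist**(-beta).
    force = (diff / dist ** (beta + 1.0)).sum(axis=1)
    direction = force / np.linalg.norm(force, axis=1, keepdims=True)
    length = rng.exponential(mean_length, size=(n_axons, 1))  # exponential axon lengths
    ends = starts + length * direction
    keep = inside(ends, a, c)                                 # ASSUMPTION: drop escaping axons
    return starts[keep], ends[keep]

def connection_counts(starts, ends, centers):
    """W[t, s] = number of axons from source area s to target area t."""
    src, tgt = nearest_area(starts, centers), nearest_area(ends, centers)
    W = np.zeros((len(centers), len(centers)), dtype=int)
    np.add.at(W, (tgt, src), 1)
    return W

rng = np.random.default_rng(0)
a, c = 30.0, 20.0                                             # hypothetical radii
centers = sample_in_spheroid(20, a, c, rng)
starts, ends = generate_axons(centers, 20000, a, c, beta=1.0, mean_length=5.0, rng=rng)
W = connection_counts(starts, ends, centers)
```

Because the axon endpoints are stored explicitly, `connection_counts` can be called again with a different set of centers to obtain the connection matrix of the same axonal network under an alternative parcellation.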

### Comparison of Model and Data.

Fig. 3 shows that a realization of the model closely reproduces the macaque data. The number of areas in the model network is the same as the parcellation of Fig. 1*A*, *SI Appendix*). The remaining three parameters—

The variability in the gross statistical properties of the resulting networks is generally small: the overall proportion of connections, for instance, was

The model also reproduces many common measures of network structure (1, 2), such as the in- and out-degree sequences (degree in descending order, normalized by 91 sources and 29 targets, respectively) and clustering coefficients (Fig. 5 *A* and *B*, *Left*; see *SI Appendix* for the precise definition). The model captures the wide range in the number of axons that connect different areas, which can be compared with data using the fraction of labeled neurons (FLN) (Fig. 5 *A* and *B*, *Right*). FLNs, which are derived from the number of connections and represent normalized information “bandwidth” rather than actual (synaptic) strengths, are defined relative to a target area as the fraction of axons originating from each external area (15); they span approximately five orders of magnitude in both model and data. Finally, the distribution of three-area motifs (32) is well described by the model (Fig. 5*C*).
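Given a matrix of axon counts, the FLN and the degree sequences described above can be sketched as follows. The sketch assumes a square matrix with targets as rows and sources as columns, so that the diagonal holds intrinsic (within-area) axons, which are excluded; the function names are hypothetical.

```python
import numpy as np

def fln(W):
    """Fraction of labeled neurons: external inputs to each target, normalized per target row."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)                       # exclude intrinsic axons
    totals = W.sum(axis=1, keepdims=True)
    return np.divide(W, totals, out=np.zeros_like(W), where=totals > 0)

def degree_sequences(W):
    """In- and out-degree sequences in descending order, normalized by the number of other areas."""
    A = np.array(W) > 0
    np.fill_diagonal(A, False)
    n = A.shape[0]
    in_deg = np.sort(A.sum(axis=1))[::-1] / (n - 1)    # sources per target
    out_deg = np.sort(A.sum(axis=0))[::-1] / (n - 1)   # targets per source
    return in_deg, out_deg

W_counts = np.array([[10, 90, 10],
                     [0,  5,  0],
                     [1,  1,  8]])
F = fln(W_counts)
in_deg, out_deg = degree_sequences(W_counts)
```

In the data, the normalization differs between in- and out-degrees (91 sources and 29 targets); the square case shown here uses the same normalization for both.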

One indication that the proposed model applies to the mammalian cortex in general rather than being specific to the macaque cortex is that the macaque connectivity data can be “matched” to the mouse connectivity data systematically. To illustrate this, we observe that the small discrepancy between data and model for the triad distribution in Fig. 5*C* can be largely eliminated (Pearson’s *r*; panel *A*) by matching the density of connections in the edge-complete subnetwork of the macaque connection matrix to the model density. As before, this is done by removing the weakest connections. Likewise, matching the overall connection density of the macaque to the mouse results in triad distributions that are strikingly similar (Pearson’s *r*; panel *B*). Matching the model network of Fig. 3 to the mouse connection matrix thus gives similar results (Pearson’s *r*; panel *C*), suggesting that weak, mostly long-distance connections present in the primate cortex but absent in the rodent cortex may play a role in shaping functional differences between the two species.
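A comparison of triad distributions of this kind can be sketched with a brute-force census of three-area subgraphs followed by a Pearson correlation. The sketch below classifies every triple of areas by isomorphism class using a canonical code. This yields 16 classes, which include the unconnected and partially connected triples in addition to the connected motifs, so it is an illustrative stand-in and not necessarily the motif definition used for Fig. 5*C*.

```python
import itertools
import numpy as np

PERMS = list(itertools.permutations(range(3)))

def triad_code(sub):
    """Canonical code of a 3-node directed subgraph: minimum over all node relabelings."""
    best = None
    for p in PERMS:
        q = sub[np.ix_(p, p)]
        bits = (q[0, 1], q[0, 2], q[1, 0], q[1, 2], q[2, 0], q[2, 1])
        code = sum(int(b) << k for k, b in enumerate(bits))
        best = code if best is None else min(best, code)
    return best

def triad_distribution(A):
    """Counts of each triad class over all triples of areas in a binary directed matrix."""
    A = (np.asarray(A) != 0).astype(int)
    np.fill_diagonal(A, 0)
    counts = {}
    for trio in itertools.combinations(range(A.shape[0]), 3):
        code = triad_code(A[np.ix_(trio, trio)])
        counts[code] = counts.get(code, 0) + 1
    return counts

def pearson_between(counts_a, counts_b):
    """Pearson correlation between two normalized triad distributions."""
    keys = sorted(set(counts_a) | set(counts_b))
    x = np.array([counts_a.get(k, 0) for k in keys], dtype=float)
    y = np.array([counts_b.get(k, 0) for k in keys], dtype=float)
    return np.corrcoef(x / x.sum(), y / y.sum())[0, 1]

# Toy example: a directed 3-cycle plus one extra node with a single outgoing edge.
A_toy = np.array([[0, 1, 0, 0],
                  [0, 0, 1, 0],
                  [1, 0, 0, 0],
                  [1, 0, 0, 0]])
census = triad_distribution(A_toy)
```

The census loops over all triples, which is fast for an edge-complete subnetwork of 29 areas (3,654 triples) but noticeably slower in pure Python for all 91 areas.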

## Discussion

Our analysis of the macaque, human, and mouse cortical areal networks suggests that there is a common logic to the large-scale organization of cortex. Many aspects of this structure are captured by a simple, generative random network model in which interareal connectivity emerges as a coarse-grained description of heterogeneously distributed, spatially embedded axons. The model highlights the need for caution in interpreting certain features of the cortical network: many such features are surprising only if our expectations are based on inappropriate null models (33). For instance, motif distributions observed in the macaque and mouse cortex are reproduced easily with our random network model, indicating that they do not necessarily reflect specialized computational roles.

Previous modeling studies have used connection rules with a distance dependence at the level of areas (3, 5). However, the distance traveled by an axon between two cortical areas may be substantially different depending on where in the respective areas the axon originates and terminates. Thus, it may be more realistic to assume an effective distance distribution at the level of axons. Our model uses an “exponential distance rule” (EDR) for the length of spatially embedded axons, which has strong empirical support and was already shown to describe many properties of the macaque cortex (4). We have demonstrated that a model incorporating EDR into a fully geometric framework generates, in a natural way, the quantitative relation between interareal connectivity and similarity distance (Fig. 2*A*), and between interareal connectivity and center-to-center wiring distance (Fig. 3*A*). Furthermore, this geometric approach has the advantage that once the axons are generated, the cortex can be divided into any number of areas with arbitrary spatial resolution, independent of the parcellation that generated the connection matrix.

The fundamental advantage of using a generative model of connectivity is that one can evaluate the evidence for competing models and use Bayesian model comparison to ask which aspects of the model are important for explaining empirical observations (28). For example, it might be that one, two, or four dimensions for the embedding manifold provide a better explanation for the observed distributions, which could be evaluated by using the probability of empirical distributions (say, of similarity distances) under each model. After optimizing the model parameters using their ML values, the resulting likelihood may be used as a proxy for model evidence. Future work will investigate the use of Bayesian model comparison to optimize formal aspects of the model and address related questions.

Elaborations of the minimal model will also allow future work to incorporate additional properties of the cortical network. For instance, the uniform sampling of the initial positions of axons might be modulated to better reflect the heterogeneity of connection patterns within areas (9). The model cortex might expand during the axon generation process to mimic simplified notions of growth and development. The distribution of axon lengths can depend on the total cue strength, and there can be multiple cues with both repulsive and attractive interactions (34), allowing for greater specificity of connections. The distinction between superficial- and deep-layer axons, which are essentially geometric properties and are strongly related to whether the axons are feedforward or feedback in nature (35), can also be included in the model. Realistic cortical geometries are clearly important. Together with dynamical axon growth and the inclusion of subcortical structures (particularly thalamus), such extensions would allow the model to more faithfully capture the process of cortical wiring as it occurs in the brain.

## Conclusion

The present work highlights functional similarity represented by shared connections as an important principle for understanding the large-scale structural organization of the mammalian cortex, and demonstrates that simple generative principles respecting the spatially embedded nature of cortex can account for numerous features of cortical connectivity. The dynamical and functional implications of this structure remain important questions for future investigation.

## Materials and Methods

All data are available from the references given in the main text; see *SI Appendix* for further details. *SI Appendix* additionally describes the ML estimation procedure used to establish the relation between the proportion of connections and similarity (wiring) distance, and the derivation of the distribution of similarity distances in ER networks. Alternative definitions of similarity distance, formulas for the distribution of distances in the spheroid and ellipsoid, and the definition of clustering coefficient are also summarized in *SI Appendix*.

## Acknowledgments

We thank N. T. Markov, K. Knoblauch, W. J. Ma, R. Chaudhuri, and J. D. Murray for valuable discussions. H.F.S. and X.-J.W. were supported by Office of Naval Research Grant N00014-13-1-0297, NIH Grant MH062349, and the Swartz Foundation. H.K. was supported by ANR-11-BSV4-501 and LabEx CORTEX (ANR-11-LABX-0042) of Université de Lyon, within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR).

## Footnotes

¹To whom correspondence should be addressed. Email: xjwang@nyu.edu.

Author contributions: H.F.S. and X.-J.W. designed research; H.F.S. performed research; H.K. analyzed data; and H.F.S., H.K., and X.-J.W. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1414153111/-/DCSupplemental.

## References

- [8] Van Essen DC, Glasser MF, Dierker DL, Harwell J
- [9] Markov NT, et al.
- [10] Markov NT, et al.
- [11] Barabási AL, Albert R
- [15] Markov NT, et al.
- [17] Young MP
- [19] Markov NT, et al.
- [20] Mars RB, et al.
- [23] Cox T, Cox MAA
- [27] Okabe A, Boots B, Sugihara K, Chiu SN
- [31] Braitenberg V, Schüz A
- [32] Milo R, et al.
- [33] Artzy-Randrup Y, Fleishman SJ, Ben-Tal N, Stone L

## Article Classifications

- Biological Sciences
- Neuroscience