A comparison of worldwide phonemic and genetic variation in human populations
Contributed by Marcus W. Feldman, December 17, 2014 (sent for review July 16, 2014; reviewed by Quentin D. Atkinson and Keith Hunley)

Significance
Linguistic data are often combined with genetic data to frame inferences about human population history. However, little is known about whether human demographic history generates patterns in linguistic data that are similar to those found in genetic data at a global scale. Here, we analyze the largest available datasets of both phonemes and genotyped populations. Similar axes of human geographic differentiation can be inferred from genetic data and phoneme inventories; however, geographic isolation does not necessarily lead to the loss of phonemes. Our results show that migration within geographic regions shapes phoneme evolution, although human expansion out of Africa has not left a strong signature on phonemes.
Abstract
Worldwide patterns of genetic variation are driven by human demographic history. Here, we test whether this demographic history has left similar signatures on phonemes—sound units that distinguish meaning between words in languages—to those it has left on genes. We analyze, jointly and in parallel, phoneme inventories from 2,082 worldwide languages and microsatellite polymorphisms from 246 worldwide populations. On a global scale, both genetic distance and phonemic distance between populations are significantly correlated with geographic distance. Geographically close language pairs share significantly more phonemes than distant language pairs, whether or not the languages are closely related. The regional geographic axes of greatest phonemic differentiation correspond to axes of genetic differentiation, suggesting that there is a relationship between human dispersal and linguistic variation. However, the geographic distribution of phoneme inventory sizes does not follow the predictions of a serial founder effect during human expansion out of Africa. Furthermore, although geographically isolated populations lose genetic diversity via genetic drift, phonemes are not subject to drift in the same way: within a given geographic radius, languages that are relatively isolated exhibit more variance in number of phonemes than languages with many neighbors. This finding suggests that relatively isolated languages are more susceptible to phonemic change than languages with many neighbors. Within a language family, phoneme evolution along genetic, geographic, or cognate-based linguistic trees predicts similar ancestral phoneme states to those predicted from ancient sources. More genetic sampling could further elucidate the relative roles of vertical and horizontal transmission in phoneme evolution.
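One way to make the distance comparison in the abstract concrete is a Mantel-style permutation test, which correlates two pairwise distance matrices and shuffles population labels to assess significance. The sketch below is illustrative only: the abstract does not specify the authors' statistics, so the choice of Jaccard distance between phoneme inventories, great-circle distance between locations, and the function names `haversine_km`, `jaccard_distance`, and `mantel` are all assumptions, and the five languages with their coordinates and inventories are hypothetical toy data.

```python
import math
import random
from itertools import combinations


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def jaccard_distance(p, q):
    """One minus the shared fraction of two phoneme inventories (sets)."""
    union = p | q
    return 1.0 - len(p & q) / len(union) if union else 0.0


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, n, permutations=999, seed=0):
    """One-sided Mantel test; dist_a and dist_b map (i, j) with i < j to a distance."""
    pairs = list(combinations(range(n), 2))
    x = [dist_a[p] for p in pairs]
    observed = pearson(x, [dist_b[p] for p in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel the populations in the second matrix
        y = [dist_b[tuple(sorted((order[i], order[j])))] for i, j in pairs]
        if pearson(x, y) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: (lat, lon) and phoneme inventory per language.
languages = [
    ((0.0, 0.0), {"p", "t", "k", "a", "i"}),
    ((0.0, 10.0), {"p", "t", "k", "a", "u"}),
    ((10.0, 0.0), {"p", "t", "a", "i", "u"}),
    ((50.0, 100.0), {"m", "n", "s", "e", "o"}),
    ((55.0, 110.0), {"m", "n", "s", "e", "a"}),
]

n = len(languages)
geo = {(i, j): haversine_km(languages[i][0], languages[j][0])
       for i, j in combinations(range(n), 2)}
phon = {(i, j): jaccard_distance(languages[i][1], languages[j][1])
        for i, j in combinations(range(n), 2)}

r, p_value = mantel(geo, phon, n)
print(f"Mantel r = {r:.3f}, permutation p = {p_value:.3f}")
```

The permutation step shuffles whole population labels rather than individual distances because the entries of a distance matrix are not independent of one another. With only five toy languages the p-value has little meaning; the example is meant to show the shape of the computation, not to reproduce any result from the article.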
Footnotes
- 1 To whom correspondence may be addressed. Email: mfeldman@stanford.edu and sramachandran@brown.edu.
This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected in 2013.
Author contributions: M.R., M.W.F., and S.R. conceived of the study; N.C., M.W.F., and S.R. designed research; M.R. developed the Ruhlen database; N.C. and S.R. prepared and analyzed linguistic data; T.J.P. and N.A.R. prepared genetic data; N.C. and T.J.P. analyzed genetic data; N.C. merged linguistic data with the Ethnologue and with genetic data, and conducted phylogenetic analyses; N.C., N.A.R., M.W.F., and S.R. wrote the paper with input from all authors.
Reviewers: Q.D.A., University of Auckland; and K.H., University of New Mexico.
The authors declare no conflict of interest.
Data deposition: Linguistic data from the Ruhlen database analyzed in this paper are available in Datasets S1–S3.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1424033112/-/DCSupplemental.
Freely available online through the PNAS open access option.