On the universal structure of human lexical semantics
- (a) Institute for New Economic Thinking at the Oxford Martin School, Oxford OX2 6ED, United Kingdom;
- (b) Mathematical Institute, University of Oxford, Oxford OX2 6GG, United Kingdom;
- (c) Santa Fe Institute, Santa Fe, NM 87501;
- (d) American Studies Research Institute, Indiana University, Bloomington, IN 47405;
- (e) Earth-Life Sciences Institute, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan;
- (f) Ronin Institute, Montclair, NJ 07043;
- (g) Department of Linguistics, University of New Mexico, Albuquerque, NM 87131;
- (h) Department of Linguistics, University of California, Berkeley, CA 94720;
- (i) MS B285, Grp T-2, Los Alamos National Laboratory, Los Alamos, NM 87545
Edited by E. Anne Cutler, University of Western Sydney, Penrith South, New South Wales, and approved December 14, 2015 (received for review October 23, 2015)

Significance
Semantics, or meaning expressed through language, provides indirect access to an underlying level of conceptual structure. To what degree this conceptual structure is universal, or is instead due to properties of cultural histories or to the environment inhabited by a speech community, is still controversial. Meaning is notoriously difficult to measure, let alone parameterize, for quantitative comparative studies. Using cross-linguistic dictionaries for languages carefully selected as an unbiased sample of the diversity of human languages, we provide an empirical measure of semantic relatedness between concepts. Our analysis uncovers a universal structure underlying the sampled vocabulary across language groups, independent of their phylogenetic relations, their speakers’ culture, and their geographic environment.
Abstract
How universal is human conceptual structure? The way concepts are organized in the human brain may reflect distinct features of cultural, historical, and environmental background in addition to properties universal to human cognition. Semantics, or meaning expressed through language, provides indirect access to the underlying conceptual structure, but meaning is notoriously difficult to measure, let alone parameterize. Here, we provide an empirical measure of semantic proximity between concepts using cross-linguistic dictionaries to translate words to and from languages carefully selected to be representative of worldwide diversity. These translations reveal cases where a particular language uses a single “polysemous” word to express multiple concepts that another language represents using distinct words. We use the frequency of such polysemies linking two concepts as a measure of their semantic proximity and represent the pattern of these linkages by a weighted network. This network is highly structured: Certain concepts are far more prone to polysemy than others, and naturally interpretable clusters of closely related concepts emerge. Statistical analysis of the polysemies observed in a subset of the basic vocabulary shows that these structural properties are consistent across different language groups, and largely independent of geography, environment, and the presence or absence of a literary tradition. The methods developed here can be applied to any semantic domain to reveal the extent to which its conceptual structure is, similarly, a universal attribute of human cognition and language use.
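As a rough illustration of the measure described above, the following sketch builds a weighted network in which the weight of a link between two concepts is the number of languages that have a polysemous word covering both. It is a minimal sketch under stated assumptions, not the authors' procedure: the toy lexicon, the language and word labels, the function names `polysemy_network` and `weighted_degree`, and the choice to count each concept pair at most once per language are all hypothetical, and the dictionary translation step is taken as already done.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not from the paper): for each language, each word
# maps to the set of concepts it can express.
LEXICON = {
    "lang_A": {"w1": {"SUN", "DAY"}, "w2": {"MOON", "MONTH"}},
    "lang_B": {"w3": {"SUN", "DAY"}, "w4": {"MOON"}, "w5": {"MONTH"}},
    "lang_C": {
        "w6": {"SUN"},
        "w7": {"DAY"},
        "w8": {"MOON", "MONTH"},
        "w9": {"SUN", "HEAT"},
    },
}


def polysemy_network(lexicon):
    """Map each unordered concept pair to the number of languages in which
    at least one word expresses both concepts (assumed weighting scheme)."""
    weights = Counter()
    for words in lexicon.values():
        linked = set()  # pairs linked in this language, counted once each
        for concepts in words.values():
            # A word covering two or more concepts is polysemous; every
            # pair of concepts it covers is linked.
            for pair in combinations(sorted(concepts), 2):
                linked.add(pair)
        weights.update(linked)
    return weights


def weighted_degree(weights):
    """Sum link weights per concept, a simple indicator of how prone a
    concept is to polysemy in this toy network."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree


network = polysemy_network(LEXICON)
degree = weighted_degree(network)
for pair, weight in sorted(network.items()):
    print(pair, weight)
```

In this toy data, SUN and DAY are linked in two of the three languages, so their link weight is 2, while SUN and HEAT are linked in only one. Counting languages rather than individual words is one possible way to read "frequency of polysemies"; other normalizations would fit the same network representation.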
Footnotes
- 1 To whom correspondence may be addressed. Email: visang@santafe.edu or tanmoy@lanl.gov.
Author contributions: H.Y., E.S., C.M., J.F.W., W.C., and T.B. designed research; H.Y., L.S., E.S., C.M., J.F.W., I.M., and T.B. performed research; L.S. and W.C. collected the data; H.Y., E.S., C.M., J.F.W., I.M., W.C., and T.B. analyzed data; and H.Y., E.S., C.M., W.C., and T.B. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1520752113/-/DCSupplemental.
Freely available online through the PNAS open access option.
http://www.pnas.org/preview_site/misc/userlicense.xhtml