Selection against variants in the genome associated with educational attainment
- ^a deCODE genetics/Amgen Inc., Reykjavik 101, Iceland;
- ^b School of Engineering and Natural Sciences, University of Iceland, Reykjavik 101, Iceland;
- ^c Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom;
- ^d Department of Applied Economics, Erasmus School of Applied Economics, Erasmus University Rotterdam, 3062 PA Rotterdam, The Netherlands;
- ^e Institute for Behavior and Biology, Erasmus University Rotterdam, 3062 PA Rotterdam, The Netherlands;
- ^f Department of Anthropology, University of Iceland, Reykjavik 101, Iceland;
- ^g Faculty of Medicine, University of Iceland, Reykjavik 101, Iceland
Edited by Andrew G. Clark, Cornell University, Ithaca, NY, and approved December 5, 2016 (received for review July 22, 2016)

Significance
Epidemiological studies suggest that educational attainment is affected by genetic variants. Results from recent genetic studies allow us to construct a score from a person’s genotypes that captures a portion of this genetic component. Using data from Iceland that include a substantial fraction of the population we show that individuals with high scores tend to have fewer children, mainly because they have children later in life. Consequently, the average score has been decreasing over time in the population. The rate of decrease is small per generation but marked on an evolutionary timescale. Another important observation is that the association between the score and fertility remains highly significant after adjusting for the educational attainment of the individuals.
Abstract
Epidemiological and genetic association studies show that genetics play an important role in the attainment of education. Here, we investigate the effect of this genetic component on the reproductive history of 109,120 Icelanders and the consequent impact on the gene pool over time. We show that an educational attainment polygenic score, POLYEDU, constructed from results of a recent study is associated with delayed reproduction (P < 10^−100) and fewer children overall. The effect is stronger for women and remains highly significant after adjusting for educational attainment. Based on 129,808 Icelanders born between 1910 and 1990, we find that the average POLYEDU has been declining at a rate of ∼0.010 standard units per decade, which is substantial on an evolutionary timescale. Most importantly, because POLYEDU only captures a fraction of the overall underlying genetic component, the latter could be declining at a rate that is two to three times faster.
Epidemiological studies have estimated that the genetic component of educational attainment can account for as much as 40% of the trait variance (1). Recent meta-analyses (2, 3) yielded sequence variants contributing to the underlying genetic component. A negative correlation between educational attainment and number of children has been observed in many populations (4–7). A recent study of ∼20,000 genotyped Americans born between 1931 and 1953 provided direct evidence that the genetic propensity for educational attainment is associated with reduced fertility (8, 9), supporting previously postulated notions (10) that the population average of the genetic propensity for educational attainment and related traits must be declining. Here, using a population-wide sample that is both much larger and covers a substantially greater time span, and with additional auxiliary information, we aim to estimate the change of the genetic propensity of educational attainment in the Icelandic population over the last few decades, starting with an in-depth investigation of the relationship between a measurable genetic component of educational attainment and various aspects of reproduction (11–14).
Results
The number of living Icelanders is ∼317,000 (Fig. S1). A genealogical database of Icelanders (15–17) that is very close to complete for individuals born after 1910 (Materials and Methods) is used in this study. Probands used for the genetic analyses here are limited to those with both parents and all four grandparents listed in the genealogy. For the fertility studies, only children who survived their first year are counted. The first step was to use results from a recent genome-wide association study (GWAS) of educational attainment (3) to determine the per-locus allele-specific weightings of 620,000 markers used to calculate a polygenic score (18, 19), POLYEDU (see Materials and Methods for details on polygenic score construction). After excluding the Icelandic cohorts in the GWAS to avoid confounding, 278,948 samples from 62 cohorts were used to determine the weightings for POLYEDU. We computed POLYEDU for over 150,000 Icelanders who were directly genotyped with chip arrays and imputed for additional sequence variants discovered through whole-genome sequencing of 8,453 Icelanders (20) (Materials and Methods). POLYEDU was scaled to an SD of 1, hereafter referred to as standard units (SUs). When applied to 46,079 Icelanders with educational attainment data, POLYEDU was found to explain 3.74% of the trait variance (P < 10^−300). By contrast, the strongest single variant only explains 0.10% of the variance, indicating that educational attainment is a complex trait influenced by many variants in the genome and highlighting the increased power of using the polygenic score for our analyses. Our first analysis focused on 109,120 individuals (58,560 females and 50,560 males) with year of birth (yob) between 1910 and 1975 (Fig. S2). The genealogical database was used to obtain the number of children (NC) and, where applicable, the age at first child (AGFC) and the average age at child birth (AACB) for this set.
The estimated effects of POLYEDU on these reproductive traits, adjusted for yob and 20 principal components (21), are presented in Table 1 for females and males separately. For females, an increase of 1 SU of POLYEDU corresponds to an average decrease of 0.084 children [P = 1.0 × 10^−43, calculated with genomic control adjustment (22)], and for those with children AGFC and AACB increased by 0.59 years (P = 5.3 × 10^−155) and 0.46 years (P = 1.0 × 10^−117), respectively. A similar, albeit weaker, pattern of results was observed for males. The finding of a substantially stronger association for AGFC than NC suggests that the effect of POLYEDU on NC is mainly manifested through delayed reproduction. Indeed, for females with children, the association between AGFC and POLYEDU remains highly significant (P = 2.9 × 10^−118) after adjusting for NC, whereas the association between NC and POLYEDU is not significant (P = 0.17) after adjusting for AGFC. This led us to examine the effect of POLYEDU on NC[x], the number of children a proband had at or after age x, as a function of x. The results are presented in Fig. 1. At x = 14, the estimated effect on NC[x] per SU of POLYEDU, denoted by eff[x], is −0.084 for females and −0.054 for males. These correspond to results in Table 1 because none of the probands here had children before 14 years of age. As x increases, the estimated effect becomes less negative and is essentially zero at 22 for females and 23 for males. In other words, if children born to mothers at 21 years of age or younger (18% of all children counted here, Fig. S3) and children born to fathers at 22 or younger (13% of all children counted here) are ignored, there is no correlation between NC and POLYEDU. As x increases further, eff[x] becomes positive and continues to increase until x = 30 for females and starts to drop slowly to zero after that.
Note that the difference eff[x] − eff[x + 1] corresponds to the estimated effect of POLYEDU on children born to the proband at precisely age x. Thus, for age x > 30, females with higher POLYEDU tend to have more children than those with lower POLYEDU, whereas the reverse is true for x < 30. Having more children after 30 (P < 1 × 10^−15) compensates for having fewer children between 22 and 30 years of age but does not compensate for the reduced number of children at age 21 years and younger. Similar results apply to the males with the age boundaries shifting 1 to 2 years upward. The negative effect of POLYEDU on NC is smaller for males than for females, and the difference is mainly accounted for by children born to them at 19 years or younger. The analyses performed using POLYEDU maximize statistical power, but the effects on fertility traits can also be seen with individual variants. Results for 120 SNPs that are genome-wide significant (P < 5 × 10^−8) in the meta-analysis for educational attainment excluding Icelandic data (Materials and Methods) are given in Table S1 and Figs. S4 and S5. For example, 35 of the 120 SNPs have associations with AGFC of females that are in the same direction and nominally significant (one-sided P < 0.05). The minor allele of one of these SNPs, rs192818565, is associated with reduced education. It is known to tag the H2 haplotype of a common inversion on chromosome 17 that was shown to exhibit characteristics consistent with having been positively selected (23). It has subsequently been shown that H2 is also associated with reduced intracranial volume (24, 25) and neuroticism (26). Combining our male and female data, the minor allele of rs192818565 is significantly associated with more children (P = 5.2 × 10^−3) and having children earlier (P = 2.2 × 10^−3). This is thus a striking case where a variant associated with a phenotype typically regarded as unfavorable could nonetheless also be associated with increased “fitness” in the evolutionary sense.
Table 1. Estimated effects of POLYEDU on fertility traits
Fig. 1. Effect of POLYEDU on number of children with lower bound for age. Blue, males; red, females; error bars indicate plus/minus 1 SE. Estimated effect calculated by only counting children born to the proband at or after a certain age (the x axis).
Fig. S1. Number of living Icelanders by year.
Fig. S2. Total number of Icelanders and number in our fertility study by birth years.
Fig. S3. Distribution of age of child birth. For our fertility study, this shows the percentage of children born to the parent at a specific age of the (A) father and (B) mother.
Table S1. Associations between 120 genome-wide significant markers and three reproductive traits
Fig. S4. Associations between 120 genome-wide significant SNPs and three reproductive traits for females. x axis: Zmetaedu = z-score from the educational attainment meta-analysis. y axes: z-scores of associations between each of the variants and the three reproductive traits. ±1.645 correspond to one-sided P = 0.05.
Fig. S5. Associations between 120 genome-wide significant SNPs and three reproductive traits for males. Labels as in Fig. S4.
Among the genotyped individuals with yob between 1910 and 1975, information about educational attainment is available for 25,794 females and 19,903 males. For these individuals, the effects of POLYEDU and educational attainment (EDU) itself on the reproductive traits were estimated individually, through separate regressions, and jointly, through regressions including both as predictors (Table 2). We coded EDU as in a recent meta-analysis (3). Individuals fall into four categories: 10, 13, 15, and 20 years (mean = 14.0 and SD = 3.4 for males and mean = 13.4 and SD = 3.7 for females). The first category corresponds to the mandatory minimum education in Iceland and the last corresponds to a college degree. For females, when analyzed separately, each SU increase of POLYEDU decreases expected NC by 0.097 (P = 1.7 × 10^−23), whereas each year increase in EDU corresponds to a reduction of 0.045 (P = 5.0 × 10^−56). When analyzed jointly, the estimated effect of POLYEDU on NC adjusted for EDU reduces to −0.071, a shrinkage that is meaningful but not drastic, and remains highly significant (P = 7.2 × 10^−13). Similar results were observed for AGFC and AACB. Clearly, EDU here is not a complete measure of educational attainment (e.g., it does not include information on postcollege education). With a more comprehensive measure of educational attainment, the estimated effects for POLYEDU upon adjustment might shrink further, but the changes are unlikely to be drastic. For example, limiting to females with 10 years of education (n = 11,055), the estimated effect of POLYEDU on NC is −0.079 (P = 5.8 × 10^−6) (Table S2). These results indicate that the association between POLYEDU and reproduction is not explained solely by the amount of education that is actually attained.
Crucially, these results indicate that the magnitude of selection acting on the underlying genetic component of educational attainment has to be estimated directly using genotype data and could be severely underestimated if one attempts to deduce it based solely on the observed negative correlation between educational attainment and fertility. For males, the results tend to be similar to those of the females, only weaker. There is one striking exception. High EDU, similar to having a high POLYEDU, delays reproduction. However, high EDU, unlike high POLYEDU, does not lead to having fewer children for males (27). Indeed, in the joint analysis, the estimated effect of POLYEDU is 0.061 fewer children (P = 2.5 × 10^−7), whereas the estimated effect per year of EDU is 0.011 children more. This again highlights that the effect of POLYEDU on reproduction is not simply manifested through educational attainment.
Table 2. Estimated effects of POLYEDU and EDU on fertility traits
Distributions of educational attainment for males and females. The first panel includes all samples studied. The second and third panels show, for males and females, respectively, how distributions of educational attainment change over time.
Table S2. Associations between POLYEDU and three reproductive traits stratified by four EDU categories
For 129,808 genotyped individuals born between 1910 and 1990, POLYEDU shows a notable and highly significant decline with yob (−0.0182 SU per decade, P = 5.8 × 10^−35). Average polygenic scores calculated for 10-year bins are displayed in Fig. 2. The relationship between POLYEDU and yob exhibits nonlinear behavior (i.e., the downward slope seems to be steeper in the earlier years). When a quadratic fit was performed (blue line), the quadratic term of yob is significant (P = 1.7 × 10^−3). A closer examination suggests that the nonlinear behavior mainly reflects a survival effect rather than a birth cohort effect. The samples studied here were collected between 1998 and 2014, with a majority (68%) ascertained before 2006. For 85,520 of the latter, survival data as of 2016 are available. The death rate overall is 19.4% (16,610/85,520) and is 54.5% (13,954/25,610) for those with yob before 1940, compared with 4.4% (2,656/59,910) for those with yob ≥ 1940. After adjustment for sex, yob, and age at ascertainment, each SU of POLYEDU is estimated to increase the odds of survival by a factor of 1.083 (P = 2.5 × 10^−11). The positive effect of POLYEDU on survival is not surprising because it is significantly associated with many other behavioral and health-related traits in Iceland. For example, POLYEDU is positively correlated with high-density lipoprotein levels, and negatively correlated with triglyceride levels, body mass index, fasting glucose levels, and amount of smoking (P < 1 × 10^−30 for each of these five quantitative traits; Table S3). Because POLYEDU is associated with survival, there would be a positive ascertainment bias toward individuals with high polygenic scores, particularly among those born before 1940, because they were more likely than those with low polygenic scores to be alive at the time of ascertainment.
This survival effect has a real impact on the difference in POLYEDU between the young and the old in the population at any given time. However, for the purpose of estimating the change of the average polygenic score over time with respect to birth cohorts, this can be a source of bias. This bias is expected to be small for individuals with yob ≥1940. Using the latter, the estimated rate of decline of the average polygenic score is −0.0122 SU per decade (P = 2.4 × 10^−7, SE = 0.0024) (red line in Fig. 2). For comparison, we computed two other polygenic scores based on meta-analyses for height and schizophrenia. The polygenic score for height is not significantly associated with yob (P ≥ 0.5). The polygenic score for schizophrenia is estimated to decline at a rate of −0.0078 SU per decade (P = 1.1 × 10^−3, SE = 0.0024) for individuals with yob ≥1940.
Fig. 2. Average educational attainment polygenic score and year of birth (yob). Results for 10-year bins are presented. Error bars indicate plus/minus 1 SE. The blue line is a quadratic fit for the full yob range indicated. The red line is a linear fit applied to individuals with yob ≥1940.
Table S3. Association between POLYEDU and five quantitative traits
An alternative way to estimate the rate of decline of POLYEDU is to perform calculations based on the information about reproductive history. If generations were discrete, then the contribution from each parent type (mother/father) to the change of the average polygenic score for the next generation is (eff/2)/(ANC), where eff is the effect of POLYEDU on number of children and ANC is the average number of children. For the females in Table 1, eff = −0.084 and ANC is 2.84, and the estimated contribution to the change per generation is (−0.084/2)/2.84 = −0.015 SU. Given that the average AACB for these females is 27.5 years, this translates to −0.015/27.5 = −0.00054 SU per year, or −0.0054 SU per decade. For the males in Table 1, eff = −0.054, ANC = 2.73, and average AACB = 30.0, translating to an effect of −0.0033 SU per decade. Combining the contributions from females and males gives a change of −0.0087 SU per decade. This estimate, however, does not take into account that individuals with high POLYEDU tend to have their children later (Table 1), leading to a slower contribution to the generations that follow. After applying equations derived for incorporating the generation time effect (28, 29) (Materials and Methods), the female and male contributions are estimated to be −0.0065 and −0.0039 SU per decade, respectively, with the sum equal to −0.0104 SU per decade. This estimate is smaller in magnitude than the −0.0122 SU per decade estimate based on the observed decline. However, because the difference is within 1 SE, the two estimates can be considered consistent.
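The discrete-generation arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the calculation as described in the text; the function name `per_decade_change` and its argument names are illustrative, and the inputs are the rounded values quoted from Table 1.

```python
def per_decade_change(eff, anc, aacb):
    """Discrete-generation contribution of one parent type, in SU per decade.

    eff  : effect of 1 SU of POLYEDU on number of children
    anc  : average number of children
    aacb : average age at child birth, used here as the generation time
    """
    per_generation = (eff / 2) / anc   # each parent transmits half of the genome
    per_year = per_generation / aacb   # spread the change over one generation
    return 10 * per_year


females = per_decade_change(eff=-0.084, anc=2.84, aacb=27.5)
males = per_decade_change(eff=-0.054, anc=2.73, aacb=30.0)
total = females + males

print(round(females, 4), round(males, 4), round(total, 4))
# prints: -0.0054 -0.0033 -0.0087
```

The sketch deliberately ignores the generation time effect, so it matches the −0.0087 SU per decade figure rather than the adjusted −0.0104.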
Although there are challenges to getting a precise estimate of the rate of change of the average POLYEDU value due to nonsampling errors that could be difficult to gauge, with the analyses taken together we consider −0.010 SU per decade to be a reasonable estimate for the period from 1910 to 1990 that is more likely to underestimate than overestimate the true decline. Most importantly, POLYEDU is just a fraction of the full genetic component of educational attainment, which we denote by POLYFULL. It is the rate of change of POLYFULL that is of ultimate interest. Under an assumption that the part of POLYFULL that is not captured by POLYEDU behaves in a similar fashion in its impact on reproduction, the rate of change is proportional to the square root of the variance explained (SI Text). Thus, if POLYFULL is assumed to account for 30% of the variance of EDU, then its estimated rate of change, by extrapolation, is −0.010 × (30/3.74)^(1/2) = −0.028 SU per decade. To test the validity of this method of extrapolation we computed a separate polygenic score for educational attainment, denoted by POLY-U.K.B, which was based on the same GWAS results used to construct POLYEDU, except that the contribution from 111,349 UK Biobank samples was removed (Materials and Methods). When we applied POLY-U.K.B to the Icelandic data, it explained 2.52% of the variance of EDU, and the rate of decline estimated based on its effects on reproduction is −0.0085 SU per decade (Materials and Methods). Hence, with the polygenic score strengthening from POLY-U.K.B to POLYEDU, the estimated rate of decline increased by a factor of (0.0104/0.0085) = 1.22, nearly identical to (3.74/2.52)^(1/2) = 1.22, the square root of the variance explained ratio.
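The square-root extrapolation and its consistency check can be written out as follows. This is a sketch of the arithmetic stated in the text, not of the derivation in SI Text; the helper name `extrapolate_rate` is hypothetical.

```python
import math


def extrapolate_rate(rate, r2_observed, r2_target):
    """Scale a rate of change by the square root of the variance-explained ratio."""
    return rate * math.sqrt(r2_target / r2_observed)


# POLYEDU explains 3.74% of EDU variance; POLYFULL is assumed to explain 30%.
polyfull_rate = extrapolate_rate(-0.010, r2_observed=3.74, r2_target=30.0)

# Consistency check using the weaker score (2.52% of variance explained).
observed_ratio = 0.0104 / 0.0085         # ratio of the two estimated rates of decline
predicted_ratio = math.sqrt(3.74 / 2.52)  # ratio predicted by the square-root rule

print(round(polyfull_rate, 3), round(observed_ratio, 2), round(predicted_ratio, 2))
# prints: -0.028 1.22 1.22
```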
Here we explore the implications of the observed trends on the distributions of cognitive traits in the population. Based on a sample of 1,577 genotyped Icelanders (653 males and 924 females; yob, mean = 1968 and SD = 13 years) with intelligence quotient (IQ) measurements (mean = 102 and SD = 15), each SU of POLYEDU is estimated to increase IQ by 3.8 points (P < 10^−20). Given that POLYEDU is estimated to decline at a rate of 0.01 SU per decade, this translates to a decline of 0.038 IQ points per decade. However, under the assumptions that POLYFULL accounts for 30% of the variance of EDU, and the part of POLYFULL that is not captured by POLYEDU behaves in a similar fashion in its impact on both reproduction and IQ, by extrapolation, the decline of POLYFULL would lead to a decline of 0.038 × (30/3.74) = 0.30 IQ points per decade. This would be a very substantial effect if the trend persists for centuries. By contrast, a meta-analysis estimated that IQ scores have increased by 13.8 points between 1932 and 1978, a rate of 3.0 points per decade (30), a phenomenon referred to as the Flynn effect. This rate is 10 times the estimated effect due to the decline of the genetic component, and, more importantly, in the opposite direction. Many commentators [including Flynn himself (31)] consider the Flynn effect to be due to changes in the socioeconomic and technological environment faced by successive generations of humans. Unfortunately, we are unable to assess the Flynn effect in our IQ data, because they were measured within a narrow time interval. Assuming that a similar magnitude of the Flynn effect is found in the Icelandic population, then it is clear that such environmentally induced increases of IQ scores more than compensate for, and indeed mask, any potential decline in the genetic propensity for IQ.
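The IQ extrapolation uses the full variance ratio rather than its square root. One way to read this, offered here as an interpretation and not as the authors' stated derivation, is that both the rate of decline and the per-SU effect on IQ scale with the square root of the ratio, so their product scales with the ratio itself. The variable names below are illustrative.

```python
import math

ratio = 30 / 3.74                        # assumed POLYFULL vs. measured POLYEDU variance explained

rate_polyedu = 0.010                     # decline in SU per decade
iq_per_su_polyedu = 3.8                  # IQ points per SU of POLYEDU

# Interpretation (an assumption): each factor scales by sqrt(ratio).
rate_polyfull = rate_polyedu * math.sqrt(ratio)
iq_per_su_polyfull = iq_per_su_polyedu * math.sqrt(ratio)

iq_decline_polyedu = rate_polyedu * iq_per_su_polyedu      # about 0.038 points per decade
iq_decline_polyfull = rate_polyfull * iq_per_su_polyfull   # about 0.30 points per decade

flynn_rate = 13.8 / (1978 - 1932) * 10   # about 3.0 points per decade

print(round(iq_decline_polyedu, 3), round(iq_decline_polyfull, 2), round(flynn_rate, 1))
# prints: 0.038 0.3 3.0
```

With these rounded inputs, the Flynn rate is roughly 10 times the extrapolated genetic decline, as the text states.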
Discussion
From the results presented here it is clear that there has been a slow but steady decline in the frequency of certain variants in the Icelandic gene pool that are associated with educational attainment. It is also clear that education attained does not explain all of the effect. Hence, it seems that the effect is caused by a certain capacity to acquire education that is not always realized. We postulate that, in addition to being correlated with cognitive ability (32, 33), POLYEDU is capturing a portion of the propensity to long-term planning and delayed gratification. To address the question of whether and how these results could be extended to other populations and other time periods it should first be emphasized that the negative selection observed here is likely an example of gene–environment interaction, that is, both the direction of the effect and its magnitude could and would change given a different socioeconomic environment (5, 34, 35). It is likely that in any population where educational attainment is negatively correlated with fertility the underlying genetic propensity would be in decline, but the actual magnitude and characteristics of the decline could vary substantially. Based purely on epidemiological/demographical data, there were concerns about this sort of decline in Great Britain more than eight decades ago (10). However, the possibility that such a phenomenon could be temporary or transitional was also raised (10, 29). Indeed, there might be a cyclical element to this phenomenon, because it is only reasonable to assume that alleles associated with greater educational attainment must have been under positive selection at some time during the evolutionary history of Homo sapiens. The main message here is that the human race is genetically far from being stagnant with respect to one of its most important traits. It is remarkable to report changes in POLYEDU that are measurable across the several decades covered by this study. 
In evolutionary time, this is a blink of an eye. However, if this trend persists over many centuries, the impact could be profound.
Materials and Methods
Genealogical Database.
For nearly 20 years a genealogical database of Iceland has been used for genetics studies performed by deCODE genetics (15–17). This database is constantly updated. Currently, the deCODE Genetics genealogical database contains essentially all of the ∼317,000 living Icelanders (some recent immigrants may not be included in this tabulation) and the vast majority of their ancestors going back to about 1650, along with a smaller portion of ancestors before that time. In total, just over 840,000 individuals are presently recorded in the genealogical database, with the earliest recorded yob 740 AD. The database contains information about the yob and sex of each individual, and when available the year of death, the identities of the father and mother, and geographical locations, such as places of birth, residence, and death. The database was constructed from a number of different sources, the most important of which were 14 national censuses spanning the period from 1703 to 1930, parish records from 1780, and the national registry from 1994. Additional key sources include annals, genealogical publications, biographical lists of members of professional associations, and other official records. The database is particularly complete for the probands used in this study, who were all born after 1910. For the vast majority of these individuals, both parents and grandparents are recorded, as are all children who survived the first weeks of life.
Sample Collection.
All samples and questionnaire data were collected through studies approved by the National Bioethics Committee and the Icelandic Data Protection Authority. All participants signed informed consent before blood samples were drawn and all data were analyzed under pseudonyms assigned by a third-party encryption system overseen by the Icelandic Data Protection Authority (36).
Meta-Analysis and Polygenic Scores.
In a recent meta-analysis on educational attainment (3) the initial total sample size was 293,724, which included 76,155 samples from 23andMe, and 49,970 Icelandic samples [46,758 from deCODE and 3,212 from Age, Gene/Environment Susceptibility (AGES Reykjavik) Study]. Excluding the Icelandic samples and 23andMe, the remaining sample size was 167,599. When the manuscript was revised for final publication, an additional 111,349 UK Biobank samples were added as replication (full genome association results also available). It is important to note that the meta-analysis produces trait association results for each marker separately (i.e., joint analyses are not performed). When deriving the weights for computing POLYEDU (see below for the method used), for the current study, GWAS results from 23andMe and Iceland were excluded. The 23andMe results were excluded because their policy forbids the release of full GWAS results. The Icelandic results were excluded to avoid confounding/bias and/or overfitting. Thus, the weights for computing POLYEDU were derived based on results from 167,599 + 111,349 = 278,948 samples. Similarly, the weights for POLY-U.K.B were based on 167,599 samples. For the 120 genomewide significant markers, the estimated effects on educational attainment (used in Figs. S4 and S5) did incorporate the 23andMe data and were based on 278,948 + 76,155 = 355,103 samples.
Markers and Methods Used to Compute the Polygenic Score.
The basic method used to process the genotype data for Icelanders, including imputations based on full-genome sequencing results, was described in ref. 20. A framework set of ∼620,000 high-quality SNPs covering the whole genome was used to compute POLYEDU and POLY-U.K.B. Note that a polygenic score is constructed as a linear combination of the genotypes of the markers. In determining the weights used for the linear combination the goal is to maximize the correlation between the resulting score and the trait. This is not a trivial problem in part because, as noted above, the meta-analysis only gives association results for each marker separately, and the markers are in general correlated (i.e., in linkage disequilibrium). We adjusted for linkage disequilibrium using LDpred (19), a recently proposed method. The linkage disequilibrium between markers was estimated using the Icelandic samples. We have explored different ways of constructing the polygenic score (e.g., using a larger set of markers and different ways for adjusting linkage disequilibrium). We found the method used to give close to the best-performing score we could achieve. Most importantly, the main results in this paper are robust to the specific method (as long as it is a reasonable one) used to construct the polygenic score.
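As a minimal illustration of the statement that a polygenic score is a linear combination of marker genotypes, the sketch below computes raw scores and rescales them to standard units. It does not implement LDpred or any linkage disequilibrium adjustment; the weights and genotypes are made-up toy values, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def raw_score(genotypes, weights):
    """Weighted sum of allele counts (0, 1, or 2) across markers."""
    return sum(g * w for g, w in zip(genotypes, weights))


def standardize(scores):
    """Rescale scores to mean 0 and SD 1, i.e., to standard units (SU)."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


# Toy data: 3 markers with hypothetical per-allele weights, 4 individuals.
weights = [0.02, -0.01, 0.03]
cohort = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 2],
]

raw = [raw_score(g, weights) for g in cohort]
score_su = standardize(raw)
print([round(x, 2) for x in score_su])
```

In the study itself the weights come from the meta-analysis after adjustment for linkage disequilibrium, and the score is built from ∼620,000 markers.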
Educational Attainment.
As noted above, the deCODE data on educational attainment were part of the published meta-analysis (3). The original Icelandic data were collected through various questionnaires including questions on educational attainment of adults (we used responses from adults 30 years or older assuming maximum educational attainment had been achieved by this age). Responses were then mapped to the International Standard Classification of Education (ISCED) 1997 classification (UNESCO: www.unesco.org/education/information/nfsunesco/doc/isced_1997.htm) format that was also used for the meta-analysis as described in detail in Okbay et al. (3) and briefly also reviewed below. The ISCED 1997 classification includes seven categories of educational attainment that are internationally comparable. The categories are translated into US years-of-schooling equivalents, which have a quantitative interpretation as follows:
0. Preprimary education: 1 year
1. Primary education or first stage of basic education: 7 years
2. Lower secondary or second stage of basic education: 10 years
3. (Upper) secondary education: 13 years
4. Postsecondary nontertiary education: 15 years
5. First stage of tertiary education (not leading directly to an advanced research qualification): 19 years
6. Second stage of tertiary education (leading to an advanced research qualification, e.g., a Ph.D.): 22 years.
In our data, questionnaire responses could be categorized according to the major educational levels in Iceland and were mapped to ISCED 1997 levels according to the mapping schema for Iceland maintained by UNESCO (uis.unesco.org/en/isced-mappings) and accordingly to comparable years of educational attainment in the United States as demonstrated below:
2. Compulsory basic education (10 grades): 10 years
3. (Upper) secondary education or vocational programs: 13 years
4. Postsecondary nontertiary education: 15 years
5–6. Advanced education representing A-levels and/or any university degree: 20 years.
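The two mappings listed above can be captured as simple lookup tables. This is a sketch for illustration; the dictionary and function names are hypothetical, and only the levels and years given in the lists are encoded.

```python
# ISCED 1997 level -> US years-of-schooling equivalent, as listed in the text.
ISCED_YEARS = {0: 1, 1: 7, 2: 10, 3: 13, 4: 15, 5: 19, 6: 22}

# Icelandic coding used in this study: levels 5 and 6 are merged into 20 years.
ICELAND_YEARS = {2: 10, 3: 13, 4: 15, 5: 20, 6: 20}


def edu_years(isced_level, icelandic=True):
    """Return years of education for an ISCED 1997 level."""
    table = ICELAND_YEARS if icelandic else ISCED_YEARS
    if isced_level not in table:
        raise ValueError(f"ISCED level {isced_level} is not in the selected mapping")
    return table[isced_level]


print(edu_years(3), edu_years(5), edu_years(5, icelandic=False))
# prints: 13 20 19
```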
IQ Data.
IQ measurements from population controls were collected in years 2009–2016. Intelligence was measured using the Icelandic version of the Wechsler Abbreviated Scale of Intelligence (WASIIS) (37, 38).
Genomic Control.
Results in this paper are mainly based on regression analyses. The standard output of a regression assumes that the data points are statistically independent. However, because the individuals are genetically related, and the trait values of closely related individuals tend to be correlated, taking the standard output at face value would tend to produce anticonservative results (i.e., under the null hypothesis of no effect, the test statistics tend to have a variance that is higher than assumed). Adjusting for 20 principal components reduces, but does not eliminate, this effect. Genomic control is a method that uses the observed results for a large number of SNPs in the genome (1.1 million are used here), most of them expected to have no effect, to evaluate and adjust for the overdispersion of the test statistics. The first paper to describe such an approach is by Devlin and Roeder (39), but the method described there can be somewhat conservative, particularly when many variants in the genome do contribute to the trait. The method used here, based on LD score regression (22), is more recent and corrects for the conservativeness of the original method. Because genomic control is a form of variance adjustment, theoretically it should apply to a polygenic score in the same way as to a single marker. This has been confirmed by simulations. For example, applying this method, the t-statistic for the correlation between POLYEDU and AGFC is divided by 1.13 and 1.14 for males and females, respectively.
Genomic control was also applied to the correlation between POLYEDU and yob, where the null hypothesis corresponds to a scenario in which changes in marker frequencies over time, if any, are a result of random genetic drift. Here, however, no adjustment was found to be necessary; for the analyses restricted to individuals with yob ≥1940, there is actually some indication that the unadjusted results could be slightly conservative. This is probably because, whereas values for traits such as EDU tend to be positively correlated between close relatives, that is not necessarily the case for yob. We also note that the P values given are two-sided unless explicitly stated otherwise.
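The variance adjustment described above can be sketched in a few lines. This is only an illustration under stated assumptions: the inflation estimate shown is the classical median-based one in the spirit of Devlin and Roeder, not the LD score regression variant used in this paper; the unadjusted statistic `t_raw` is a hypothetical value; and P values use a normal approximation to the t distribution. Only the divisor 1.13 comes from the text.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def inflation_factor(null_z_scores):
    """Classical median-based inflation estimate on the chi-square scale.

    Assumption: most input statistics come from SNPs with no effect.
    Values below 1 are clipped to 1 (no deflation is applied).
    """
    chi2 = [z * z for z in null_z_scores]
    return max(1.0, median(chi2) / CHI2_1DF_MEDIAN)

def two_sided_p(z):
    """Two-sided P value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def adjust_statistic(t_stat, divisor):
    """Divide a test statistic by a genomic-control divisor.

    On the t (or z) scale the divisor is the square root of the
    chi-square-scale inflation factor.
    """
    return t_stat / divisor

# Hypothetical unadjusted statistic, for illustration only.
t_raw = -5.0
t_adj = adjust_statistic(t_raw, 1.13)  # 1.13 is the divisor quoted for males
print(t_adj, two_sided_p(t_raw), two_sided_p(t_adj))
```

Dividing the statistic shrinks it toward zero, so the adjusted P value is always at least as large as the unadjusted one.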
Determining the Rate of Change of the Polygenic Score as a Result of Its Impact on Fertility Traits.
To derive the (approximate) relationship between the effects of a polygenic score X on the fertility traits and the change of the average polygenic score over time, we assume that the effects are linear and small per generation. Specifically, with X standardized to have mean 0 and variance 1, we assume
$$E(\mathrm{NC} \mid X) = a + bX$$
and
$$E(\mathrm{AACB} \mid X) = c + dX.$$
The main mathematical result we are going to show is that, under these assumptions, to the first order, the rate of change of the mean of X per year is
$$\beta = \frac{b}{ac} - \frac{d\,\log(a/2)}{c^{2}}. \tag{1}$$
(We note that Eq. 1 may have been derived explicitly in other publications, although we are not currently aware of any.) When males and females behave differently, that is, have different values for a, b, c, and d, we have βM for males and βF for females, so that (βM/2) + (βF/2) is the estimate of the rate of change. Note that the first term in Eq. 1, b/(ac), captures the contribution of the effect of X on NC to the rate of change, whereas the second term, −d log(a/2)/c², captures the contribution of the effect of X on AACB.
Before showing how to derive the general form (Eq. 1), we think it is helpful to see how the result can be shown for the special case with d = 0. Here, to the first order, we can assume that mating occurs in discrete generations with generation time c. Let X be the (random) polygenic score for a female in generation t, scaled to have mean 0 and variance 1. Let Y denote, for a random person in generation t + 1, what is inherited from the mother. It follows that
$$E(Y) = \frac{1}{2} \times \frac{E(wX)}{E(w)},$$
where w = a + bX. The factor (1/2) results from the fact that only one-half of the genetic material is passed on to the offspring. E(wX)/E(w) corresponds to a weighted average of X with weights proportional to w. [The absolute weight is wt = w/E(w) with expectation 1.] It follows from E(X) = 0 and var(X) = 1 that E(wX) = b and E(w) = a. Thus, E(Y) = (1/2) × (b/a). Taking into account that generation time is c, the contribution of the females to the change of the mean polygenic score per year is (1/2) × (b/ac). The same calculations apply to the fathers.
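The special case can be checked numerically. The sketch below is only an illustration: the four-point sample is a toy construction chosen to have mean 0 and (population) variance 1, and the helper name `weighted_mean_score` is ours. The values of a, b, and c are the female POLYEDU estimates quoted later in this section.

```python
def weighted_mean_score(scores, a, b):
    """Weighted average of X with weights proportional to w = a + b*X."""
    weights = [a + b * x for x in scores]
    return sum(w * x for w, x in zip(weights, scores)) / sum(weights)

# Toy standardized sample: mean 0 and (population) variance 1.
scores = [-1.0, 1.0, -1.0, 1.0]

a, b, c = 2.84, -0.084, 27.5  # female POLYEDU values quoted in the text

shift_in_mothers = weighted_mean_score(scores, a, b)  # E(wX)/E(w) = b/a
per_generation = 0.5 * shift_in_mothers               # only half is transmitted
per_year = per_generation / c                         # (1/2) * b/(a*c)
print(shift_in_mothers, per_generation, per_year)
```

For any sample with mean 0 and variance 1, the weighted mean equals b/a exactly, which is why the shape of the score distribution does not enter the result.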
Deriving the general form (Eq. 1), where the polygenic score also has an effect on generation time (AACB), is more complicated. To do that, we start with equation 6.5 in section 6.3 of ref. 29:
$$r \approx \frac{\log(R_0)}{T},$$
where r is the intrinsic rate of change, R0 is the net reproductive rate, and T is the mean generation time. Because only one-half of the genetic material is transmitted from a parent to an offspring, we should think of R0 as the number of children divided by two. For females, based on the estimated effects of the polygenic score X on number of children and AACB, and assuming linearity, we have
$$r(X) \approx \frac{\log[(a + bX)/2]}{c + dX}.$$
The derivative is
$$r'(X) \approx \frac{b}{(a + bX)(c + dX)} - \frac{d\,\log[(a + bX)/2]}{(c + dX)^{2}}.$$
Evaluating at X = 0 gives
$$r'(0) \approx \frac{b}{ac} - \frac{d\,\log(a/2)}{c^{2}}.$$
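From equation 6.9 of ref. 29, the relative fitness between two genotypes is
$$\frac{w_1}{w_2} \approx e^{(r_1 - r_2)T} \approx 1 + (r_1 - r_2)T,$$
where r1 and r2 are the two intrinsic rates of increase and T is the generation time. Taking r1 = r(X), the intrinsic rate for a female with score X, r2 = r(0), and T = c, the relative weight is, to the first order in X,
$$w_t \approx 1 + \left[\frac{b}{a} - \frac{d\,\log(a/2)}{c}\right] X.$$
Notice that wt is already scaled to have expectation one (approximately). Thus, the weighted average of X, with the weight proportional to fitness, is
$$E(w_t X) \approx \frac{b}{a} - \frac{d\,\log(a/2)}{c}.$$
Because this is the approximate rate of change per generation, the rate of change per year is
$$\frac{1}{c}\left[\frac{b}{a} - \frac{d\,\log(a/2)}{c}\right] = \frac{b}{ac} - \frac{d\,\log(a/2)}{c^{2}},$$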
giving us Eq. 1. Here we have shown how to derive Eq. 1 from equations in ref. 29. We note that Eq. 1 can also be derived using equations from ref. 28.
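The differentiation step in this derivation can be verified numerically. The sketch below assumes the intrinsic rate takes the form r(X) = log[(a + bX)/2]/(c + dX), which is how we read the combination of r ≈ log(R0)/T with the linearity assumptions, with log taken as the natural logarithm. The function names are ours, and the parameter values are the female POLYEDU estimates quoted in this section.

```python
import math

def intrinsic_rate(x, a, b, c, d):
    """Assumed form: r(X) = log(R0)/T, with R0 = (a + b*X)/2 and T = c + d*X."""
    return math.log((a + b * x) / 2.0) / (c + d * x)

def eq1(a, b, c, d):
    """Closed form described in the text: b/(a*c) - d*log(a/2)/c**2."""
    return b / (a * c) - d * math.log(a / 2.0) / c ** 2

def central_difference(f, h=1e-5):
    """Numerical derivative of f at 0 by a symmetric finite difference."""
    return (f(h) - f(-h)) / (2.0 * h)

a, b, c, d = 2.84, -0.084, 27.5, 0.46
numeric = central_difference(lambda x: intrinsic_rate(x, a, b, c, d))
print(numeric, eq1(a, b, c, d))
```

The two printed values should agree to many decimal places, confirming that the closed form is the derivative of the assumed r(X) at X = 0.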
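With POLYEDU, for females a = 2.84, b = −0.084, c = 27.5, and d = 0.46, and for males a = 2.73, b = −0.054, c = 30.0, and d = 0.37. Applying these values to Eq. 1, with log denoting the natural logarithm, we get
$$\beta_F = \frac{-0.084}{2.84 \times 27.5} - \frac{0.46 \times \log(2.84/2)}{27.5^{2}} \approx -0.00129$$
and
$$\beta_M = \frac{-0.054}{2.73 \times 30.0} - \frac{0.37 \times \log(2.73/2)}{30.0^{2}} \approx -0.00079,$$
so that (βM/2) + (βF/2) ≈ −0.00104 SU per year.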
For POLYUKB, with a and c unchanged, for females b = −0.069 and d = 0.39, and for males b = −0.043 and d = 0.31. Similar calculations estimate the expected change to be −0.00085 SU per year.
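The arithmetic for both scores can be reproduced as follows. This is a minimal sketch: the function names are ours, log is taken as the natural logarithm, and for the second score we assume that a and c keep the values quoted for POLYEDU, because they describe the fertility traits and not the score.

```python
import math

def rate_per_year(a, b, c, d):
    """Eq. 1 as described in the text: b/(a*c) - d*log(a/2)/c**2."""
    return b / (a * c) - d * math.log(a / 2.0) / c ** 2

def sex_averaged_rate(female, male):
    """Average of the female and male rates, (beta_M/2) + (beta_F/2)."""
    return 0.5 * rate_per_year(**female) + 0.5 * rate_per_year(**male)

polyedu = sex_averaged_rate(
    female=dict(a=2.84, b=-0.084, c=27.5, d=0.46),
    male=dict(a=2.73, b=-0.054, c=30.0, d=0.37),
)

# Assumption: a and c are unchanged for the second score.
polyukb = sex_averaged_rate(
    female=dict(a=2.84, b=-0.069, c=27.5, d=0.39),
    male=dict(a=2.73, b=-0.043, c=30.0, d=0.31),
)

print(round(polyedu, 5), round(polyukb, 5))  # prints: -0.00104 -0.00085
```

The second value matches the −0.00085 SU per year quoted in the text, and the first corresponds to roughly −0.010 SU per decade.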
SI Text
Here we explore the relationship between the rate of decline of POLYEDU and that of POLYFULL, assuming each is standardized to have mean zero and variance one. Decompose POLYFULL as
Acknowledgments
We thank David Cesarini, Philipp Koellinger, and the Social Science Genetic Association Consortium for allowing us early access to genome-wide association study (GWAS) results.
Footnotes
- ¹To whom correspondence may be addressed. Email: kong@decode.is or kari.stefansson@decode.is.
Author contributions: A.K. and K.S. designed research; A.K., H.S., A.I.Y., G.A.J., A.O., P.S., G.M., D.F.G., A.H., G.B., and U.T. performed research; A.K., M.L.F., G.T., and F.Z. analyzed data; A.K. derived the mathematical results in Materials and Methods; M.L.F. prepared the figures and tables for publication; H.S. provided IQ data and references; G.A.J. processed the IQ data to a form suitable for analyses; A.I.Y. assisted in deriving the mathematical results in Materials and Methods; A.O. provided meta-analysis results with various cohorts removed; P.S., G.M., and D.F.G. contributed to processing the Icelandic genotype data for analysis; A.H. provided key references and contributed to writing the Discussion; G.B. collected and processed Icelandic education data and provided key references; U.T. oversaw the generation of the genotype data in the laboratory; A.K. wrote the paper; and K.S. contributed to the writing of the final version of the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1612113114/-/DCSupplemental.
Freely available online through the PNAS open access option.
References
1. Branigan AR, McCallum KJ, Freese J
2. Rietveld CA, et al.; LifeLines Cohort Study
7. D’Addio AC, d’Ercole MM
8. Beauchamp JP
9. Courtiol A, Tropf FC, Mills MC
10. Fisher RA
14. Day FR, et al.
15. Arngrímsson R, et al.
17. Helgason A, Pálsson S, Gudbjartsson DF, Kristjánsson T, Stefánsson K
19. Vilhjálmsson BJ, et al.; Schizophrenia Working Group of the Psychiatric Genomics Consortium; Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) study
20. Gudbjartsson DF, et al.
26. Okbay A, et al.
28. Charlesworth B
29. Cavalli-Sforza LL, Bodmer WF
31. Flynn J
32. Rietveld CA, et al.
35. Hazan MZH
37. Wechsler D
38. Gudmundsson E
Article Classifications
- Biological Sciences
- Evolution
- Social Sciences
- Social Sciences