# A century of educational inequality in the United States

Edited by Eric Grodsky, University of Wisconsin-Madison, Madison, WI, and accepted by Editorial Board Member Mary C. Waters June 3, 2020 (received for review April 27, 2019)

## Significance

There has been widespread concern that the takeoff in income inequality in recent decades has had harmful social consequences. We provide evidence on this concern by assembling all available nationally representative datasets on college enrollment and completion. This approach, which allows us to examine the relationship between income inequality and collegiate inequalities over the full century, reveals that the long-standing worry about income inequality is warranted. Inequalities in college enrollment and completion were low for cohorts born in the late 1950s and 1960s, when income inequality was low, and high for cohorts born in the late 1980s, when income inequality peaked. This grand U-turn means that contemporary birth cohorts are experiencing levels of collegiate inequality not seen for generations.

## Abstract

The “income inequality hypothesis” holds that rising income inequality affects the distribution of a wide range of social and economic outcomes. Although it is often alleged that rising income inequality will increase the advantages of the well-off in the competition for college, some researchers have provided descriptive evidence at odds with the income inequality hypothesis. In this paper, we track long-term trends in family income inequalities in college enrollment and completion (“collegiate inequalities”) using all available nationally representative datasets for cohorts born between 1908 and 1995. We show that the trends in collegiate inequalities moved in lockstep with the trend in income inequality over the past century. There is one exception to this general finding: For cohorts at risk for serving in the Vietnam War, collegiate inequalities were high, while income inequality was low. During this period, inequality in college enrollment and completion was significantly higher for men than for women, suggesting a bona fide “Vietnam War” effect. Aside from this singular confounding event, a century of evidence establishes a strong association between income and collegiate inequality, providing support for the view that rising income inequality is fundamentally changing the distribution of life chances.

It has long been suspected that the takeoff in income inequality has made the good luck of an advantaged birth ever more consequential for accessing opportunities and getting ahead. The “income inequality” hypothesis proposes that intergenerational inequality—with respect to educational attainment, social mobility, and other socioeconomic outcomes—will increase as income inequality grows. Because this hypothesis shot to public attention with Krueger’s (1) discussion of the Great Gatsby curve, the proposition that high levels of income inequality have generated correspondingly high levels of intergenerational reproduction is now a staple of public and political discourse. Despite the prominence of this argument, the evidence in its favor is less overwhelming than might be assumed (2), and is largely limited to the empirical result that intergenerational income inheritance has increased in recent decades, at least in some analyses (3, 4). Even this result has been contested and is far from widely accepted (5).

In this paper, we assess the plausibility of the income inequality hypothesis by examining changes over the past century in the income-based gaps in college enrollment and completion. This is a field in which descriptive evidence is key: Designs that would allow for convincing causal inference are in short supply, and where designs are available, the data are not. And yet most of the descriptive evidence in regard to the college level pertains only to recent decades, when both income inequality and collegiate inequalities have increased (refs. 6–8).

The trends through earlier decades of the century, within which the great U-turn in income inequality occurred, remain largely undocumented. To overcome this evidence deficit, we might be inclined to draw on evidence on other educational outcomes, such as test scores and years of schooling. Reardon’s analysis of family income test score gaps, for example, shows steadily rising gaps between cohorts born in the 1940s and those born in the present day (ref. 9; cf. ref. 10). But test scores are quite imperfectly correlated with educational attainment, and evidence from studies of inequalities in years of schooling would support different conclusions on trend. Hilger’s (11) analysis of long-term trends using Census data shows that there was a decline in the effects of parental income on child’s education between the 1940s and 1970s, while Mare (12) shows an increasing effect of family income on higher-level educational transitions for midcentury cohorts as compared to early-century cohorts. Taking these studies together, it is difficult to reach any firm conclusion about the income inequality hypothesis, as one might infer an increase, a decrease, or stability in collegiate inequalities during the midcentury, depending on which study is considered.

Extending the time series over the whole of the past century allows for a fuller assessment of the income inequality hypothesis, as the long-run historical series on income inequality exhibits a relatively complicated pattern, as opposed to the simple increase in the recent period. In much the same way as the magnitude of changes in income inequality could only be appreciated when considered in the long run, current levels of educational inequality must be evaluated and understood in full historical context (13). In a comprehensive extension of previous research on collegiate inequalities, we thus use all nationally representative data sources that we were able to locate and access. This strengthens the descriptive evidence that can be brought to bear upon the income inequality hypothesis.

In the following sections, we discuss the available data and the methods of analysis, and present our results on long-term trends in collegiate inequalities. We will focus on inequalities in completion of 4-year college, enrollment in 4-year college, and enrollment in any college (2- or 4-year). We will demonstrate an essential similarity in inequality trends across the range of collegiate outcomes. Although we will show that income inequality is strongly associated with inequalities at the college level, we will also highlight that it is not the only force at work.

## College Enrollment and Completion in the Twentieth Century

The twentieth century was the first century in which education systems were widely diffused and, at least in principle, accessible to all social groups. The century witnessed substantial expansion at the college level: The college enrollment rate for 20- to 21-y-olds increased from around 15% for the mid-1920s birth cohorts to almost 60% for cohorts born toward the end of the century.* As Fig. 1 shows, rates of enrollment rose rapidly for cohorts born in the early century to midcentury, and flattened out and even declined for the midcentury birth cohorts, before resuming a steady increase for cohorts born in the later decades of the century.

We see in Fig. 1 a stark reversal of the gender gap in college enrollment; for birth cohorts from the mid-1950s to mid-1990s, the proportion of women enrolled in college grew by around 30 percentage points, while the corresponding increase for men was just under 20 percentage points (16, 17). The reversal occurred immediately after the rapid increase in enrollment rates observed for male birth cohorts at risk for service in the Vietnam War (16). A literature in economics has demonstrated that men born in the 1940s and 1950s were unusually likely to attend and graduate from college, although there is disagreement with respect to whether the observed increase in men’s college participation rates should be attributed to draft avoidance or to postservice GI Bill enrollments (ref. 18; cf. ref. 19).

Alongside trends in college enrollment, Fig. 1 presents rates of college completion by type of degree. While rates of completion of 2-year college are rather flat for cohorts born from the 1950s onward, rates of 4-year college completion have increased considerably. As the figure suggests, rates of 4-year college completion are highly correlated with rates of enrollment, but research shows that, over the past half-century, rates of college completion increased less sharply than rates of enrollment, because the college dropout rate increased (6, 20).

## Materials and Method

Although it is relatively straightforward to examine changes in rates of college enrollment and completion over time, it is rather less straightforward to examine income inequalities in collegiate outcomes across the span of the twentieth century, because data on parental income, college enrollment, and college completion are not routinely collected in government surveys. We must therefore piece together the trends in collegiate inequalities through the analysis of available sources of nationally representative data. We include results from the analysis of both cross-sectional surveys of adults and longitudinal surveys beginning with school-aged children, and, for a number of recent cohorts, we calculate estimates from tax data results in the public domain. Although this approach presents obvious challenges as regards comparability of data sources and measures, for much of the period that we cover, we have multiple estimates of collegiate inequalities for any given period of time. The datasets and their key characteristics are listed in Table 1; detailed descriptions of each dataset are included in *SI Appendix*.

The datasets cover cohorts born between 1908 and 1995, and it is only at the beginning and the end of the data series that our birth cohorts are represented by no more than one dataset. Although we aim to define cohorts according to year of birth, for some of the datasets we must construct quasi-cohorts based on age or grade, because year of birth was not recorded.

The biggest constraint that we face in analyzing income inequalities in collegiate attainment relates to gender. Data on the earlier birth cohorts come from the Occupational Changes in a Generation (OCG 1973) survey, which was administered in conjunction with the Current Population Survey (21). This survey was completed by men only, so we lack information on the educational attainment of women in the earliest birth cohorts. We present all results separately for men and women, so that patterns over time can be compared by gender.

The datasets were prepared to provide consistent measures of family income, college enrollment, and college completion. We produce simple binary variables that capture whether an individual completed a 4-year degree, whether an individual enrolled in (without necessarily completing) a 4-year degree program, and whether an individual enrolled in (without necessarily completing) a college program. Unfortunately, the tax data results pertain only to college enrollment per se, so we have fewer available data points for the analyses of 4-year completion and enrollment than for the analyses of enrollment in any college program. All samples are restricted to individuals who enrolled in high school, in order to maximize consistency across samples. In *SI Appendix*, we also include results for a smaller sample restricted to high school graduates (*SI Appendix*, Fig. S6).

A more difficult variable to harmonize over time is family income. Although in some datasets family income is measured directly (e.g., annual net family income in dollars), in many of the available datasets family income is measured only as an ordinal variable. For these datasets, we employ the method used by Reardon (9) to calculate test score gaps from coarsened family income data; the method uses the proportions in each income category to assign an income rank to all of those in a given category, and income rank is then the explanatory variable in the analysis (*SI Appendix*, *SI Methods*).
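The rank assignment for coarsened income data can be illustrated with a minimal sketch. It assumes that each category receives the midpoint of the cumulative-proportion range it spans; the exact procedure is the one described in *SI Appendix*, *SI Methods*, and the function name and sample below are hypothetical.

```python
from collections import Counter


def category_ranks(categories):
    """Map each ordered income category to a rank between 0 and 1.

    Assumption: a category's rank is the midpoint between the share of the
    sample below the category and the share at or below it.
    """
    counts = Counter(categories)
    n = len(categories)
    ranks = {}
    below = 0.0  # share of the sample in lower categories
    for cat in sorted(counts):
        share = counts[cat] / n
        ranks[cat] = below + share / 2
        below += share
    return ranks


# Hypothetical sample: category codes 1 (lowest income) to 3 (highest).
sample = [1] * 2 + [2] * 5 + [3] * 3
ranks = category_ranks(sample)

# Everyone in a category shares the same rank, which then serves as the
# explanatory variable in the analysis.
income_rank = [ranks[c] for c in sample]
```

With 20%, 50%, and 30% of the hypothetical sample in the three categories, the assigned ranks are 0.10, 0.45, and 0.85.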

We estimate logits predicting college enrollment and completion as a function of family income or income rank. Following Reardon (9), we fit squared and cubed terms to capture the nonlinear effects of income rank. Using the model, we estimate the enrollment and completion rates of those at the 90th percentile of family income and those at the 10th percentile. We choose the 90 vs. 10 comparison over other ways of defining inequality because it accords with past assessments and with the main source of trend in income inequality (9).^{†} From these rates, we calculate log-odds ratios capturing, for example, the log-odds of completing a 4-year college degree for the 90 vs. 10 family income comparison.
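As a rough illustration of the final step, the sketch below takes the coefficients of a logit with linear, squared, and cubed income-rank terms as given and derives the 90 vs. 10 log-odds ratio from the predicted rates. The coefficient values are hypothetical and are not estimates from our models.

```python
import math


def predicted_probability(coefs, rank):
    """Predicted rate from a logit with rank, rank^2, and rank^3 terms."""
    b0, b1, b2, b3 = coefs
    eta = b0 + b1 * rank + b2 * rank**2 + b3 * rank**3
    return 1.0 / (1.0 + math.exp(-eta))


def log_odds(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def log_odds_ratio(coefs, hi=0.9, lo=0.1):
    """Difference in log-odds between two income ranks (default: 90 vs. 10)."""
    p_hi = predicted_probability(coefs, hi)
    p_lo = predicted_probability(coefs, lo)
    return log_odds(p_hi) - log_odds(p_lo)


# Hypothetical coefficients: intercept, rank, rank^2, rank^3.
coefs = (-2.5, 1.0, 0.5, 1.0)
lor_90_10 = log_odds_ratio(coefs)
```

Because the log-odds are linear in the model terms, the intercept cancels and the ratio depends only on the three rank coefficients; for the hypothetical values above it equals 0.8 + 0.4 + 0.728 = 1.928.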

We would be remiss if we did not note the difficulty in measuring family income reliably, particularly using one-shot measures, which are all that are available in almost all of the datasets that we analyze. Further worries might arise because some of the income measures are retrospective, or because the questions are asked of children, not parents. Although we would not minimize the danger of retrospection or of using children’s reports of family income, evidence suggests that child reports of parental socioeconomic characteristics are not substantially worse than parental reports of those characteristics (9, 22). Furthermore, the types of errors that individuals make when reporting income appear to have changed very little over time (23), which is the key issue when mapping trend. To address concerns about the varying quality of the family income data, we multiply all log-odds ratios by a reliability-based adjustment factor (see *SI Appendix*, Table S5 for reliability estimates) (9).

We recognize that “researcher degrees of freedom” are of particular concern when presenting results from a large number of datasets (24). We provide additional results based on alternative specifications, in *SI Appendix*, and make our analysis code publicly available on Open Science Framework, https://osf.io/jxne5.

## The Great U-turn in Collegiate Inequality

We now examine collegiate inequalities for cohorts born between 1908 and 1995. Given data constraints, we are limited to examining inequalities over the whole period for men only, but we present results for women for a more limited range of birth cohorts.

In Fig. 2 we present, for the full male series, the estimated probabilities of completing 4-year college at the 90th and 10th percentiles of family income.^{‡} We see in Fig. 2 that the increase in 4-year college degree attainment over the twentieth century was far from equally distributed across income groups. Men from the 90th percentile of family income were at the leading edge of the expansion; the figure shows a rapid increase in college completion rates through the 1940s birth cohorts, then a tailing off through the 1950s cohorts, followed by a further rapid increase for those cohorts born in the 1960s onward. In contrast, expansion at the bottom of the income distribution was more sluggish; 4-year college completion rates at the 10th percentile were less than 10 percentage points higher for cohorts born at the end of the century than for cohorts born at the beginning.

Fig. 2 shows that absolute differences in completion rates between income groups increased from the beginning to the end of the century. But this important result must be considered alongside changes over the century in the overall completion rate (12). Although the probability gap was small at the beginning of the century, the odds of college completion were around 7 times higher for the rich than for the poor, because the rich were able to secure a large proportion of the limited number of college slots. In relative terms, the poor born in the early century were more disadvantaged than their counterparts born in the 1960s, when 90 vs. 10 gaps in the probability of college completion were substantially larger. Although both probability gap and odds-ratio measures are informative, we focus from this point forward on odds-ratio measures of educational inequality, which are margin insensitive and thus feature relative—rather than absolute—advantage. But, in *SI Appendix*, we present probability plots for the three collegiate outcomes (*SI Appendix*, Fig. S1), and include analyses based on probability gaps in *SI Appendix*, Table S3. The key results hold for both types of analysis.
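The contrast between the two measures can be made concrete with a small sketch. The completion rates below are hypothetical values chosen only to reproduce the qualitative pattern described above; they are not estimates from Fig. 2.

```python
def odds(p):
    """Odds corresponding to a probability."""
    return p / (1.0 - p)


def gap_and_odds_ratio(p_high, p_low):
    """Return the probability gap and the odds ratio for two groups."""
    return p_high - p_low, odds(p_high) / odds(p_low)


# Hypothetical early-century rates: completion is rare for everyone.
early_gap, early_or = gap_and_odds_ratio(0.10, 0.015)

# Hypothetical midcentury rates: completion is more common overall.
late_gap, late_or = gap_and_odds_ratio(0.40, 0.13)
```

In this illustration the probability gap roughly triples between the two periods (0.085 to 0.27), while the odds ratio falls from about 7.3 to about 4.5, so the two measures rank the periods in opposite order.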

We plot, in Fig. 3, the 90 vs. 10 log-odds ratios describing inequalities in collegiate outcomes for each of the datasets in our analyses, with trends estimated from generalized additive models (GAM). The GAMs are fitted to the plotted data points, with each point weighted by the inverse of the SE for the estimate.^{§} In the earlier period covered by OCG, we fit the model to the estimates derived from analyses of single birth cohorts, but present point estimates representing groups of birth cohorts to show the consistency across these specifications. Confidence intervals are presented in *SI Appendix*, Fig. S2; figures showing 90 vs. 50 and 50 vs. 10 inequalities are included as *SI Appendix*, Figs. S3 and S4.

We focus first on describing the trends for men, for whom we have results spanning the whole century. It is clear from Fig. 3 that the over-time trends are similar across the various collegiate outcomes and, further, that there is no simple secular trend for any of the outcomes under consideration. There are three key attributes of the trends that should be emphasized.

First, Fig. 3 shows that, toward the middle of the century, there was a great U-turn in collegiate inequality. Inequalities fell rapidly for cohorts born in the early to mid-1950s, then bottomed out until the mid-1960s, before ultimately rising steeply for cohorts born from the mid-1960s onward. The U-turn appears to be more pronounced for 4-year and “any college” enrollment than for completion of a 4-year degree, but it is present for all of the collegiate outcomes under consideration.

Had we measured collegiate inequalities in but a single dataset, we might be skeptical that our observed trend was on the mark and, in particular, that there was a rapid fall in inequality for the midcentury birth cohorts. But this trend is supported across all of the datasets from the period: OCG and National Longitudinal Study (NLS) Young Men show high inequality in the early 1950s; Panel Study of Income Dynamics (PSID), NLS72, and High School and Beyond (HS&B) pick up the lower inequality of the mid-1950s to the mid-1960s; and the subsequent uptick in inequality is captured in PSID, the school cohort surveys, and the National Longitudinal Studies of Youth (NLSY79&97). Indeed, Fig. 3 demonstrates that there is great consistency across a large number of different data sources.^{¶} At the trough, inequality in 4-year college completion was reduced to a log-odds ratio of around 1.5, indicating that, even in this low-inequality period, the odds of those at the 90th income percentile completing a 4-year college degree were almost 4.5 times greater than the equivalent odds for those at the 10th percentile. Inspection of *SI Appendix*, Fig. S3 suggests that the U-turn observed in Fig. 3 is largely driven by changes in the top half of the income distribution: the U-turn is rather more pronounced for the 90 vs. 50 comparison than for the 50 vs. 10 comparison.

Second, if skepticism about a midcentury fall in collegiate inequality were to be sustained, suspicion would also have to fall upon all currently accepted results on over-time trends, which demonstrate a substantial increase in inequalities in college enrollment and completion between cohorts born in the midcentury and late century. If we were to impose a simple linear smooth on the century-long data series, this would indicate relatively modest increases in collegiate inequalities over the period taken as a whole (see dashed lines, Fig. 3).^{#} Again, because the trends are mapped using multiple datasets, we are confident that the pattern of a U-turn in collegiate inequality is supported.

Third, any evidence of a U-turn must bring to mind the pattern of income inequality over the past century. As Piketty and Saez (27) described, toward the middle of the twentieth century, the share of income going to the top 10% rapidly declined, before rising again over the later decades of the century. The U-turn in collegiate inequality mimics this trend, although it is notable that, insofar as we see similarity in patterns of income inequality and collegiate inequalities, it is income inequality around year of birth that appears to matter most. But, despite the obvious similarities, there is at least one clear divergence in the pattern of collegiate inequality and income inequality: The U-turn in collegiate inequality comes very late. Income inequality begins to fall in the early 1940s, but inequalities in enrollment and completion begin to decline only for cohorts born in the mid-1950s. Men born in the mid-1940s onward were not just born into a period of low inequality, but they spent most of their formative years in a low-inequality society. Despite this, the evidence shows that collegiate inequality increased substantially for the cohorts born in the 1940s and early 1950s; the log-odds ratios describing inequality are increased by around a third over this short period.

Some of the same key features are visible in the results for women, shown in Fig. 3, *Right*, although we only have access to data for women born after 1950. We see a basic similarity with the men’s analyses from the mid-1950s birth cohorts onward: Collegiate inequalities are relatively flat for the 1950s to 1960s birth cohorts, and increase for women born in the 1970s and onward. Just as with men, toward the end of the period we see flat and even declining inequalities in enrollment and completion. There are perhaps some subtle differences in the pattern by gender—the upturn in collegiate inequality begins, for example, several years later for women than for men—but we have little evidence here to support a conclusion of substantial difference in inequality for men and women over this period.

There is one notable difference between the men’s and women’s results, relating to the period when trends in male collegiate inequality substantially diverged from trends in income inequality. This exceptional period appears to be exceptional for men, but not for women. Although we cannot track collegiate inequalities for women across the whole midcentury period, the first data points in the female data series (NLS Young Women: 1951–1953 birth cohorts) are lower than the nearby estimates for men (NLS Young Men: 1949–1951 birth cohorts).** This period of divergence between collegiate inequality and income inequality coincides with the period that we identified above as holding special consequences for men’s educational attainment: Men born in the 1940s and early 1950s were subject to the threat of military service in the Vietnam War.

There are no cohort studies of women that would allow us to compare male and female inequalities in college enrollment and completion throughout this period. We do, however, have access to data on men who fathered children who were at risk for service during the Vietnam War: The NLS Older Men survey can be used to track collegiate inequalities for the children of men who were aged 45 y to 59 y in 1966. The structure of this dataset is somewhat different from the datasets underlying our time series, but we nevertheless find confirmation, in Fig. 4, that male and female inequalities diverged in the Vietnam years.

In the pre-Vietnam period, male and female collegiate inequalities were of similar magnitude. The log-odds ratio for 4-year enrollment, for example, was 2.3 for men (95% CI: 1.5, 3.1), as compared to 2.4 for women (1.7, 3.2). But, for the birth cohorts at risk for serving in Vietnam, the male log-odds ratio increased slightly, to 2.5 (1.8, 3.2), while inequality fell substantially for women, to 1.4 (0.8, 2.0) (see *SI Appendix*, Fig. S8 for a figure with CIs). These results provide support for the claim that men’s collegiate inequality was substantially and artificially raised relative to expected levels during this period because of the Vietnam War. Unfortunately, our data are not well-suited to evaluating why male and female collegiate inequality differed in the Vietnam period. But some evidence can be brought to bear on this question by comparing preservice and postservice inequalities in college participation for the men in OCG (*SI Appendix*, Fig. S9). These data are more consistent with a draft-induced increase in male collegiate inequality than with a GI Bill-induced increase.^{††}

Bringing the results in Fig. 4 together with what is known about college enrollment and completion patterns during the Vietnam War period, it seems likely that the disproportionate increase in men’s college participation rates observed in Fig. 1 was achieved, at least in part, through a gender-specific change in the effect of family income on college enrollment and completion.

## The Association between Income Inequality and Collegiate Inequality

We now present a formal statistical test of the strength of the association between income inequality and collegiate inequality. We regress the log-odds for collegiate inequalities on income inequality, as measured through the share of wages going to the top 10% (27).^{‡‡} In addition to the income inequality variable, for the full male series (1908–1995), we fit a “Vietnam effect,” with a dummy variable that isolates the cohorts at risk from the draft lotteries (i.e., 1944–1952 birth cohorts). We fit models to the full male series (1908–1995 birth cohorts), a compressed male series (1952–1995 birth cohorts), and the female series (1951–1995 birth cohorts). A full regression table with coefficients and standard errors is included as *SI Appendix*, Table S4.^{§§} In Fig. 5, we present estimates of the predicted increase in the log-odds ratios for an eight percentage point increase in the share of wages going to the top 10%; this increase is equivalent to the “takeoff” in income inequality that occurred between the midcentury and the 1990s.^{¶¶}
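The structure of this regression can be sketched as follows, under simplifying assumptions: an unweighted least-squares fit, a handful of hypothetical cohort-level observations, and noise-free outcomes generated from known coefficients so that the fit can be checked. None of the numbers are estimates from *SI Appendix*, Table S4.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical observations: (top-10% wage share in percent, Vietnam dummy).
data = [(34.0, 0), (32.0, 0), (27.0, 1), (26.0, 1),
        (26.0, 0), (28.0, 0), (31.0, 0), (34.0, 0)]

# Hypothetical 90 vs. 10 log-odds ratios, generated without noise.
y = [-1.0 + 0.1 * share + 0.5 * vietnam for share, vietnam in data]

rows = [[1.0, share, float(vietnam)] for share, vietnam in data]
intercept, b_share, b_vietnam = ols(rows, y)

# Predicted change in the log-odds ratio for an eight percentage point
# increase in the top-10% share.
predicted_increase = 8 * b_share
```

The quantity plotted in Fig. 5 corresponds to `predicted_increase` in this sketch: the income inequality coefficient scaled by the eight-point takeoff.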

The regression coefficients describing the associations between income inequality and 90 vs. 10 collegiate inequalities can be straightforwardly decomposed into two parts: an association between income inequality and the 90 vs. 50 log-odds ratio, and an association between income inequality and the 50 vs. 10 log-odds ratio. In Fig. 5, the total height of each bar represents the predicted increase in the 90 vs. 10 log-odds ratio for an eight percentage point increase in income inequality, while the dark and light gray bars show the predicted increases in the 90 vs. 50 and 50 vs. 10 log-odds ratios, respectively.
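The decomposition rests on a simple identity: the 90 vs. 10 log-odds ratio is the sum of the 90 vs. 50 and 50 vs. 10 log-odds ratios. A short check with hypothetical predicted probabilities:

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


# Hypothetical predicted rates at the 10th, 50th, and 90th income percentiles.
p10, p50, p90 = 0.13, 0.22, 0.40

lor_90_50 = logit(p90) - logit(p50)
lor_50_10 = logit(p50) - logit(p10)
lor_90_10 = logit(p90) - logit(p10)

# The two components sum to the 90 vs. 10 log-odds ratio.
difference = lor_90_10 - (lor_90_50 + lor_50_10)
```

Because least-squares coefficients are linear in the outcome, regressing each component on the same regressors yields coefficients that sum to the coefficient for the 90 vs. 10 outcome, which is why the bar segments in Fig. 5 stack to the total.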

Examining first the results for the 90 vs. 10 comparison, we see confirmation of a relatively strong association between income inequality and collegiate inequality over the full sweep of the twentieth century. For women, for example, the model predicts that an increase in income inequality equivalent to that observed in the takeoff period would increase the 90 vs. 10 log-odds ratio by around 1 for 4-year enrollment and completion, and by around 1.3 for enrollment in any college. Although there is variation in the strength of the association for the different outcome measures, the income inequality effects are large and positive in all of the analyses, indicating substantial support for the income inequality hypothesis.

Given that the takeoff in income inequality was largely characterized by the top of the income distribution moving away from the middle and bottom of the distribution, the income inequality hypothesis would predict larger effect sizes for the 90 vs. 50 comparison than for the 50 vs. 10 comparison. When we decompose the 90 vs. 10 results into 90 vs. 50 and 50 vs. 10 components, we see precisely this result. The income inequality effects for the 90 vs. 50 comparisons in all cases outweigh those for the 50 vs. 10 comparisons, particularly in the analyses of 4-year college enrollment and completion.

But the results also provide grounds for exercising caution when interpreting differences in effect sizes across the models, as the effect sizes in the full and compressed male series are more similar for the “any college” analyses than for the 4-year analyses, where the sample sizes are smaller. Even when analyzing all available datasets and exploiting the full range of variation in income inequality over the century, our statistical power is limited. This is even more clear when we extend the models summarized in Fig. 5 to include additional macro-level regressors that social scientists have previously used to predict inequalities at the college level. These additional variables include the economic returns to schooling, which are assumed to influence individual decisions about whether or not to invest in college education (33), and the high school graduation rate, which has been shown to influence educational expansion at the college level (34). As shown in *SI Appendix*, Table S1, estimates from these models are more volatile, particularly for women.

The volatility arises because some of our analyses are, like past analyses, limited to more recent cohorts in which the takeoff assumes a monotonically increasing form. This makes it difficult to adjudicate between the large number of monotonically increasing potential causes. An important advantage of our full-century approach is that it reaches back to a time in which these competing causes did not always move together. In Fig. 6, we present the results of a simulation exercise, in which we run 1,000 regressions for a range of different model specifications on the full and compressed male series, with each regression including a new variable containing random numbers drawn from a normal distribution (μ = 0; σ = 1). We examine the stability of the income inequality effects with respect to inequality in college enrollment, for which we have the largest number of data points. We add to the basic model in Fig. 5 controls for time, either in the form of 1) a linear effect of year or 2) dummies for decades, and measures of the returns to schooling (33, 35, 36) and the high-school graduation rate (34, 37).

As Fig. 6 shows, the income inequality effects estimated for the full male series are robust to the inclusion of other potential confounding variables. But Fig. 6 also highlights the extent to which a proper evaluation of the income inequality hypothesis requires researchers to exploit all of the available data. Although the bivariate analysis shows a similar effect of the income inequality variable in both the full and compressed series, the effects are a good deal more volatile in the more highly parameterized models in the compressed relative to the full series.*** The substantive implication of this analysis is clear: It is only with the full data series that we obtain relatively precise and reliable estimates of the association between inequality in collegiate outcomes and income inequality.

## Discussion

We have examined descriptive evidence on the association between inequality in collegiate attainment and income inequality over the past century. Although there has been much recent interest in the income inequality hypothesis, it has been difficult to make headway because commonly used datasets pertain only to recent decades, when income inequality was increasing. We have thus proceeded by reaching back to the very beginning of the twentieth century, assembling all of the available datasets, and harmonizing the variables in these datasets.

The results show that collegiate inequalities and income inequality are, in fact, rather strongly associated over the twentieth century. Just as with income inequality, we see evidence of a U-turn in 90 vs. 10 collegiate inequality, and evidence of a substantial takeoff in collegiate inequalities in recent decades. When we examine trends in 90 vs. 50 and 50 vs. 10 inequalities, we find that the 90 vs. 50 trends mirror the 90 vs. 10 results. Taken together, our results offer solid descriptive support for the income inequality hypothesis.

Inequalities in collegiate attainment increased hand in hand with the expansion of college education in the United States. Rates of college enrollment and completion were higher at the end of the century than they had been at any time in the preceding hundred years, and yet, for these birth cohorts, we see substantial inequalities, as captured in both percentage point gap and odds ratio measures. In point of fact, the only time during the twentieth century for which we observe a reduction in educational inequality is during the period when expansion at the college level had paused. Although the counterfactual is obviously not observable, these results emphasize the importance of attending to the distribution of college opportunities in addition to overall levels of attainment. These distributional questions will take on even greater significance in the context of the economic and social crisis engendered by coronavirus disease 2019, a crisis that is likely to have enduring effects on both the distribution of income and access to the higher education sector.

Our analyses are not well suited to evaluating the mechanisms generating the association between income inequality and collegiate inequalities. However, given the pattern of collegiate inequality across the century, we suspect that a mechanical effect is likely to be responsible. If money matters, as we know it does, and growing income inequality delivers more money to the top, then, all else being equal, these additional dollars would in themselves produce growing inequality in college enrollment and completion. The mechanical effect is therefore a parsimonious account of the trend that we see here (8). That the over-time associations are substantially stronger for the 90 vs. 50 comparison than for the 50 vs. 10 comparison provides further suggestive evidence in this regard. Nevertheless, there is one period for which we do hypothesize an increase in the relational effect of income: the Vietnam War. For the war to lead to increased collegiate inequality, the effect of income on educational attainment would have to increase, particularly given that income inequality was low and stable for these birth cohorts.
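The mechanical effect can be illustrated with a small worked example. The sketch below is a stylized illustration under stated assumptions, not an estimate from the paper: it assumes that family income is lognormally distributed, that the probability of enrollment follows a fixed logit function of log income, and that the intercept, slope, median income, and dispersion values are hypothetical. Holding the income-enrollment relationship constant, widening the income distribution alone raises the 90 vs. 10 gap in enrollment.

```python
import math

# 90th percentile of the standard normal distribution.
Z90 = 1.2815515655446004


def enroll_prob(log_income, intercept=-11.0, slope=1.0):
    """Hypothetical fixed logit relationship between log income and enrollment."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * log_income)))


def gap_90_10(sigma, mu=math.log(50_000)):
    """Percentage point gap in enrollment between the 90th and 10th income
    percentiles, assuming lognormal income with log-mean mu and log-SD sigma."""
    p90 = enroll_prob(mu + Z90 * sigma)
    p10 = enroll_prob(mu - Z90 * sigma)
    return p90 - p10


# Same income-enrollment relationship, two hypothetical levels of income inequality.
gap_low = gap_90_10(sigma=0.5)
gap_high = gap_90_10(sigma=0.9)
print(f"90 vs. 10 gap, low income inequality:  {gap_low:.3f}")
print(f"90 vs. 10 gap, high income inequality: {gap_high:.3f}")
```

A relational effect, by contrast, would correspond to a change in the `slope` parameter itself, which is the kind of change hypothesized for the Vietnam War cohorts.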

Whatever the mechanisms may be, the key descriptive result is that, over the course of the twentieth century, a grand U-turn in collegiate inequality occurred. Cohorts born in the middle of the century witnessed the lowest levels of inequality in college enrollment and completion seen over the past hundred years. Contemporary birth cohorts, in contrast, are experiencing levels of collegiate inequality not seen for generations.

## Data Availability

The analysis code and auxiliary data required to produce the figures and tables in this paper can be accessed at https://osf.io/jxne5. Code to produce estimates for each of the individual datasets (see Table 1) is also provided. Details on how to access these datasets are provided in *SI Appendix* (most datasets are available for download upon registration with the data provider, while others are accessible only with a restricted use license from the National Center for Education Statistics).

## Acknowledgments

We thank David Cox, David Grusky, and Florencia Torche for their detailed comments on earlier versions of this paper, and also Raj Chetty, Maximilian Hell, Robb Willer, the Cornell Mobility Conference, the Stanford Inequality Workshop, the Stanford Sociology Colloquium Series, and University of California, Los Angeles’s California Center for Population Research seminar for useful suggestions. Additionally, we thank Stanford’s Center for Poverty and Inequality, Russell Sage Foundation and Stanford’s United Parcel Service (UPS) Fund for research funding, Stanford’s Institute for Research in the Social Sciences for secure data room access, and the American Institutes for Research for data access. We are grateful to the editor and reviewers for their helpful and productive suggestions.

## Footnotes

^{1}To whom correspondence may be addressed. Email: mvjsoc{at}stanford.edu.

Author contributions: M.J. and B.H. designed research; M.J. and B.H. analyzed data; and M.J. wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission. E.G. is a guest editor invited by the Editorial Board.

Data deposition: Code for data analysis is archived on Open Science Framework (https://osf.io/jxne5).

^{*}Throughout this paper, we use the term “college” as a shorthand for “2- or 4-year college.”

^{†}We also include results based on comparing income quartiles in *SI Appendix*, Fig. S5.

^{‡}The probabilities are estimated from the logit model, and we fit a GAM to establish trend. See *SI Appendix*, *SI Methods* for more details.

^{§}We determine the appropriate number of degrees of freedom for the trend lines by fitting a series of GAMs and comparing model fit (using the Akaike Information Criterion). For the analysis of college enrollment for male birth cohorts, we use the stepwise model builder in R’s *gam* package to find the best-fitting model (25, 26). As we have fewer point estimates in the other analyses, the stepwise approach is less reliable, and we therefore choose smoothing parameters that provide a reasonable (and conservative) summary of the trend.

^{¶}It is also clear that some datasets are outliers from the trend. It is not surprising to see variation across samples, and we highlight this variation only because it illustrates a potential danger of using but one or two datasets to establish a trend. The estimates for National Education Longitudinal Study (NELS) (1974), for example, are substantially higher than the surrounding estimates based on one-shot income measures, and there is a surprising degree of cross-cohort volatility in the PSID estimates.

^{#}The linear trend is strongest for 4-year completion, and weakest for enrollment in 4-year college. For all collegiate outcomes, the GAM offers a significant improvement in fit over the simple linear model.

^{**}It would be possible to track male and female educational inequality with respect to parental education or socioeconomic index scores (SEI) (28), but the sample sizes are, unfortunately, too small for a detailed analysis of gender differences in educational attainment by birth cohort. This approach is also unattractive given that parental education, parental income, and SEI were only weakly correlated in this period (29).

^{††}Note that, while previous research has suggested that high-socioeconomic status (SES) individuals might have taken advantage of the GI Bill to a greater extent than low-SES individuals (30), *SI Appendix*, Fig. S9 provides little evidence that collegiate inequality was substantially affected. See *SI Appendix* for further discussion of this point.

^{‡‡}We choose the wages measure because, for the bottom of the income distribution, wages are a more important component of income than the types of income included in the alternative measures (e.g., capital gains). We measure wage inequality in year of birth. Surprisingly, given the prominence of the income inequality hypothesis, there is not yet adequate guidance in the literature as to the age at which income inequality most influences outcomes, although in the “money matters” literature there has been particular emphasis on the prenatal period, the postnatal period, and early childhood as the lifecourse moments when money matters most (31, 32).

^{§§}In the 4-year analyses, we weight the data by the inverse of the standard errors underlying the estimates. In the analysis of any college enrollment, we do not weight the data, as this data series includes the tax data estimates. Given the size of the samples underlying these estimates, weighting would allow the relationship that pertains in the tax data for cohorts born in the 1980s and 1990s to have a disproportionate influence on the estimated century-long relationship between income inequality and inequality in college enrollment.

^{¶¶}The estimates in Fig. 5 are obtained by multiplying the income inequality coefficients in *SI Appendix*, Table S4 by 0.08.

^{***}See *SI Appendix*, Fig. S10 for similar figures for 4-year enrollment and completion.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1907258117/-/DCSupplemental.

Copyright © 2020 the Author(s). Published by PNAS.

This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

## References

- ↵
- A. B. Krueger

*Speech at the Center for American Progress* (Washington, DC, 12 January 2012). https://cdn.americanprogress.org/wp-content/uploads/events/2012/01/pdf/krueger.pdf. Accessed 8 July 2020.
- ↵
- ↵
- J. Davis,
- B. Mazumder

- ↵
- ↵
- ↵
- G. J. Duncan,
- R. J. Murnane

- M. J. Bailey,
- S. M. Dynarski

- ↵
- ↵
- G. J. Duncan,
- A. Kalil,
- K. M. Ziol-Guest

- ↵
- G. J. Duncan,
- R. J. Murnane

- S. F. Reardon

- ↵
- E. A. Hanushek,
- P. E. Peterson,
- L. M. Talpey,
- L. Woessmann

- ↵
- N. G. Hilger

- ↵
- R. D. Mare

- ↵
- D. Hirschman

- ↵
- Census Bureau

- ↵
- S. Flood,
- M. King,
- R. Rodgers,
- S. Ruggles,
- J. R. Warren

*Integrated Public Use Microdata Series (IPUMS), Current Population Survey: Version 6.0*. IPUMS, 2018. https://cps.ipums.org/cps/. Accessed 8 July 2020.
- ↵
- ↵
- T. A. DiPrete,
- C. Buchmann

- ↵
- ↵
- J. D. Angrist,
- S. H. Chen

- ↵
- ↵
- D. L. Featherman,
- R. M. Hauser

- ↵
- R. M. Hauser,
- M. Andrew

- ↵
- J. C. Moore,
- L. L. Stinson,
- E. J. Welniak

- ↵
- ↵
- J. M. Chambers,
- T. J. Hastie

- T. J. Hastie

- ↵
- T. J. Hastie

- ↵
- ↵
- G. J. Duncan,
- R. J. Murnane

- M. Hout,
- A. Janus

- ↵
- O. D. Duncan,
- D. L. Featherman,
- B. Duncan

- ↵
- ↵
- ↵
- H. Hoynes,
- D. Whitmore Schanzenbach,
- D. Almond

- ↵
- C. Goldin,
- L. F. Katz

- ↵
- ↵
- ↵
- C. R. Hulten,
- V. A. Ramey

- R. G. Valletta

- ↵
- T. D. Snyder,
- C. de Brey,
- S. A. Dillow

## Article Classifications

- Social Sciences