# Bayesian model reveals latent atrophy factors with dissociable cognitive trajectories in Alzheimer’s disease

Edited by James L. McClelland, Stanford University, Stanford, CA, and approved August 23, 2016 (received for review July 14, 2016)

## Significance

Alzheimer’s disease (AD) affects 10% of the elderly population. The disease remains poorly understood with no cure. The main symptom is memory loss, but other symptoms might include impaired executive function (ability to plan and accomplish goals; e.g., grocery shopping). The severity of behavioral symptoms and brain atrophy (gray matter loss) can vary widely across patients. This variability complicates diagnosis, treatment, and prevention. A mathematical model reveals distinct brain atrophy patterns, explaining variation in gray matter loss among AD dementia patients. The atrophy patterns can also explain variation in memory and executive function decline among dementia patients and at-risk nondemented participants. This model can potentially be applied to understand brain disorders with varying symptoms, including autism and schizophrenia.

## Abstract

We used a data-driven Bayesian model to automatically identify distinct latent factors of overlapping atrophy patterns from voxelwise structural MRIs of late-onset Alzheimer’s disease (AD) dementia patients. Our approach estimated the extent to which multiple distinct atrophy patterns were expressed within each participant rather than assuming that each participant expressed a single atrophy factor. The model revealed a temporal atrophy factor (medial temporal cortex, hippocampus, and amygdala), a subcortical atrophy factor (striatum, thalamus, and cerebellum), and a cortical atrophy factor (frontal, parietal, lateral temporal, and lateral occipital cortices). To explore the influence of each factor in early AD, atrophy factor compositions were inferred in beta-amyloid–positive (Aβ+) mild cognitively impaired (MCI) and cognitively normal (CN) participants. All three factors were associated with memory decline across the entire clinical spectrum, whereas the cortical factor was associated with executive function decline in Aβ+ MCI participants and AD dementia patients. Direct comparison between factors revealed that the temporal factor showed the strongest association with memory, whereas the cortical factor showed the strongest association with executive function. The subcortical factor was associated with the slowest decline for both memory and executive function compared with temporal and cortical factors. These results suggest that distinct patterns of atrophy influence decline across different cognitive domains. Quantification of this heterogeneity may enable the computation of individual-level predictions relevant for disease monitoring and customized therapies. Factor compositions of participants and code used in this article are publicly available for future research.

- mental disorder subtypes
- Alzheimer’s disease subtypes
- Alzheimer’s disease heterogeneity
- voxel-based morphometry
- unsupervised machine learning

Alzheimer’s disease (AD) dementia is a devastating neurodegenerative disease that affects 11% of individuals over age 65 with no disease-modifying treatment available. Accurate in vivo biomarkers are urgently needed to assist in early detection of at-risk individuals, improve diagnosis, monitor disease progression, and serve as outcome measures in clinical trials.

Although AD is typically associated with an amnestic clinical presentation and disruption of the medial temporal lobe (1), it has become increasingly clear that heterogeneity exists within this disease. Specifically, heterogeneity has been observed in the clinical presentation of AD (2) and the spatial distribution of neurofibrillary tangles (NFTs) (3, 4) as well as the presence of comorbid pathologies, such as vascular disease, Lewy bodies, and transactive response DNA binding protein 43 kDa (TDP-43) (5, 6). Interestingly, the spatial distribution of atrophy varies across AD subtypes defined on the basis of NFT distribution (7), suggesting that analyses of gray matter (GM) patterns are useful to characterize heterogeneity in AD. Furthermore, although distinct atrophy patterns have been observed in patients who clearly show atypical clinical presentations (8), heterogeneity in GM atrophy has also been reported among late-onset AD cases (9). It is, therefore, likely that the ability to quantify varying patterns of atrophy among AD patients will help inform our understanding of fundamental disease processes.

In this study, we sought to explore the heterogeneity of atrophy patterns in late-onset AD using a data-driven Bayesian framework that accounted for and estimated latent AD atrophy factors derived from structural MRI data. The mathematical framework that we used, latent Dirichlet allocation (LDA) (10), has been successfully used to extract overlapping brain networks from functional MRI (11) and metaanalytic data (12, 13). Importantly, this approach does not require the atrophy pattern of an individual to be determined by a single atrophy factor. Instead, the model allows the possibility that multiple latent factors are expressed to varying degrees within an individual. For example, the atrophy pattern of a patient might be 90% owing to factor 1 and 10% owing to factor 2, whereas the atrophy pattern of another patient might be 60% owing to factor 1 and 40% owing to factor 2. Given that multiple contributors that are not mutually exclusive may influence heterogeneity in AD, such as the spatial location of NFT pathology (3, 4), coexisting non-AD pathologies (5, 14), and genetics (9), we believe that it is more biologically plausible that individuals express varying degrees of distinct atrophy factors rather than one single factor. Thus, the LDA approach is particularly well-suited for these analyses and will provide insight into whether expressing multiple atrophy factors is common among late-onset AD patients.

Most studies investigating the heterogeneity of AD have examined patients soon after AD onset or at advanced AD stages (3, 7, 15–17). However, the pathophysiological processes of AD begin at least a decade before clinical diagnosis (18), suggesting that the emergence of this heterogeneity may occur before the onset of clinical dementia. In this study, we, therefore, examined how distinct atrophy factors identified in AD dementia patients were associated with longitudinal cognitive decline early in nondemented participants who were at risk for AD dementia based on elevated beta-amyloid (Aβ) (19–21).

Our study makes three significant contributions. First, we introduced an innovative modeling strategy where expressions of multiple atrophy patterns are estimated rather than assigning each participant to a single subtype. Second, our approach harnesses the rich multidimensional information across all GM voxels, avoiding the need for a priori selection of regions and enabling an in-depth exploration of atrophy patterns. Third, application of this approach to participants spanning the clinical spectrum revealed that latent atrophy factors are associated with distinct memory and executive function trajectories, providing insights into the impact of disease heterogeneity throughout the prolonged course of AD.

## Results

### Overall Approach.

Our approach involved three main steps. In step I, we performed LDA (a Bayesian model) (10) to estimate latent atrophy factors in 188 AD dementia patients and used this model to extract factor compositions in two independent samples of nondemented participants: 147 Aβ+ mild cognitively impaired (MCI) and 43 Aβ+ cognitively normal (CN) participants. In step II, we examined robustness across different analytic approaches and investigated characteristics of the factor compositions across participants. In step III, we examined the associations between atrophy factors and different cognitive domains (memory and executive function). The results of each step are described in detail below.

### Step I. Discovering Latent Atrophy Factors in AD Dementia Patients.

We used the Bayesian LDA model (10) to encode our assumption that a patient expresses one or more latent atrophy factors (Fig. 1). The LDA model was applied to the structural MRI of 188 AD dementia patients. Given the voxelwise GM density values derived from structural MRI (FSL-VBM) (22) and a predefined number of factors *K*, the model is able to estimate the probability that a particular factor is associated with atrophy at a specific spatial location [i.e., Pr(Voxel | Factor) or probabilistic atrophy map of the factor] and the probability that an individual expresses each atrophy factor [i.e., Pr(Factor | Patient) or atrophy factor composition of the individual]. Importantly, resulting atrophy factors were not predetermined but estimated from data (*Materials and Methods*).
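The relationship between the two estimated quantities can be illustrated with a toy calculation. The sketch below is not the authors' code: it assumes a hypothetical brain with only four "voxels" and made-up values for Pr(Voxel | Factor), and it shows how a participant's factor composition Pr(Factor | Patient) mixes the factor maps into an expected atrophy profile. The function name `expected_atrophy` and all numbers are illustrative assumptions.

```python
# Hypothetical toy maps: each row is Pr(Voxel | Factor) over 4 voxels and sums to 1.
pr_voxel_given_factor = {
    "temporal":    [0.70, 0.10, 0.10, 0.10],
    "subcortical": [0.10, 0.70, 0.10, 0.10],
    "cortical":    [0.10, 0.10, 0.40, 0.40],
}


def expected_atrophy(composition, maps):
    """Mix the factor maps using a participant's Pr(Factor | Patient)."""
    if abs(sum(composition.values()) - 1.0) > 1e-9:
        raise ValueError("factor composition must sum to 1")
    n_voxels = len(next(iter(maps.values())))
    mixed = [0.0] * n_voxels
    for factor, weight in composition.items():
        for v, p in enumerate(maps[factor]):
            mixed[v] += weight * p  # Pr(Factor | Patient) * Pr(Voxel | Factor)
    return mixed


# Two hypothetical patients expressing the same factors to different degrees.
patient_a = {"temporal": 0.9, "subcortical": 0.1, "cortical": 0.0}
patient_b = {"temporal": 0.6, "subcortical": 0.4, "cortical": 0.0}
profile_a = expected_atrophy(patient_a, pr_voxel_given_factor)
profile_b = expected_atrophy(patient_b, pr_voxel_given_factor)
```

In this toy setting, the second patient's profile shifts weight from the first voxel toward the second, reflecting the larger subcortical loading, while both profiles remain valid probability distributions over voxels.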

An important model parameter is the number of latent atrophy factors *K*. Therefore, we first determined how factor estimation changed from *K* = 2 to 10. Visual inspection of the spatial distribution of each atrophy factor suggested that factor estimates from *K* = 2 to 10 were organized in a hierarchical fashion (Fig. 2 and *SI Appendix*, Fig. S1). For instance, the two-factor model revealed one factor associated with atrophy in temporal and subcortical regions (“temporal+subcortical”) (Fig. 2*A**1*) and another factor associated with atrophy throughout cortex (“cortical”) (Fig. 2*A**2*). The three-factor model resulted in a similar cortical factor (Fig. 2*B**3* and *SI Appendix*, Table S1*C*), whereas the temporal+subcortical factor split into a “temporal” factor associated with extensive atrophy in the medial temporal lobe (Fig. 2*B**1* and *SI Appendix*, Table S1*A*) and a “subcortical” factor associated with atrophy in the cerebellum, striatum, and thalamus (Fig. 2*B**2* and *SI Appendix*, Table S1*B*). Likewise, the four-factor model resulted in the cortical factor splitting into “frontal cortical” and “posterior cortical” factors, whereas the temporal and subcortical factors remained the same (Fig. 2*C*). Sagittal and axial slices of these probabilistic atrophy maps are available in *SI Appendix*, Fig. S1.

To quantify the hierarchical phenomenon, we used an exhaustive search to assess the possibility that two unknown factors in the (*K* + 1)-factor model were subdivisions of an unknown factor in the *K*-factor model (whereas the other factors remained the same). The exhaustive search yielded a hypothesized factor hierarchy with associated correlation values quantifying the subdivision quality (*SI Appendix*, *SI Methods*). The high correlation values (*SI Appendix*, Fig. S2) confirmed that additional factors emerged as subdivisions of lower-order factors, corresponding to a nested hierarchy of atrophy factors.
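One way such an exhaustive search could be organized is sketched below. The exact procedure is described in the *SI Appendix* and is not reproduced here, so the scoring rule is an assumption: a candidate split is scored by correlating a parent map with the average of two child maps, while the remaining parents are matched one-to-one with the remaining children. The names `pearson`, `split_quality`, and `best_split` and the toy maps are hypothetical.

```python
from itertools import combinations, permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length maps (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def split_quality(parents, children, p, i, j):
    """Mean correlation if parent p splits into children i and j and every
    other parent is matched one-to-one with a remaining child (assumed score)."""
    merged = [(a + b) / 2 for a, b in zip(children[i], children[j])]
    split_r = pearson(parents[p], merged)
    rest_parents = [m for k, m in enumerate(parents) if k != p]
    rest_children = [m for k, m in enumerate(children) if k not in (i, j)]
    best_rest = max(
        sum(pearson(a, b) for a, b in zip(rest_parents, perm))
        for perm in permutations(rest_children)
    )
    return (split_r + best_rest) / len(parents)


def best_split(parents, children):
    """Try every (parent, child pair); return (score, parent, (child_i, child_j))."""
    best = None
    for p in range(len(parents)):
        for i, j in combinations(range(len(children)), 2):
            score = split_quality(parents, children, p, i, j)
            if best is None or score > best[0]:
                best = (score, p, (i, j))
    return best


# Toy maps over 4 voxels: a two-factor model and a three-factor model.
two_factor = [[0.4, 0.4, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]]
three_factor = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]]
result = best_split(two_factor, three_factor)
```

On these toy maps the search selects the first two-factor map as the parent of the first two three-factor maps. The permutation step scales factorially with *K*, which is acceptable only for small numbers of factors such as those considered here.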

This nested hierarchy suggested that specification of different numbers of estimated factors might yield distinct insights into AD. In the remainder of this paper, we highlighted the results of the three-factor model (Fig. 2*B*), because the emergence of the temporal and cortical factors was consistent with the “limbic-predominant” and “hippocampal-sparing” pathologically defined AD subtypes previously reported (3, 7). We additionally repeated analyses for two- and four-factor models, which yielded behavioral insights consistent with the three-factor model. These additional results are reported in *SI Appendix*, Figs. S6 and S8.

To explore the influence of atrophy factors in early AD, probabilistic atrophy maps Pr(Voxel | Factor) estimated from the AD dementia patients were used to infer factor compositions Pr(Factor | Participant) of 190 Aβ+ nondemented participants using the standard variational expectation–maximization (VEM) algorithm (10).
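The idea of inferring a new participant's composition while holding the factor maps fixed can be sketched with a simplified stand-in for VEM. The code below is not the variational algorithm used in the study: it is plain expectation-maximization for the maximum-likelihood mixture weights, with no Dirichlet prior, and it assumes the participant's atrophy has already been converted to nonnegative per-voxel counts. The function name `infer_composition`, the toy maps, and the counts are hypothetical.

```python
def infer_composition(counts, maps, n_iter=200):
    """EM estimate of Pr(Factor | Participant) with Pr(Voxel | Factor) held fixed.

    counts: nonnegative atrophy count per voxel (assumed input format).
    maps:   list of factor maps, each a list of Pr(Voxel | Factor).
    """
    k = len(maps)
    theta = [1.0 / k] * k  # start from a uniform composition
    total = sum(counts)
    for _ in range(n_iter):
        new_theta = [0.0] * k
        for v, c in enumerate(counts):
            if c == 0:
                continue
            # E-step: responsibility of each factor for atrophy at voxel v
            joint = [theta[f] * maps[f][v] for f in range(k)]
            norm = sum(joint)
            for f in range(k):
                new_theta[f] += c * joint[f] / norm
        # M-step: renormalize expected counts into a composition
        theta = [t / total for t in new_theta]
    return theta


# Fixed toy maps (temporal, subcortical, cortical) over 4 voxels.
fixed_maps = [
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.40, 0.40],
]
# Counts consistent with a 50% temporal, 40% subcortical, 10% cortical mixture.
toy_counts = [400, 340, 130, 130]
composition = infer_composition(toy_counts, fixed_maps)
```

Because the factor maps are not updated, the estimate depends only on the new participant's data, which is what allows factors learned in AD dementia patients to be applied to independent nondemented samples.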

### Step II. Examining Factor Robustness and Characteristics of Factor Compositions.

Among the 188 AD dementia patients, 100 had their cerebrospinal fluid (CSF) amyloid data available; 91 of 100 patients were Aβ+ (CSF amyloid concentration <192 pg/mL) (23). We performed LDA on the subset of Aβ+ AD dementia patients (and Aβ+ MCI participants) (*SI Appendix*, *SI Results*) and compared atrophy patterns of the resulting factors with those derived using the larger sample (*SI Appendix*, Fig. S3). Atrophy factors were similar across these methods, with an average correlation across all pairwise comparisons of *r* = 0.89. Given this similarity and to improve our estimates of the atrophy factors, we elected to use the atrophy factors derived from the larger sample of 188 AD dementia patients for subsequent analyses. Furthermore, resulting atrophy patterns were consistent between FreeSurfer (24) and FSL-VBM, suggesting that the atrophy factors were robust to variations in image preprocessing software (*SI Appendix*, *SI Results* and *SI Methods*).

To determine whether expression of atrophy factors remained stable over time, we examined the subset of the Alzheimer’s Disease Neuroimaging Initiative 1 (ADNI 1) participants who had a two-year follow-up scan available (*n* = 560 of 810). We were specifically interested in whether atrophy factors reflected different disease stages rather than different atrophy subtypes (for instance, high expression of the temporal factor may lessen over time with greater expression of the cortical factor). Therefore, we compared factor compositions after two years with baseline compositions. The factor probabilities were positively correlated and highly consistent (*r* > 0.85 across all three factors; Fig. 3) (*SI Appendix*, Fig. S4 shows results by diagnostic group with additional amyloid information), suggesting that these factors do not merely reflect a sequence of atrophy patterns.

Examination of atrophy factor compositions among AD dementia patients revealed that the majority expressed multiple latent atrophy factors rather than predominantly expressing a single atrophy factor (Fig. 4). Examination of factor compositions of 190 Aβ+ nondemented participants revealed a similar pattern, such that the majority of participants expressed multiple atrophy factors (*SI Appendix*, Fig. S5*A*). Factor compositions for the two- and four-factor models also suggest that most participants expressed multiple atrophy factors (*SI Appendix*, Fig. S5 *B* and *C*).

To understand the association between atrophy factors and demographic variables, general linear model (GLM; for continuous variables) and logistic regression (for binary variables) were conducted in 188 AD dementia patients (*SI Appendix*, Table S2). Briefly, the response variable was the variable of interest (e.g., age at AD onset), and the explanatory variables consisted of two columns encoding participants’ loading on the cortical and subcortical factors. The temporal factor was implicitly modeled, because the factor probabilities summed to one (*Materials and Methods*).
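The implicit coding of the temporal factor can be written out explicitly. In the sketch below, the symbols are introduced here for illustration only: $y_i$ is the response for participant $i$, and $t_i$, $c_i$, and $s_i$ are the temporal, cortical, and subcortical factor probabilities. Any additional covariates used in the actual analyses are omitted.

$$
\begin{aligned}
y_i &= \beta_0 + \beta_C\, c_i + \beta_S\, s_i + \varepsilon_i, \qquad t_i + c_i + s_i = 1 \\
    &= \beta_0\, t_i + (\beta_0 + \beta_C)\, c_i + (\beta_0 + \beta_S)\, s_i + \varepsilon_i
\end{aligned}
$$

Under this parameterization, $\beta_0$ is the expected response for a participant expressing only the temporal factor, $\beta_C$ and $\beta_S$ are the cortical-versus-temporal and subcortical-versus-temporal differences, and $\beta_C - \beta_S$ is the cortical-versus-subcortical difference.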

There were no significant differences in years from AD onset, education, sex, or apolipoprotein E (APOE) ε4 loadings across the three factors. Importantly, amyloid level was not significantly different across factors. The cortical factor was associated with significantly younger baseline age than the temporal factor (*P* = 1e−5) and subcortical factor (*P* = 2e−6) as well as younger age at AD onset than the temporal factor (*P* = 3e−4) and subcortical factor (*P* = 7e−6). In addition, the subcortical factor was associated with higher APOE ε2 loading than the temporal factor (*P* = 0.01) and cortical factor (*P* = 0.04), but these associations were not significant when corrected for multiple comparisons.

Similar analyses were conducted for the Aβ+ MCI and CN groups. The only significant association was that, among Aβ+ MCI participants, the cortical factor was associated with younger age at baseline compared with the temporal factor (*P* = 0.05) and subcortical factor (*P* = 0.02). However, this association did not survive after correcting for multiple comparisons.

### Step III. Examining Associations Between Atrophy Factors and Cognition.

We first examined diagnostic group differences in memory (ADNI-Mem) (25) and executive function (ADNI-EF) (26) without considering factor compositions. As expected, cross-sectional memory was worse for AD dementia patients (mean = −0.84) compared with Aβ+ MCI participants (mean = −0.21; *t* test *P* = 5e−23). Aβ+ MCI participants had worse memory than Aβ+ CN participants (mean = 0.93; *t* test *P* = 2e−26). Likewise, cross-sectional executive function was worse for AD dementia patients (mean = −0.92) compared with Aβ+ MCI participants (mean = −0.17; *t* test *P* = 3e−16). Aβ+ MCI participants had worse executive function than Aβ+ CN participants (mean = 0.50; *t* test *P* = 1e−7).

We then examined a GLM predicting cross-sectional memory and executive function, which included both diagnosis and factor compositions as well as their interactions as predictors (*Materials and Methods* shows model details; Fig. 5 *A**1* and *B**1*). This analysis revealed that all factors were associated with baseline memory, and these associations continued to worsen across the disease spectrum (Fig. 5*A**1*). For cross-sectional executive function, there was only an association with the cortical factor, and this association also worsened across the disease spectrum (Fig. 5*B**1*).

Next, we examined a linear mixed effects (LME) model predicting longitudinal change in memory and executive function (Fig. 5 *A**2* and *B**2*). The LME model provides significantly improved exploitation of longitudinal measurements (27) by accounting for both intraindividual measurement correlations and interindividual variability. The model setup was the same as the GLM above, except that time and its interactions with diagnosis and factor compositions were included as predictors (*SI Appendix*, *SI Methods*).

This analysis revealed that the temporal and subcortical factors exhibited memory decline that began in CN and maintained similar memory decline rates in MCI and AD (Fig. 5*A**2*). In contrast, the cortical factor was not associated with memory decline in CN but showed faster decline in MCI compared with CN and AD compared with MCI (Fig. 5*A**2*). The cortical factor was not associated with executive function decline in CN but showed faster longitudinal executive function decline in MCI compared with CN and AD compared with MCI (Fig. 5*B**2*).

In our final set of analyses examining cognition, we directly compared the three factors. The GLM and LME model were exactly the same as the previous sections, but we instead focused on the contrasts between factors.

For cross-sectional memory, the temporal factor was associated with worse performance than the subcortical (*P* = 3e−6) and cortical (*P* = 7e−3) factors among AD dementia patients (Fig. 6*A*). Similar results were found for Aβ+ MCI participants (*SI Appendix*, Fig. S7*A**1*). Among Aβ+ CN participants, there was no memory difference across the atrophy factors (*SI Appendix*, Fig. S7*A**1*). For cross-sectional executive function, the cortical factor was associated with worse performance than the temporal (*P* = 0.01) and subcortical (*P* = 1e−5) factors among AD dementia patients (Fig. 6*B*). There was no executive function difference across the factors among Aβ+ CN and MCI participants (*SI Appendix*, Fig. S7*B**1*).

For longitudinal change in memory (Fig. 7*A*), the cortical factor was associated with faster longitudinal memory decline than the temporal (*P* = 1e−4) and subcortical (*P* = 4e−6) factors among AD dementia patients. Among Aβ+ MCI participants, the subcortical factor was associated with slower decline rate than the cortical (*P* = 8e−4) and temporal (*P* = 4e−3) factors. Finally, among Aβ+ CN participants, the cortical factor showed slower memory decline than the temporal (*P* = 1e−4) and subcortical (*P* = 3e−3) factors.

For longitudinal change in executive function (Fig. 7*B*), the cortical factor was associated with faster executive function decline than temporal (*P* = 2e−3) and subcortical (*P* = 2e−4) factors among AD dementia patients. Among Aβ+ MCI participants, the subcortical factor had slower decline than the cortical (*P* = 8e−9) and temporal (*P* = 1e−8) factors. There was no executive function decline difference across the factors among Aβ+ CN participants.

All cognitive analyses were repeated using the two- and four-factor LDA atrophy factors (*SI Appendix*, Figs. S6 and S8). The results were consistent with the three-factor model (*SI Appendix*, *SI Results*). In addition, associations between the Mini-Mental State Examination (MMSE) and the three atrophy factors are reported in *SI Appendix*, Fig. S7*C*.

## Discussion

In this study, we identified distinct atrophy factors within AD dementia patients using Bayesian LDA modeling of MRI GM density maps. This approach estimated the factor composition of multiple atrophy factors for each participant rather than assuming membership to a single atrophy subtype (Fig. 1). Our analysis yielded a nested hierarchy of atrophy factors (Fig. 2), which corresponded to distinct trajectories of memory and executive function decline across the disease spectrum (Fig. 8). Overall, these results provide evidence that heterogeneity in patterns of atrophy exists in late-onset AD and that these atrophy patterns are associated with distinct cognitive trajectories.

### Atrophy Patterns in AD Dementia.

Our model revealed a hierarchy of atrophy patterns within AD dementia patients (Fig. 2). As the number of estimated atrophy factors was increased from *K* to *K* + 1, one atrophy pattern fractionated into two atrophy patterns, whereas the remaining patterns remained unchanged (*SI Appendix*, Fig. S2). It is noteworthy that the atrophy patterns extracted using *K* = 3 were similar to results from other groups investigating AD subtypes (7, 15, 16), although notable differences did emerge.

Specifically, our three-factor model revealed a temporal factor associated with atrophy in the temporal cortex, hippocampus, and amygdala; a cortical factor associated with atrophy in the frontal, parietal, lateral temporal, and lateral occipital cerebral cortices; and a subcortical factor associated with atrophy in the cerebellum, striatum, and thalamus (Fig. 2*B*). Our temporal factor was similar to the previously described limbic-predominant subtype, whereas the cortical factor was similar to the hippocampal-sparing subtype (3, 7). More specifically, previous pathologically defined subtypes were identified based on the ratio of NFT burden in hippocampal subregions versus association cortex, resulting in a limbic-predominant subtype and a hippocampal-sparing subtype. Follow-up VBM analyses (7) suggested GM loss in the temporoparietal cortex, frontal cortex, insula, and precuneus in the hippocampal-sparing subtype, consistent with our cortical atrophy factor. Conversely, Whitwell et al. (7) identified predominant atrophy in the medial temporal lobe of the limbic-predominant subtype, consistent with our temporal atrophy factor.

A benefit of our approach is that the nested hierarchy of atrophy patterns was not mandated by our model but emerged from the data, in contrast with previous approaches where a hierarchy was imposed (15). Specifically, Noh et al. (15) identified three subtypes: a “medial temporal” subtype, a “parietal frontal-dominant” subtype, and a “diffuse” subtype. Our temporal atrophy factor might correspond to their medial temporal subtype, whereas our cortical factor might correspond to their parietal frontal-dominant subtype, although direct comparison was difficult, because their analyses were restricted to the cerebral cortex.

Our model suggests that atrophy patterns in AD patients follow a nested hierarchy structure. Given the nested hierarchy of cognitive functions revealed by a recent large-scale metaanalysis of 10,000 brain imaging experiments (12) as well as brain network analyses (28–31), one might speculate that the nested hierarchy of atrophy factors arises from a natural hierarchy of brain functions and networks.

### Atrophy Factors Reflect Subtypes Rather than Disease Stages.

A potential pitfall of AD subtype analyses (32) is that the observed heterogeneity might correspond to different disease stages (stage hypothesis) rather than heterogeneity in disease expression (subtype hypothesis). There are various reasons why the atrophy factors discussed in this manuscript likely correspond to subtypes rather than disease stages (33). First, there was not a single factor associated with the worst memory and executive function. Instead, decline trajectories of the temporal and cortical factors varied in their associations with the two cognitive domains (Fig. 8). Furthermore, analysis of follow-up MRI scans revealed that factor compositions were stable over time (Fig. 3), suggesting that individuals were not progressing from one factor to another [e.g., from temporal factor to cortical factor as predicted under the Braak staging scheme (34)].

### Factor-Dependent Characteristics.

There were significant differences across the atrophy factors in baseline age (*P* = 8e−7) and age at AD onset (*P* = 1e−5). Baseline age is dependent on study design, and therefore, drawing meaningful comparisons with the literature is difficult. Nevertheless, the cortical factor was associated with younger age at AD onset, consistent with previous studies describing subtypes with predominant cortical atrophy (3, 8, 15). Importantly, years from AD onset to baseline did not differ across the three latent factors (*P* = 0.29) (*SI Appendix*, Table S2), providing additional evidence that these factors were not simply disease stages. The subcortical factor was associated with a higher prevalence of the APOE ε2 allele (*P* = 0.03; not significant when corrected for multiple comparisons). The protective effects of the ε2 allele (35) might potentially contribute to the observation that the subcortical factor was associated with the mildest decline in both memory and executive function across all stages (Fig. 8).

Importantly, a lack of association between each factor and amyloid status suggests that atrophy factors do not merely reflect patterns associated with non-AD dementia patients who may have been “misdiagnosed” as AD dementia within the ADNI dataset (36). However, although repeating our factor estimation with Aβ+ AD dementia patients revealed consistent atrophy patterns with the model using all AD patients, we are not able to determine whether atrophy patterns are a result of Aβ pathology or precede Aβ pathology. For instance, these atrophy patterns may emerge through processes not directly linked to Aβ pathology but instead, converge with AD pathology to influence disease progression. It is possible that factors, such as comorbid TDP-43 pathology and genetics, as well as developmental differences contribute to this heterogeneity. Along these lines, recent work suggests that different pathologies have distinct impacts on cognitive trajectories (37). Interestingly, TDP-43 was shown to have a very early impact on cognitive trajectories compared with other pathologies, such as hippocampal sclerosis and Lewy bodies. Given that TDP-43 is known to impact the medial temporal lobe (6), it is possible that the temporal atrophy factor is influenced by the involvement of this pathology (because the temporal factor shows an early impact on memory among Aβ+ CN in our study).

A fundamental question that remains is why the expression of these atrophy patterns varies across individuals, especially because the spatial distribution of Aβ tends to be very diffuse throughout cortex. A similar dissociation is observed among AD patients with atypical clinical presentations, such that, although the spatial pattern of Aβ is diffuse, the underlying pattern of NFTs and GM atrophy aligns with clinical symptoms (4). Future work should investigate the time course of these atrophy patterns using longitudinal MRI as well as longitudinal assessment of Aβ and also investigate the prevalence of atrophy patterns among Aβ− participants to understand whether these patterns are specific for AD or merely converge with AD processes to influence disease progression.

### Distinct Memory and Executive Function Decline Trajectories.

The behavioral results (Figs. 5, 6, and 7) are summarized in Fig. 8. Overall, we found that the associations between atrophy factors and cognition varied by domain as well as time course in the disease. Specifically, the temporal factor showed the greatest association with memory, a relationship that emerged early among Aβ+ CN participants and remained consistent in later disease stages. Conversely, the cortical factor was associated with both memory and executive function but exerted greater impact later in the disease among Aβ+ MCI participants and AD patients.

Overall, the trajectories (Fig. 8) revealed several salient points. First, memory decline in the context of late-onset AD occurred earlier than decline in executive function, which is in line with previous studies (38). Second, divergence of memory trajectories among atrophy factors appeared as early as the asymptomatic (CN) stage of the disease, whereas divergence of executive function trajectories was not detectable until the MCI stage (Fig. 8). Specifically, the temporal and subcortical factors showed faster memory decline than the cortical factor among Aβ+ CN participants, and by MCI, the temporal factor was already associated with worse memory at baseline than the subcortical factor. In contrast, there was no difference in executive function decline rates among Aβ+ CN participants or cross-sectional difference among Aβ+ MCI participants. Interestingly, AD dementia patients expressing the cortical factor exhibited the fastest decline rates in both executive function and memory. Third, the subcortical factor (blue curves in Fig. 8) was the mildest factor in terms of both memory and executive function deterioration. In both Aβ+ MCI and AD dementia participants, the subcortical factor was associated with the best memory and executive function scores as well as the slowest decline rates.

### Correspondence and Extensions of AD Heterogeneity Literature.

Our results were consistent with the preponderance of literature on heterogeneity among AD dementia patients. For example, our atrophy factors show overlap with the pathologically defined hippocampal-sparing and limbic-predominant subtypes (7) as well as the subtypes described by Noh et al. (15). Our analyses suggested that the cortical factor was associated with faster decline in both memory and executive function than the temporal factor at the dementia stage, which is consistent with the hippocampal-sparing subtype exhibiting faster MMSE decline than the limbic-predominant subtype among AD dementia patients (3). Similarly, our finding that the cortical factor was associated with the most rapid memory and executive function decline among AD dementia patients was also consistent with the work by Byun et al. (16). Among AD dementia patients, the cortical factor was associated with the worst baseline executive function, whereas the temporal factor was associated with the worst baseline memory. This result is consistent with previous work showing that thinning of frontoparietal cortical regions was associated with nonamnestic presentations and dysexecutive phenotypes (9) and that the “cortical atrophy-only” subtype had worse baseline executive function than the “hippocampal atrophy-only” subtype (16). Thus, our data-driven approach provides additional evidence that distinct atrophy patterns among AD patients impact different cognitive domains.

In addition to characterizing heterogeneity among AD dementia patients, we extended our approach to participants who were presumably in very early stages of AD development (i.e., Aβ+ but without the clinical symptoms of dementia) (19, 20). By examining earlier stages, we found that the temporal factor showed the greatest association with memory decline among Aβ+ CN participants but that the cortical factor was a stronger predictor of memory decline among AD dementia patients (*SI Appendix*, Fig. S7 *A2*). Likewise, although the cortical factor was not associated with either cognitive domain among Aβ+ CN participants, this factor was associated with executive function decline in Aβ+ MCI participants and AD patients (Figs. 7*B* and 8). The impact of these atrophy factors at different points along the clinical spectrum has important implications for measuring decline and understanding the progression of AD. Furthermore, consideration of this heterogeneity may improve the ability to identify individuals most at risk for cognitive decline compared with approaches that measure atrophy using the same regional metric across all participants.

### Mixed Membership Modeling and Precision Medicine.

One key advantage of our modeling strategy is that individuals can express multiple latent atrophy factors (i.e., mixed membership) rather than being assigned to a single subtype. Therefore, patients classified by Murray et al. (3) as hippocampal-sparing (or limbic-predominant) might correspond to the few patients in our study who predominantly expressed the cortical (or temporal) atrophy factor. Murray et al. (3) also defined a third group of patients who were considered “typical” by virtue of being neither hippocampal-sparing nor limbic-predominant. These typical patients might correspond to the majority of AD dementia patients in our study who expressed multiple latent factors to similar degrees.

The use of mixed membership modeling has implications for estimation of factor-dependent atrophy maps and cognitive decline. For example, consider a hypothetical patient who expressed 50% subcortical, 40% temporal, and 10% cortical factors. In our analyses, 50%, 40%, and 10% of the patient’s atrophy map would contribute to the estimation of the probabilistic atrophy maps of the subcortical, temporal, and cortical factors, respectively. This method extends previous approaches (7, 15, 16) that classified each patient into a single subtype and then performed group comparisons to obtain differential atrophy patterns, despite the fact that each patient might express multiple latent atrophy factors. Thus, more information about each participant is retained by treating factor compositions continuously rather than assigning participants to a single group.

Similarly, 50%, 40%, and 10% of the hypothetical patient’s cognitive decline rate would contribute to our estimation of the memory decline rates associated with the subcortical, temporal, and cortical factors, respectively. Indeed, when such a patient was simply assigned to a single factor based on the highest probability (i.e., assigned to a pure subtype), the estimated differences in cognitive decline rates across subtypes were found to be substantially weaker. The reason should be clear when considering the hypothetical patient. Because the patient expressed 50% subcortical, 40% temporal, and 10% cortical factors, one would expect the memory decline rate to be faster than a pure subcortical subtype (and slower than a pure temporal factor). By assigning this patient to be a pure subcortical subtype, one would overestimate the decline rate of the subcortical subtype.

Although we observe some participants with extreme probabilities of a single atrophy factor, these participants are infrequent. Instead, the majority of the participants expressed intermediate probabilities across multiple latent atrophy factors. We can potentially use the factor decomposition to predict the memory and executive function decline trajectories of individual participants. For example, we might predict the hypothetical patient who expressed 50% subcortical, 40% temporal, and 10% cortical factors to have decline trajectories corresponding to 50% times the blue curve plus 40% times the green curve plus 10% times the red curve from Fig. 8. Therefore, the factor composition can be thought of as an individualized subtype diagnosis of the participant, representing a small but crucial step toward precision medicine.
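As a rough illustration of this weighted combination, the following Python sketch mixes per-factor trajectories according to a participant's factor composition. The curve values are made-up placeholders rather than values read from Fig. 8, and `mix_trajectories` is a hypothetical helper name that does not come from the released code.

```python
def mix_trajectories(composition, factor_curves):
    """Weight per-factor trajectories by a participant's factor composition."""
    if abs(sum(composition.values()) - 1.0) > 1e-9:
        raise ValueError("factor probabilities must sum to 1")
    n_points = len(next(iter(factor_curves.values())))
    return [
        sum(composition[f] * factor_curves[f][t] for f in composition)
        for t in range(n_points)
    ]


# Hypothetical memory scores at three time points for each "pure" factor.
curves = {
    "subcortical": [0.0, -0.1, -0.2],
    "temporal": [-0.5, -0.8, -1.1],
    "cortical": [-0.3, -0.7, -1.3],
}

# The hypothetical patient from the text: 50% subcortical, 40% temporal, 10% cortical.
patient = {"subcortical": 0.5, "temporal": 0.4, "cortical": 0.1}
predicted = mix_trajectories(patient, curves)
```

Under these placeholder curves, the predicted trajectory lies between the pure subcortical and pure temporal curves, as the text anticipates.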

### Limitations.

Our study has multiple limitations. First, direct comparisons with other subtype studies were difficult because of methodological differences, including the utilization of mixed membership modeling and participant selection. Another limitation is the arbitrary choice of the number of latent atrophy factors to estimate using LDA. Given consistency with previous studies and a limited sample size, we focused on *K* = 2–4 factors, but atrophy factors beyond *K* = 4 may be biologically relevant.

## Conclusion

By using a Bayesian modeling framework, our study revealed three latent AD atrophy factors with distinct memory and executive function trajectories. Across the clinical spectrum, the cortical atrophy factor was associated with the worst executive function performance, whereas the temporal atrophy factor was associated with the worst memory performance. The subcortical atrophy factor has not been discussed in the literature and was associated with the slowest memory and executive function decline. Our approach allowed each individual to express multiple atrophy factors to various degrees rather than assigning the individual to a single subtype. Therefore, each participant exhibited his or her own unique factor composition, which can potentially be exploited to predict individual-specific cognitive decline trajectories, with potential implications for prevention and monitoring disease progression. Finally, our methodological framework is general and can be used to discover subtypes in other brain disorders. Factor compositions of ADNI participants and code used in this manuscript are publicly available (https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/disorder_subtypes/Zhang2016_ADFactors).

## Materials and Methods

### Overview.

Voxelwise atrophy of 188 AD dementia patients was derived from their structural MRI data (22, 39). Subsequent analyses proceeded in three steps. In step I, a Bayesian model (Fig. 1) (10) was applied to estimate the probabilistic atrophy maps of latent factors Pr(Voxel | Factor) and the factor composition of each patient Pr(Factor | Patient). The probabilistic atrophy maps were then used to infer the factor compositions of 43 Aβ+ CN participants and 147 Aβ+ MCI participants. In step II, stability of the factor decomposition over a period of two years was analyzed. In addition, characteristics (demographics, age at AD onset, years from AD onset to baseline, amyloid burden, and APOE genotype) of all participants were compared across the factors. Finally, in step III, we analyzed the atrophy factors’ relationships with cross-sectional baseline and longitudinal decline of memory and executive function. Each step is described in detail below.

### Data.

Data used in this study were obtained from the ADNI database (adni.loni.usc.edu), which was launched in 2003 as a public–private partnership and led by Principal Investigator Michael W. Weiner. The primary goal of the ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD (up-to-date information is at www.adni-info.org/). Institutional review boards approved study procedures across participating institutions (the complete list of the institutions is in *SI Appendix*). Written informed consent was obtained from all participants.

This study considered the structural MRI (T1-weighted, 1.5 T) of 810 participants enrolled in the ADNI 1, comprising 188 AD dementia (at baseline; same hereinafter) patients, 394 MCI participants, and 228 CN participants. Of the 188 AD dementia patients, 100 had their CSF amyloid data available, and 91 of 100 were Aβ+. AD onset was, on average, 3.6 years (SD = 2.5, minimum = 0, maximum = 13) before baseline. Of 394 MCI participants, 197 had their CSF amyloid data available, and 147 of 197 were Aβ+. Of 228 CN participants, 114 had their CSF amyloid data available, and 43 of 114 were Aβ+. The Aβ+ CN elderly participants and the Aβ+ MCI participants are referred to as the Aβ+ nondemented group (*n* = 190) in this study.

According to the ADNI protocol, AD dementia patients had their cognition examined at baseline and in months 6, 12, and 24. In addition, CN participants were examined in month 36 and annually afterward. MCI participants underwent an additional examination in month 18. Although this study only considered participants enrolled in the ADNI 1, to increase statistical power, their neuropsychological scores (ADNI-Mem, ADNI-EF, and MMSE) from ADNI Grand Opportunity (GO) and ADNI 2 were also included in the longitudinal analyses of cognitive decline.

### Voxel-Based Morphometry.

Structural MRI data of all 810 participants were analyzed with FSL-VBM (fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSLVBM) (22), a VBM protocol (40) carried out with FSL tools (41). First, structural images were brain-extracted and GM-segmented before being registered to the Montreal Neurological Institute (MNI152) standard space using affine registration. Second, the affine-registered images were flipped about the *x* axis and averaged to create a left–right symmetric, study-specific affine GM template. Third, the GM images were nonlinearly registered to the affine GM template, and again, they were flipped and averaged to create a final left–right symmetric, study-specific nonlinear GM template in MNI152 space. Fourth, all native GM images were nonlinearly registered to this final template and modulated to account for local expansion (or contraction) because of the nonlinear component of the spatial transformation. The resulting GM density images were smoothed with a Gaussian kernel of 10-mm FWHM, consistent with standard VBM practices (42, 43). Finally, we applied log_{10} to the smoothed GM density images and regressed out possible effects of age, sex, and intracranial volume (ICV) with a GLM estimated from only the 228 CN participants.
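The final regression step can be sketched for a single voxel as follows. This is a minimal pure-Python illustration, not the FSL-based pipeline: it assumes that the GLM includes an intercept and that the full CN-derived fit (intercept included) is subtracted from every participant, neither of which is specified above. All function names and the toy numbers are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def residualize(values, covariates, is_cn):
    """Fit intercept + covariates on CN rows only, then subtract the fit from everyone."""
    design = [[1.0] + list(c) for c in covariates]
    cn_design = [d for d, flag in zip(design, is_cn) if flag]
    cn_values = [v for v, flag in zip(values, is_cn) if flag]
    beta = fit_ols(cn_design, cn_values)
    return [v - sum(b * d for b, d in zip(beta, row)) for v, row in zip(values, design)]


# Toy data: one covariate (age), three CN participants and one patient.
ages = [(60.0,), (70.0,), (80.0,), (75.0,)]
is_cn = [True, True, True, False]
log_gm = [8.0, 9.0, 10.0, 8.5]
residuals = residualize(log_gm, ages, is_cn)
```

Estimating the coefficients from CN participants alone keeps disease-related atrophy from being absorbed into the age, sex, and ICV effects, so the patient in the toy data retains a negative residual.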

### Quality Control for Voxel-Based Morphometry.

The outputs of each VBM step were visually checked by authors X.Z. and N.S. Details are found in *SI Appendix*, *SI Methods*.

### Bayesian Model.

We sought a mathematical model that captured the premise that each AD patient expresses one or more latent atrophy factors, each of which is associated with distinct but possibly overlapping atrophy patterns (Fig. 1). Among many possible models, the LDA model (10) is probably the simplest and was applied to the ADNI data.

The LDA model was originally developed to automatically discover latent topics in a collection of text documents. The model assumes that each document is an unordered collection of words associated with a subset of *K* latent topics. Each topic is represented by a probability distribution over a dictionary of words. Given a collection of documents, there exist algorithms (10) to estimate the probability of a dictionary word given a topic Pr(Word | Topic) and the probability that a topic is associated with a particular document Pr(Topic | Document). The LDA model is useful, because it allows a document to be associated with multiple topics (which can be shared across documents) and each topic to be associated with multiple words (which can be shared across topics).

To map the LDA model to the ADNI data, one can think of AD patients as text documents, atrophy factors as topics, and MNI152 voxels as dictionary words. Correspondingly, each patient expresses one or more latent atrophy factors to different extents [Pr(Factor | Patient)], and each factor is associated with atrophy at multiple voxels to different extents [Pr(Voxel | Factor)].

LDA assumes that a document is summarized by the number of times that a dictionary word appears in the document. Because dictionary words correspond to MNI voxels, the continuous log-transformed GM density images (in the previous section) were discretized, so that greater atrophy corresponded to larger word counts. More specifically, for each voxel of the log-transformed GM density images, *z* transformation (with respect to 228 CN participants) was performed for each of 810 participants. Therefore, a *z* score of <0 at a given voxel of a particular individual would imply above-average atrophy at the voxel relative to the CN participants. *z* Scores above zero were set to zero, equivalent to regarding the voxels as atrophy-free. Finally, the *z* scores were multiplied by −10 and rounded to the nearest integer, so that larger positive values (greater word count) indicated more severe atrophy.
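The discretization can be sketched for one voxel as below. This is an illustration under stated assumptions: the sample SD (`statistics.stdev`) stands in for whichever SD estimator was actually used, Python's built-in `round` resolves exact ties to the nearest even integer, and `atrophy_word_counts` is a hypothetical name.

```python
import statistics


def atrophy_word_counts(values, cn_values):
    """Convert one voxel's log-transformed GM densities into integer 'word counts'."""
    cn_mean = statistics.mean(cn_values)
    cn_sd = statistics.stdev(cn_values)  # assumption: sample SD of the CN group
    counts = []
    for v in values:
        z = (v - cn_mean) / cn_sd  # z score relative to CN participants
        z = min(z, 0.0)            # z > 0 is treated as atrophy-free
        counts.append(round(-10 * z))  # larger count = more severe atrophy
    return counts


# Toy example: CN reference values with mean 2 and SD 1.
cn_reference = [1.0, 2.0, 3.0]
counts = atrophy_word_counts([2.0, 3.5, 0.8, 1.0], cn_reference)
```

A participant 1.2 SDs below the CN mean therefore contributes a count of 12 at that voxel, whereas participants at or above the CN mean contribute 0.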

The LDA model assumes that the ordering of words within a document is exchangeable. In the context of our application, the corresponding assumption is that the ordering of atrophied voxels is exchangeable. Although word order in real documents is important, the ordering of atrophied regions (e.g., prefrontal vs. parietal) reported in an experiment is arbitrary and thus consistent with the assumption. Consequently, the LDA model appears particularly well-suited for applications in this context.

Given the discretized voxelwise atrophy of 188 AD dementia patients and the number of latent atrophy factors *K*, the VEM algorithm (www.cs.princeton.edu/∼blei/lda-c/) (10) was applied to estimate Pr(Factor | Patient) and Pr(Voxel | Factor). For each *K*, the algorithm was rerun with 40 different random initializations, and the solution with the highest likelihood (bound) was selected. The random initializations led to highly similar solutions, suggesting that 40 random initializations were sufficient for robust factor estimations.

The probabilistic atrophy maps Pr(Voxel | Factor) estimated from the AD dementia patients were used to infer factor compositions Pr(Factor | Participant) of 190 Aβ+ nondemented participants using the standard VEM algorithm (10).
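To convey the idea of inferring a factor composition while holding the atrophy maps fixed, the sketch below uses a plain maximum-likelihood EM update for the mixing weights. This is a deliberately simplified stand-in and not the VEM algorithm used in the study: it ignores the Dirichlet prior of LDA, and it assumes that every voxel with a nonzero count has nonzero probability under at least one factor. The function name is hypothetical.

```python
def infer_composition(counts, voxel_given_factor, n_iter=200):
    """Estimate Pr(Factor | Participant) for fixed Pr(Voxel | Factor) by EM."""
    k = len(voxel_given_factor)
    total = sum(counts)
    theta = [1.0 / k] * k  # start from a uniform composition
    for _ in range(n_iter):
        new_theta = [0.0] * k
        for v, n_v in enumerate(counts):
            if n_v == 0:
                continue
            # E step: responsibility of each factor for the atrophy at voxel v.
            joint = [theta[f] * voxel_given_factor[f][v] for f in range(k)]
            norm = sum(joint)
            for f in range(k):
                new_theta[f] += n_v * joint[f] / norm
        # M step: renormalize the expected counts into a probability vector.
        theta = [t / total for t in new_theta]
    return theta


# Toy maps: factor 0 covers voxels 0-1, factor 1 covers voxels 2-3.
maps = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
composition = infer_composition([3, 3, 2, 2], maps)
```

With nonoverlapping toy maps, the estimated composition is simply the share of atrophy counts falling within each factor's voxels.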

### Interpreting Pr(Voxel | Factor) and Pr(Factor | Patient).

For a given latent factor, Pr(Voxel | Factor) is a probability distribution over all of the GM voxels, which can be visualized as a probabilistic atrophy map overlaid on the FSL MNI152 template (each row of Fig. 2).

Pr(Factor | Patient) is a probability distribution over latent atrophy factors, representing the factor composition of the patient, and can be visualized as a dot inside a “factor triangle” (for *K* = 3 factors) with barycentric coordinates that equal Pr(Factor | Patient) as shown in Fig. 4 and *SI Appendix*, Fig. S5*A*. For example, Pr(Factor | Patient) = [0.7, 0.2, 0.1] implies that the patient expresses a pattern of brain atrophy caused by 70% temporal, 20% subcortical, and 10% cortical factors, respectively, and that the dot representing this patient falls closer to the “temporal corner” of the factor triangle. This approach contrasts with work in the literature that assigns each individual to a single subtype (3, 15, 16).
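The barycentric placement can be sketched as a weighted average of the triangle's corner coordinates. The corner positions below are an arbitrary choice made for illustration and need not match the layout of Fig. 4.

```python
import math

# Hypothetical corner placement for the factor triangle (equilateral, unit side).
CORNERS = {
    "temporal": (0.0, 0.0),
    "subcortical": (1.0, 0.0),
    "cortical": (0.5, math.sqrt(3) / 2),
}


def triangle_point(composition):
    """Map Pr(Factor | Patient) to 2D coordinates inside the factor triangle."""
    x = sum(p * CORNERS[f][0] for f, p in composition.items())
    y = sum(p * CORNERS[f][1] for f, p in composition.items())
    return x, y


# The example from the text: 70% temporal, 20% subcortical, 10% cortical.
point = triangle_point({"temporal": 0.7, "subcortical": 0.2, "cortical": 0.1})
```

Under this layout, the example patient lands nearer to the temporal corner than to the other two corners, and a patient with a pure factor composition lands exactly on the corresponding corner.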

### Quantifying the Nested Hierarchy of Atrophy Factors.

An important model parameter is the number of latent factors *K*. Therefore, we determined how factor estimation changed from *K* = 2 to 10 factors. The detailed description is in *SI Appendix*, *SI Methods*.

### Top Anatomical Structures Associated with Each Factor.

This manuscript focuses on three atrophy factors. To automatically identify the GM anatomical structures most associated with each atrophy factor, the MNI152 template was first processed using FreeSurfer 4.5.0 (24). The FreeSurfer software automatically segmented the MNI152 template into multiple cortical (44, 45) and subcortical (44, 46) structures, such as the inferior parietal cortex and hippocampus. For each anatomical structure, we averaged Pr(Voxel | Factor) over all of its voxels. The structure was assigned to the factor with the largest average probability. For each factor, we tabulated the assigned brain structures and ranked them in the descending order of average probability. The results are in *SI Appendix*, Table S1.
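The assignment and ranking rule described above can be sketched as follows. The voxel indices, probabilities, and structure names are toy inputs, and `assign_structures` is a hypothetical helper rather than part of FreeSurfer or the released code.

```python
def assign_structures(voxel_given_factor, structure_voxels):
    """Assign each structure to the factor with the largest mean Pr(Voxel | Factor)."""
    assigned = {f: [] for f in voxel_given_factor}
    for name, voxels in structure_voxels.items():
        means = {
            f: sum(probs[v] for v in voxels) / len(voxels)
            for f, probs in voxel_given_factor.items()
        }
        best = max(means, key=means.get)
        assigned[best].append((name, means[best]))
    # Rank each factor's structures in descending order of average probability.
    for f in assigned:
        assigned[f].sort(key=lambda item: item[1], reverse=True)
    return assigned


# Toy probabilistic atrophy maps over five voxels.
toy_maps = {
    "temporal": [0.4, 0.3, 0.2, 0.1, 0.0],
    "cortical": [0.0, 0.1, 0.1, 0.3, 0.5],
}
toy_structures = {
    "hippocampus": [0, 1],
    "amygdala": [2],
    "inferior parietal": [3, 4],
}
ranking = assign_structures(toy_maps, toy_structures)
```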

### Cross-Pipeline Validation of Atrophy Patterns.

To ensure that the atrophy factors were robust to choice of VBM software (FSL), we performed post hoc analyses using FreeSurfer. Details are found in *SI Appendix*, *SI Results* and *SI Methods*.

### Atrophy Factor Stability.

To examine the atrophy factor stability during disease progression, we considered those of the 810 participants who had two-year follow-up scans available (*n* = 560). First, their baseline factor compositions Pr(Factor | Participant) were extracted using their baseline MRI data. Second, VBM was performed on the follow-up structural MRI data using the VBM template previously created with all 810 participants. Subsequent processing (e.g., *z* normalization) adopted parameters used in processing 810 baseline scans. Factor compositions were then inferred with the processed VBM results (same procedure as inferring factor compositions of Aβ+ CN and MCI participants). The factor stability was visualized with a scatter plot for each factor (Fig. 3 and *SI Appendix*, Fig. S4). Each participant is represented by a dot with an *x* coordinate that is the factor composition at baseline and a *y* coordinate that is the factor composition after two years. Therefore, if the factor estimation is stable over disease progression, one would expect a close-to-one correlation coefficient and a *y* = *x* linear fit.
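The two stability summaries mentioned above (the correlation coefficient and the linear fit) can be computed for one factor as in this minimal sketch. The probabilities are invented for illustration, and an ordinary least-squares line is assumed for the fit.

```python
import math


def stability_summary(baseline, followup):
    """Pearson correlation plus least-squares slope and intercept of followup on baseline."""
    n = len(baseline)
    mx = sum(baseline) / n
    my = sum(followup) / n
    sxx = sum((x - mx) ** 2 for x in baseline)
    syy = sum((y - my) ** 2 for y in followup)
    sxy = sum((x - mx) * (y - my) for x, y in zip(baseline, followup))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = my - slope * mx
    return r, slope, intercept


# Hypothetical probabilities of one factor at baseline and after two years.
baseline_prob = [0.1, 0.4, 0.7, 0.9]
followup_prob = [0.1, 0.4, 0.7, 0.9]
r, slope, intercept = stability_summary(baseline_prob, followup_prob)
```

Perfectly stable compositions give a correlation of one, a slope of one, and an intercept of zero, which corresponds to the *y* = *x* line described in the text.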

### Comparing Patient Characteristics by Atrophy Factor.

We explored how patient characteristics (baseline age, age at AD onset, years from onset to baseline, education, sex, amyloid, and APOE genotype) varied across the three latent factors (*SI Appendix*, Table S2) using GLM (and logistic regression for binary variables).

GLM was applied to baseline age, age at AD onset, years from onset to baseline, education, amyloid, and APOE: the characteristic of interest served as response *y*, and the subcortical factor probability *s* and cortical factor probability *c* were included as explanatory variables. Hence, the GLM was *y* = β_{0} + β_{s}·*s* + β_{c}·*c* + ε, where β indicates the regression coefficients, and ε is the residual. The temporal factor probability *t* was implicitly modeled, because *t* + *s* + *c* = 1. Intuitively, β_{0} reflected the response of the temporal factor, β_{s} reflected the response difference between the subcortical and temporal factors, and β_{c} reflected the difference between the cortical and temporal factors.

Statistical tests of whether the characteristic *y* varied across factors involved null hypotheses of the form *H*β = 0, where β = [β_{0}, β_{s}, β_{c}]^{T}, and *H* is the linear contrast (47). We first performed a statistical test of overall differences across all factors with *H* = [0, 1, 0; 0, 0, 1]. We then tested for differences between the factors. For example, *H* = [0, 1, −1] tested possible differences between the subcortical and cortical factors. *H* = [0, 1, 0] compared the subcortical and temporal factors. Similarly, *H* = [0, 0, 1] compared the cortical and temporal factors.

Because sex is a binary variable, logistic regression was applied. In this case, response *y* was sex (zero for male, and one for female), and explanatory variables consisted of the subcortical factor probability *s* and cortical factor probability *c*. Therefore, the regression model was log(µ/(1 − µ)) = β_{0} + β_{s}·*s* + β_{c}·*c* + ε, where µ is the probability of female, β indicates the regression coefficients, and ε is the residual. Intuitively, the linear combination β_{0} + β_{s}·*s* + β_{c}·*c* predicts the log odds of female (*y* = 1); exp(β_{0}) reflects the odds for the temporal factor, exp(β_{s}) reflects the odds ratio between the subcortical and temporal factors, and exp(β_{c}) reflects the odds ratio between the cortical and temporal factors.

A likelihood ratio test was used to determine whether sex varied across the latent atrophy factors. In short, the test involved comparing the likelihood of an appropriately restricted model with the original model (47). We first performed a statistical test of overall differences across factors. In this case, the restricted model log(µ/(1 − µ)) = β_{0} + ε was fitted to the data, and the resulting likelihood was compared with the likelihood of the original model log(µ/(1 − µ)) = β_{0} + β_{s}·*s* + β_{c}·*c* + ε. We then tested for possible differences between atrophy factors. For example, to compare the subcortical and cortical factors, the restricted model was log(µ/(1 − µ)) = β_{0} + β_{s}·(*s* + *c*) + ε, because β_{s} = β_{c} under the null hypothesis. To compare the subcortical and temporal factors, the restricted model became log(µ/(1 − µ)) = β_{0} + β_{c}·*c* + ε, because β_{s} = 0 under the null hypothesis. To compare the cortical and temporal factors, the restricted model was log(µ/(1 − µ)) = β_{0} + β_{s}·*s* + ε, because β_{c} = 0 under the null hypothesis.
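Once the original and restricted models have been fitted, the test itself reduces to a short calculation, sketched below. The sketch assumes the usual chi-square reference distribution, with degrees of freedom equal to the number of coefficients removed by the restriction (two for the overall test and one for each pairwise test). The log-likelihood values are placeholders, and only the two cases needed here are handled.

```python
import math


def likelihood_ratio_test(loglik_full, loglik_restricted, df):
    """Return the LRT statistic and its chi-square P value (df = 1 or 2 only)."""
    stat = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    if df == 1:
        # Chi-square survival function with 1 degree of freedom.
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        # Chi-square survival function with 2 degrees of freedom.
        p = math.exp(-stat / 2.0)
    else:
        raise ValueError("only df = 1 or 2 are handled in this sketch")
    return stat, p


# Placeholder log-likelihoods for an overall test (two coefficients removed).
overall_stat, overall_p = likelihood_ratio_test(-120.0, -123.5, df=2)
```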

### General Linear Modeling of Cross-Sectional Cognition Among Aβ+ CN, Aβ+ MCI, and AD Dementia Participants.

A single GLM was used to examine cross-sectional differences in memory (ADNI-Mem) (25) across the atrophy factors in 43 Aβ+ CN, 147 Aβ+ MCI, and 188 AD dementia participants. The same model was estimated for *K* = 2, 3, and 4 factors as well as for executive function (ADNI-EF) (26) and MMSE.

For ease of explanation, we will focus on explaining the GLM for the case of three atrophy factors and ADNI-Mem. Response *y* of the GLM consisted of 378 (=43 CN + 147 MCI + 188 AD) participants’ baseline ADNI-Mem. Explanatory variables consisted of binary MCI group indicator *m*, binary AD dementia group indicator *d*, subcortical factor probability *s*, cortical factor probability *c*, and interactions between group indicators and factor probabilities (i.e., *m*·*s*, *m*·*c*, *d*·*s*, and *d*·*c*), whereas nuisance variables consisted of baseline age *x*_{1}, sex *x*_{2}, education *x*_{3}, and total atrophy *x*_{4} (defined as ICV divided by total GM volume as estimated by FSL).

Therefore, the GLM was *y* = β_{0} + β_{m}·*m* + β_{d}·*d* + β_{s}·*s* + β_{c}·*c* + β_{ms}·*m*·*s* + β_{mc}·*m*·*c* + β_{ds}·*d*·*s* + β_{dc}·*d*·*c* + β_{1}·*x*_{1} + β_{2}·*x*_{2} + β_{3}·*x*_{3} + β_{4}·*x*_{4} + ε, where β indicates the regression coefficients, and ε is the residual. Temporal factor probability *t* was implicitly modeled, because *t* + *s* + *c* = 1. Similarly, the CN group indicator *n* was also implicitly modeled, because only one of *n*, *m*, and *d* is one, with the other two being zero. Intuitively, β_{0} reflected the temporal factor’s contribution to ADNI-Mem at the CN baseline (because *m* = *d* = *s* = *c* = 0), β_{0} + β_{m} reflected the temporal factor’s contribution to ADNI-Mem at the MCI baseline (because *m* = 1 and *d* = *s* = *c* = 0), and β_{0} + β_{m} + β_{s} + β_{ms} reflected the subcortical factor’s contribution to ADNI-Mem at the MCI baseline (because *m* = *s* = 1 and *d* = *c* = 0). With this model setup, variations in age, sex, education, and total atrophy were controlled for across participants.

Statistical tests involved null hypotheses of the form *H*β = 0, where β = [β_{0}, β_{m}, β_{d}, β_{s}, β_{c}, β_{ms}, β_{mc}, β_{ds}, β_{dc}, β_{1}, β_{2}, β_{3}, β_{4}]^{T}, and *H* is the linear contrast (47). We tested whether ADNI-Mem deteriorated across disease stages (i.e., from CN to MCI to AD) for each factor. Specifically, for each factor, we tested possible differences in ADNI-Mem between the CN and MCI baselines, between the MCI and AD baselines, and between the CN and AD baselines. For example, to test whether ADNI-Mem deteriorated significantly from the CN to MCI baseline for the temporal factor, *H* was specified, such that *H*β = β_{m} = 0. As another example, *H*β = β_{d} + β_{dc} – β_{m} – β_{mc} = 0 tested whether ADNI-Mem deteriorated significantly from the MCI to AD baseline for the cortical factor. The test results for both memory and executive function are tabulated in Fig. 5 *A1* and *B1*.

To foreshadow the results, the hypothesis tests in the previous paragraph hinted at differences in cross-sectional ADNI-Mem across the factors. Therefore, statistical tests of whether cross-sectional ADNI-Mem *y* varied across factors at each disease stage were performed. For each stage baseline, we first performed a statistical test of overall differences across all factors and then tested for pairwise differences. Take the AD baseline as an example. To test whether baseline memory differed across all factors among AD dementia patients, *H* was specified, such that *H*β = 0 translated to β_{s} + β_{ds} = β_{c} + β_{dc} = 0. For pairwise comparisons, β_{s} + β_{ds} = 0 tested possible differences between the temporal and subcortical factors at the AD baseline, β_{c} + β_{dc} = 0 compared the temporal and cortical factors at the AD baseline, and β_{s} + β_{ds} = β_{c} + β_{dc} tested possible differences between the subcortical and cortical factors at the AD baseline.

The results of the above statistical tests are shown in Figs. 5 *A1* and *B1* and 6 and *SI Appendix*, Figs. S6 *A1* and *B1*; S7 *A1*, *B1*, and *C1*; and S8 *A1* and *B1*, where (except in Figs. 5 *A1* and *B1*) the blue dots correspond to the estimated difference in baseline scores between two “pure factors” after controlling for age, sex, education, and total atrophy. For example, when comparing subcortical and cortical factors at the MCI baseline, the estimated difference in baseline cognition is given by β_{s} + β_{ms} – β_{c} – β_{mc}. The red bars correspond to the SE of this estimation given by SD(β_{s} + β_{ms} – β_{c} – β_{mc}).
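The contrasts in this section all follow one pattern: each is the difference between the design rows of two “pure factor” participants at given disease stages. The sketch below makes that pattern explicit. The coefficient labels and function names are hypothetical shorthand, and nuisance terms are omitted because they cancel when both rows share the same covariate values.

```python
COEFFS = ["b0", "bm", "bd", "bs", "bc", "bms", "bmc", "bds", "bdc"]


def pure_factor_row(stage, factor):
    """Design row (nuisance terms omitted) for a 'pure' factor at one disease stage."""
    m = 1 if stage == "MCI" else 0
    d = 1 if stage == "AD" else 0
    s = 1 if factor == "subcortical" else 0
    c = 1 if factor == "cortical" else 0
    return [1, m, d, s, c, m * s, m * c, d * s, d * c]


def contrast(a, b):
    """Contrast H such that H.beta = contribution(a) - contribution(b)."""
    return [x - y for x, y in zip(pure_factor_row(*a), pure_factor_row(*b))]


def describe(h):
    """Show only the nonzero weights of a contrast, keyed by coefficient label."""
    return {name: w for name, w in zip(COEFFS, h) if w != 0}


# MCI-to-AD deterioration for the cortical factor.
stage_contrast = describe(contrast(("AD", "cortical"), ("MCI", "cortical")))
# Subcortical vs. cortical factors at the MCI baseline.
factor_contrast = describe(contrast(("MCI", "subcortical"), ("MCI", "cortical")))
```

The two examples recover β_{d} + β_{dc} – β_{m} – β_{mc} and β_{s} + β_{ms} – β_{c} – β_{mc}, respectively, matching the expressions given in the text.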

### LME Modeling of Longitudinal Cognitive Decline Among Aβ+ CN, Aβ+ MCI, and AD Dementia Participants.

To analyze variations in cognitive decline rates across atrophy factors, we used the LME model, which had a setup that was similar to the GLM setup (in the previous section). Details are found in *SI Appendix*, *SI Methods*. Results of the LME statistical tests are illustrated in Figs. 5 *A2* and *B2* and 7 and *SI Appendix*, Figs. S6 *A2* and *B2*; S7 *A2*, *B2*, and *C2*; and S8 *A2* and *B2*.

### False Discovery Rate Correction for Behavioral Tests.

Because of the many statistical tests performed in the behavioral analyses, multiple testing was corrected using false discovery rate (FDR) (48) at *q* = 0.05 for all behavioral comparisons. In detail, included tests are diagnostic group comparisons in memory and executive function regardless of factors as well as all comparisons of baseline and longitudinal decline rates of memory, executive function, and MMSE at all disease stages for *K* = 2, 3, and 4 factors. In total, we corrected for 240 statistical tests. *P* values that remained significant after FDR control were highlighted in blue in Figs. 5, 6, and 7 and *SI Appendix*, Figs. S6–S8.
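For readers unfamiliar with FDR control, the Benjamini–Hochberg step-up procedure (48) can be sketched in a few lines. The *P* values below are invented for illustration and are not results from this study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking the tests that survive BH FDR control at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose P value falls under its rank-specific threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Invented P values for four tests.
survivors = benjamini_hochberg([0.02, 0.03, 0.035, 0.9], q=0.05)
```

Because the procedure is step-up, the first two tests survive even though they exceed their own rank-specific thresholds: the third-smallest *P* value (0.035) falls below its threshold of 3 × 0.05/4 = 0.0375.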

## Acknowledgments

We thank Maxwell Bertolero, Mark D'Esposito, Rik Ossenkoppele, Daniel Alexander, and Christopher Asplund for their constructive comments as well as Gia Ngo, Muhammad Anwar, Ryan Fong, and Xilin Jiang for assistance with public release of code and data. This work was supported by National University of Singapore (NUS) Tier 1; Singapore Ministry of Education Tier 2 Grant MOE2014-T2-2-016; NUS Strategic Research Grant DPRT/944/09/14; NUS School of Medicine Aspiration Fund R185000271720; Singapore National Medical Research Council Grant CBRG14nov007, NMRC/CG/013/2013; NUS Young Investigator Award; and NIH Grants 1K25EB013649-01, 1R21AG050122-01A1, P01AG036694, and F32AG044054. The research also used resources provided by Center for Functional Neuroimaging Technologies Grant P41EB015896 and instruments supported by Grants 1S10RR023401, 1S10RR019307, and 1S10RR023043 from the Athinoula A. Martinos Center for Biomedical Imaging at the Massachusetts General Hospital. Data collection and sharing for this project was funded by the ADNI (NIH Grant U01 AG024904) and the Department of Defense (DOD) ADNI (DOD Grant W81XWH-12-2-0012). The ADNI is funded by the National Institute on Aging and the National Institute of Biomedical Imaging and Bioengineering and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. The ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. Data used in preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report.

## Footnotes

^{1}To whom correspondence should be addressed. Email: thomas.yeo@nus.edu.sg.

Author contributions: X.Z., E.C.M., R.A.S., M.R.S., and B.T.T.Y. designed research; X.Z. and N.S. performed research; X.Z., M.R.S., and A.D.N.I. contributed new reagents/analytic tools; A.D.N.I. contributed to design and implementation of the Alzheimer's Disease Neuroimaging Initiative; X.Z. and N.S. analyzed data; and X.Z., E.C.M., M.R.S., and B.T.T.Y. wrote the paper.

The authors declare no conflict of interest.

A complete list of the Alzheimer's Disease Neuroimaging Initiative can be found in *SI Appendix*.

This article is a PNAS Direct Submission.

Data deposition: The Alzheimer's Disease Neuroimaging Initiative data are publicly available at adni.loni.usc.edu/.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1611073113/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

- (4) Ossenkoppele R, et al.
- (9) Dickerson BC, Wolk DA; Alzheimer’s Disease Neuroimaging Initiative
- (12) Yeo BTT, et al.
- (13) Bertolero MA, Yeo BTT, D’Esposito M
- (17) Scheltens NM, et al.
- (22) Douaud G, et al.
- (29) Bassett DS, et al.
- (31) Yeo BTT, et al.
- (33) Young AL, et al.; Alzheimer’s Disease Neuroimaging Initiative
- (47) Koch KR
- (48) Benjamini Y, Hochberg Y