# Gaussian process linking functions for mind, brain, and behavior


Edited by Nikolaus Kriegeskorte, Columbia University, New York, NY, and accepted by Editorial Board Member Dale Purves June 17, 2020 (received for review November 5, 2019)

## Abstract

The link between mind, brain, and behavior has mystified philosophers and scientists for millennia. Recent progress has been made by forming statistical associations between manifest variables of the brain (e.g., electroencephalogram [EEG], functional MRI [fMRI]) and manifest variables of behavior (e.g., response times, accuracy) through hierarchical latent variable models. Within this framework, one can make inferences about the mind in a statistically principled way, such that complex patterns of brain–behavior associations drive the inference procedure. However, previous approaches were limited in the flexibility of the linking function, which has proved prohibitive for understanding the complex dynamics exhibited by the brain. In this article, we propose a data-driven, nonparametric approach that allows complex linking functions to emerge from fitting a hierarchical latent representation of the mind to multivariate, multimodal data. Furthermore, to enforce biological plausibility, we impose both spatial and temporal structure so that the types of realizable system dynamics are constrained. To illustrate the benefits of our approach, we investigate the model’s performance in a simulation study and apply it to experimental data. In the simulation study, we verify that the model can be accurately fitted to simulated data, and latent dynamics can be well recovered. In an experimental application, we simultaneously fit the model to fMRI and behavioral data from a continuous motion tracking task. We show that the model accurately recovers both neural and behavioral data and reveals interesting latent cognitive dynamics, the topology of which can be contrasted with several aspects of the experiment.

In the present technological state, the mind remains a latent construct that cannot be directly observed. However, clever experimental designs as well as technological advancements have enabled the collection of many important manifest variables such as response time, the blood oxygen level-dependent (BOLD) response in functional MRI (fMRI), and the electroencephalogram (EEG). Each of these measures ostensibly reflects signatures of mental operations, potentially revealing insight into latent cognitive dynamics. Despite these innovations, the key to understanding what the mind is and ultimately how it produces complex patterns of thought and decisions lies in our ability to explain the structure between manifest variables of the brain and manifest variables of behavior. The set of possible links connecting these variables is known as linking propositions (1–4).

Linking propositions have been a productive route forward because they facilitate quantitative assessments of the contribution of physiological variables to psychological processes. For example, Teller (2) devised a set of linking propositions by specifying logical relations among physiological variables and psychological states. Teller’s families of linking propositions were specified to be axiomatic in the sense that they relied on strict equality statements, which are impossible to satisfy in the context of measurement noise (3). Recognizing these practical limitations, Schall (3) developed the concept of statistical linking functions, where probability distributions could replace axiomatic statements to accommodate the uncertainty associated with manifest variables. Finally, Forstmann et al. (5, 6) concretized statistical linking functions by performing null hypothesis tests among patterns of neural data and the parameters of computational models. Later, Forstmann et al. (7) further articulated the concept of reciprocity in linking functions, where latent representations of the mind (i.e., as instantiated by computational models) could inform the analysis of manifest variables of brain and behavior.

There now exist many techniques for imposing linking functions, varying along several dimensions such as the manner in which they impose directionality (e.g., neural data constrain a computational model, a computational model guides analysis of neural data, or equal reciprocity between all variables), the manner in which the mind is represented (e.g., a latent representation or set of transformation equations), and the complexity of their implementation (8). One promising framework that exploits the aforementioned concepts of statistical linking and reciprocity is the joint modeling framework (9) shown in Fig. 1. Constraints about the manifest variables, such as structural connectivity or experimental design, are used to specify the structure of a generative model. In turn, the generative model is specified to simultaneously describe all available manifest variables, regardless of their modality. The manner in which the latent states are connected to the manifest variables forms the linking function. By fitting the model to data using hierarchical Bayesian methods (10), one can infer the most plausible set of linking functions that connect the manifest variables.

Although joint models have proved effective in linking subject-specific (e.g., structural connectivity; ref. 9), modality-specific (e.g., combining fMRI and EEG; ref. 11), and trial-specific (e.g., single-trial BOLD response; refs. 12–14) information, so far all joint modeling applications have imposed a linearity assumption among the latent variables. While linearity may be a reasonable assumption in some cases, because the brain is highly dynamic (15), this assumption is surely violated in many realistic settings. In this article, we expand the general structure used in joint modeling with Gaussian processes to eliminate the specification of a particular parametric functional form of the linking function. This specification enables potentially complex, nonlinear linking functions to emerge directly from data.

## Gaussian Process as a Linking Function

To allow for flexibility in the linking function within the joint modeling approach, we use a Gaussian process (GP) to model the latent variables that represent the cognitive states underlying observed neural and behavioral data. In so doing, this Gaussian process joint modeling (GPJM) framework can be viewed as a temporal and nonparametric extension of the covariance-based linking function (8, 14). Gaussian processes have proved effective on a variety of statistical and machine-learning problems, including problems in fMRI analyses (16, 17). In this section, we provide a brief conceptual introduction to Gaussian processes and the GPJM.

### Gaussian Process Regression.

Our goal is to find a function that best describes the data assuming Gaussian noise,

$$y = f(\mathbf{x}) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2).$$

A conventional parametric approach would commit in advance to a specific functional form for f.

In contrast, Gaussian process regression directly models the function f as a sample from a distribution over functions. Specifically, the distribution over functions is assumed to be a multivariate Gaussian distribution with mean function m and covariance function k, both defined over the input vector x (18, 19). Given an input vector $\mathbf{x}$, the function values are distributed as

$$f(\mathbf{x}) \sim \mathcal{N}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x})\big).$$

In a Gaussian process with noisy observations assuming Gaussian-distributed error, mean predictions about an unobserved input $\mathbf{x}_*$ take the standard closed form

$$\bar{f}_* = k(\mathbf{x}_*, \mathbf{x})\,\big[k(\mathbf{x}, \mathbf{x}) + \sigma^2 I\big]^{-1}\mathbf{y},$$

with predictive variance $k(\mathbf{x}_*, \mathbf{x}_*) - k(\mathbf{x}_*, \mathbf{x})\big[k(\mathbf{x}, \mathbf{x}) + \sigma^2 I\big]^{-1}k(\mathbf{x}, \mathbf{x}_*)$.
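The posterior predictive computation above can be sketched in a few lines of NumPy. This is a generic illustration with illustrative function names and parameter values, not the authors' GPflow implementation:

```python
import numpy as np

def rbf_kernel(x1, x2, variance=1.0, length_scale=1.0):
    # Squared-exponential (RBF) kernel matrix between two sets of scalar inputs
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-4, **kern):
    # Posterior mean and covariance of a zero-mean GP with Gaussian noise
    K = rbf_kernel(x_train, x_train, **kern) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, **kern)
    K_ss = rbf_kernel(x_test, x_test, **kern)
    K_inv_y = np.linalg.solve(K, y_train)
    mean = K_s.T @ K_inv_y
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

# With small observation noise, the posterior mean interpolates the data
x = np.linspace(0.0, 1.0, 8)
y = np.sin(2.0 * np.pi * x)
mean, cov = gp_posterior(x, y, x, noise_var=1e-6, length_scale=0.3)
```

The explicit matrix inverse is avoided in favor of `np.linalg.solve`, which is the numerically standard way to evaluate these expressions.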

Some popular kernels are the Matérn class, radial basis function (RBF), and rational quadratic (RQ) kernels. Each kernel generates different types of functions depending on its functional form. For example, Matérn 1/2 kernels generate nondifferentiable functions, whereas RBF kernels generate infinitely differentiable, smooth functions. Hyperparameters of the kernel (which typically include a scaling factor often called “variance”) also affect how covariance is evaluated across input points.^{†} In particular, all of the kernels introduced above have a hyperparameter called “length scale” that scales the distance between two points. This hyperparameter plays an important role in determining the influence or “relevance” of each latent dimension in generating data, which is discussed later.
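As a minimal illustration of how kernel choice and length scale shape covariance (function names are ours, written as functions of the distance r between two inputs):

```python
import numpy as np

def rbf_cov(r, variance=1.0, length_scale=1.0):
    # RBF kernel: smooth, infinitely differentiable sample paths
    return variance * np.exp(-0.5 * (r / length_scale) ** 2)

def matern12_cov(r, variance=1.0, length_scale=1.0):
    # Matérn 1/2 (exponential) kernel: nondifferentiable sample paths
    return variance * np.exp(-np.abs(r) / length_scale)
```

At a fixed distance, a larger length scale yields a larger covariance, so sampled functions vary more slowly across inputs; a small length scale decorrelates even nearby points.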

### Gaussian Process Joint Models.

GPJMs inherit the motivation of Bayesian joint modeling approaches accounting for brain–behavior reciprocity (9, 11–14, 21): imposing common statistical constraints for both neural and behavioral data when informing theoretical and mechanistic explanations of cognitive processes. However, instead of summarizing the relationship between neural and behavioral measures by collapsing across temporal information, the GPJM framework aims to learn latent temporal dynamics that simultaneously govern both neural and behavioral data. The GPJM framework is based on Gaussian process latent variable models (GPLVMs) and their hierarchical extensions (22–25), which pursue commonly shared latent representations across simultaneous observations from multiple measurement sources. GPLVMs have been successfully used in many machine-learning applications (26–31).

Joint models of neural and behavioral data consist of three components: a neural submodel, a behavioral submodel, and a linking function. As illustrated in Fig. 2, in the GPJM a latent variable X with a GP prior enables a linking function that constrains the generative process for the neural and behavioral data: conditioned on X, each data modality is described by its own Gaussian process.

Given the latent cognitive dynamics X, a neural submodel describes the relationship between the latent dynamics at the hyperlevel and the neural data. Specifically, the neural submodel of the GPJM assumes that the neural data are described by a Gaussian process with an RBF kernel convolved with a hemodynamic response function (see *GPJM and Technical Issues in Brain–Behavior Modeling*). To capture the residual noise in the neural data, an independent Gaussian observation-noise term is also included.

A behavioral submodel provides a formal description of how cognitive dynamics produce behavioral responses. In typical joint modeling applications, cognitive models are used to describe the computations underlying decision processes. However, for this particular study we chose to remain agnostic about the computations that underlie the behavioral data generating process and instead use a Gaussian process, driven by the latent cognitive dynamics, as a statistical model of the continuous joystick trajectories. For our purposes, the behavioral submodel incorporates the temporal dynamics using a Matérn 1/2 kernel.

In Eqs. **6** and **7**, length-scale parameters control how quickly covariance decays with the distance between input points. Fig. 2*B* illustrates the role of length-scale parameters: When the length-scale parameter is large, the covariance between two input points is weakly penalized by the distance between them. Hence, large length-scale parameters allow distant pairs of points to covary and make sampled functions relatively constant across input values (Fig. 2*B*). By contrast, when the length-scale parameter is small, only nearby input points covary, whereas distal points are uncorrelated. Automatic relevance determination (ARD) (20) extends this idea to each latent input by assigning one length-scale parameter per dimension. By applying ARD to the kernels of the behavioral and neural submodels, the GPJM can learn how much each dimension contributes to the data generation process, in other words, the relevance of the latent input in the context of the data (Fig. 2*A*, bottom right).
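A sketch of an ARD kernel (names and values are illustrative) shows how a very large length scale effectively switches a latent dimension off: the kernel becomes indistinguishable from one computed without that dimension.

```python
import numpy as np

def rbf_ard(X1, X2, variance=1.0, length_scales=(1.0, 1.0)):
    # RBF kernel with one length scale per latent dimension (ARD):
    # each dimension's contribution to distance is scaled by its length scale
    ls = np.asarray(length_scales)
    diff = (X1[:, None, :] - X2[None, :, :]) / ls
    return variance * np.exp(-0.5 * np.sum(diff ** 2, axis=-1))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))

# A huge length scale on dimension 2 makes it irrelevant to the covariance
K_full = rbf_ard(X, X, length_scales=(1.0, 1e6))
K_dim1 = rbf_ard(X[:, :1], X[:, :1], length_scales=(1.0,))
```

In the GPJM, fitting these per-dimension length scales is what lets the model report how relevant each latent dimension is to the neural and behavioral streams.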

### GPJM and Technical Issues in Brain–Behavior Modeling.

One of the major issues in joint modeling approaches is the mismatch of the task-related timeline between neural and behavioral data. This problem is more salient when applying the model to data from fMRI experiments because of the hemodynamic lag between the experimental event and the measured response. Hemodynamic activity is characterized by a delayed rise that attains its maximum approximately 5–6 s after an event (e.g., stimulus presentation), followed by a slow decrease, an undershoot below baseline, and finally a recovery to the baseline state. Because of this temporal profile, the hemodynamic lag obscures the precise temporal profile of neural activity.

The interaction between underlying neural activities and hemodynamics in the brain is typically modeled by convolving neural activations with a hemodynamic response function (HRF), which facilitates general linear model analyses of fMRI data (32). Early joint modeling approaches have taken advantage of this assumption and used event-wise neural activation estimates (33) for trial-level summaries of neural activity (12, 13).

In the Gaussian process framework, one way to enforce temporally lagged, convolved neural signals is to convolve the GP kernel with a secondary function, an HRF in our case. This approach is justified by the fact that the convolution of a Gaussian process with another function is itself a Gaussian process (34–36). In the GPJM, we use a kernel convolved with an HRF as a key component of the neural submodel. Here, the latent cognitive dynamics are expected to be relatively well aligned to the real-time neural activity because the resulting kernel instantiates hemodynamic lag.

To instantiate hemodynamic lag, we use the canonical double-gamma HRF, the difference of two gamma densities in which the first term produces the initial peak and the second produces the later undershoot,

$$h(t) = \frac{t^{a_1-1} b_1^{a_1} e^{-b_1 t}}{\Gamma(a_1)} - c\,\frac{t^{a_2-1} b_2^{a_2} e^{-b_2 t}}{\Gamma(a_2)}.$$
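A minimal sketch of the canonical double-gamma form follows; the parameter values used here are the commonly cited defaults (peak shape 6, undershoot shape 16, unit rates, undershoot ratio 1/6), and the authors' exact parameterization may differ:

```python
import numpy as np
from math import gamma as gamma_fn

def double_gamma_hrf(t, a1=6.0, a2=16.0, b1=1.0, b2=1.0, c=1.0 / 6.0):
    # Canonical double-gamma HRF: a peak near 5 s followed by an undershoot
    t = np.asarray(t, dtype=float)
    g1 = (t ** (a1 - 1) * b1 ** a1 * np.exp(-b1 * t)) / gamma_fn(a1)
    g2 = (t ** (a2 - 1) * b2 ** a2 * np.exp(-b2 * t)) / gamma_fn(a2)
    return g1 - c * g2

t = np.arange(0.0, 30.0, 0.1)
h = double_gamma_hrf(t)  # peaks near 5 s, dips below zero around 10-20 s
```

Convolving a hypothetical neural time series with this filter, e.g. `np.convolve(neural, h)[:len(neural)]`, produces the delayed, smoothed BOLD-like signal that the convolved kernel encodes directly in covariance form.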

Another methodological issue in joint modeling approaches is applying plausible spatiotemporal constraints on the neural data. For computational convenience, previous joint modeling applications have assumed statistical independence between regions of interest (ROIs) (e.g., refs. 11 and 12), which has clear limitations given the structural and functional connectivity of the brain. An ideal statistical representation would incorporate both spatial (e.g., nearby and well-connected ROIs should correlate) and temporal (e.g., activity at one time point in one ROI should affect activity at later time points in other, “downstream” ROIs) dependencies among ROIs. To instantiate these constraints, we separately define kernels for space and for time and combine them via a Kronecker product, yielding a single covariance over all voxel–time pairs.
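The Kronecker construction can be illustrated with NumPy; the voxel coordinates and kernel settings below are hypothetical:

```python
import numpy as np

def rbf(x1, x2, variance=1.0, length_scale=1.0):
    # RBF kernel matrix between two sets of scalar inputs
    d2 = (np.asarray(x1)[:, None] - np.asarray(x2)[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

# Hypothetical 1-D voxel coordinates and scan times
voxels = np.array([0.0, 1.0, 2.0])
times = np.arange(0.0, 5.0)  # 5 time points

K_space = rbf(voxels, voxels, length_scale=1.5)  # spatial covariance
K_time = rbf(times, times, length_scale=2.0)     # temporal covariance
K_st = np.kron(K_space, K_time)                  # covariance over all (voxel, time) pairs
```

Because the Kronecker product of two positive semidefinite matrices is itself positive semidefinite, `K_st` is a valid covariance; this factored structure is also what makes the spatial and temporal variance parameters individually unidentifiable, as noted in the simulation section.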

In the following two sections, we examine the potential of Gaussian processes as a linking function connecting neural and behavioral measures in two ways. First, we examine our ability to correctly recover the spatiotemporal dynamics in a simulated example using the Kronecker-product method. Next, we apply our GPJM to data from a real fMRI experiment in which subjects are asked to provide a continuous report of motion coherence.

## Simulation: The Kronecker Method for Spatiotemporal Modeling

Before applying the GPJM to data, it is important to evaluate our ability to accurately fit data and recover latent dynamics. In particular, we examine whether it is possible to recover accurate spatiotemporal dynamics of the Kronecker-product kernel governing the distribution of behavioral and neural data. In this section, we perform a simulation study with similar structure to that of the experiment we report below.

The latent cognitive states were generated with known dynamics, and the variance parameter of the temporal kernel (Eq. **6**) was fixed to one to avoid identifiability issues; specifically, this constraint is necessary because the two variance parameters corresponding to the spatial and temporal kernels cannot be distinguished after applying the Kronecker product.

Fig. 3*A* illustrates a subset of the three kernels used to generate the data (*Left*) and estimated from the data (*Right*): spatial (*Top*), temporal (*Middle*), and resulting Kronecker product (*Bottom*). Fig. 3*B* illustrates the true latent dynamics (*Top*) and the recovered latent dynamics (*Bottom*) as a joint trajectory through a two-dimensional space. Colors gradually changing across points specify the relative position in the cyclic temporal dynamics. Fig. 3*C* shows the data (black) and model predictions (neural, red; behavioral, blue) for the behavioral data (*Top*) and two selected voxels (*Middle* and *Bottom*). These plots reveal that 1) the spatiotemporal GPJM can capture the spatial relationship among voxels using the Kronecker method and 2) the model can estimate the latent dynamics that explain the generative process of the data with high accuracy. Unlike linear models, which may only suffer from invariance to orthogonal transformations, nonlinear models such as GPJM have many more degrees of freedom. Hence, because the latent dynamics do not have a naturally unique solution, we cannot expect our estimated latent dynamics to perfectly match the dynamics used to generate the data. Instead, to evaluate the quality of the extracted latent dynamics, one must comprehensively consider model fit to data and the topology of the recovered dynamics (e.g., the frequency of cyclical information and shape). The complexity of this issue will become more apparent in the context of experimental data.

Another issue worth considering is whether the GPJM can outperform a standard GP-based model fitted to a single modality of information. In *SI Appendix*, we discuss an additional simulation where we compared the GPJM’s performance to two unimodal GP models fitted to each stream of data separately. Analogous to covariance-based joint models (12), the GPJM’s fits to the joint stream of data are nearly as good as both of its unimodal counterparts, while still obeying the joint statistical structure in both streams of data within one cohesive model.

## fMRI Experiment

As a proof of concept, we next apply the GPJM to experimental data from an ongoing study. In the experiment, the participant was presented with a cloud of randomly moving dots, with a subset of those dots moving in a coherent direction. The proportion of dots exhibiting coherent movement could change every second. This proportion formed the independent variable of coherence, which took nine values along a left–right axis. The participant was instructed to report the average direction of the dots from the beginning of the trial to the current point in time using a joystick.

We used the GPJM defined in the simulation study above to capture dynamics of the neural and behavioral data in our task. However, to scale the model to fit the data, we made a few simplifications of the full spatiotemporal GPJM: 1) We performed an ROI-based analysis, 2) neural data were introduced into the model after averaging signals across voxels within a set of 16 ROIs, and 3) we used only a temporal kernel rather than the full spatiotemporal Kronecker-product method. Fig. 4*A* shows the ROIs, which were preselected using a general linear modeling (GLM) analysis (see *Materials and Methods* and *SI Appendix* for full details).

### Fits to Data.

Fig. 4 summarizes the results of the model fit to data. Although we tested GPJMs using three, four, five, and six latent dimensions, Fig. 4 summarizes the result of the three-dimensional model for visual clarity (but see *SI Appendix* for the four-dimensional model). Fig. 4*A* shows the ROIs that were first extracted from our GLM analysis, from which the neural time series were extracted. Fig. 4*B* shows some example neural (Fig. 4*B*, *Top* and *Middle*) and behavioral (Fig. 4*B*, *Bottom*) time series data (black) along with corresponding model fits (blue, behavior; red, neural). Fig. 4*C* illustrates the latent temporal dynamics that emerged in the GPJM using a 3D representation. Fig. 4*D* shows the time series information for each of the three latent dimensions. Some abnormal deviations of the latent dimensions are highlighted (gray) in Fig. 4*D*, and these deviations have corresponding paths in Fig. 4*C* (demarcated by orange dots).

### Topology of Latent Dynamics.

As a final examination of the GPJM, we consider how the topology of the extracted latent dynamics relates to different aspects of our experiment. Although interpreting these latent dynamics is quite challenging, as they emerge from a complex mixture of brain, behavior, and spatiotemporal dynamics, some interesting characterizations can be made based on the data to which the model is applied. For example, when the trajectory revisits similar points in the latent space through time, there may be some consistency in the pattern of brain data or in the type of responses being generated by the participant (e.g., ref. 12), and such consistencies could be exploited when predicting future performance (*SI Appendix*).

Fig. 5 illustrates how the latent dynamics are associated with a few different key aspects of the experiment. Within Fig. 5 *A*–*C*, the same latent dynamics from Fig. 4*C* are shown, but color coded according to one of three variables: (Fig. 5*A*) stimulus coherence, (Fig. 5*B*) joystick position or behavioral response, or (Fig. 5*C*) functional coactivation. Fig. 5 *A* and *B* shows that although the responses for this participant closely tracked the stimulus coherence, there are interesting differences in how the topology describes these variables. As two prominent examples, when the stimulus was exhibiting near zero coherence and the behavioral response was strongly to the left, two significant departures from the central aspect of the topology are created, generating two orange ribbons in Fig. 5*B* in the top left region. Similarly, just below these two ribbons is a situation in which a strong rightward response was made to a near-zero coherence value.

We can also examine whether the topology reflects differences in the functional coactivation of the neural time series data. For this analysis, we applied a *k*-means clustering algorithm to a multidimensional scaling representation of the neural data to identify similar profiles of functional connectivity. Although we tested a few different cluster settings, Fig. 5*C* shows the results with three clusters, where the time points of the latent dynamics associated with the three clusters are colored red, yellow, and green, respectively. Although there is considerable overlap in the spatial location of these clusters, the yellow cluster tends toward the middle of the topology, whereas the green cluster tends to be associated with departures from this central structure. After the clusters were identified, Fig. 5 *D* and *E* shows the functional connectivity profiles associated with each cluster by expressing the pairwise functional correlations among sets of ROIs. For this analysis, we chose a cluster as a reference from which to visualize significant changes from one set of coactivations to another. Fig. 5*E* shows the functional coactivations among all ROIs within the reference cluster, cluster 3. Fig. 5*D* shows the coactivations in clusters 1 and 2, but here only coactivations that are significantly different from the coactivation pattern of cluster 3 are shown for visual clarity (see *SI Appendix* for raw coefficient values). Although there is some consistency between clusters 1 and 3, the coactivation pattern is generally stronger within cluster 1 relative to cluster 3. However, cluster 2 is characterized by a strong negative association between the left inferior frontal gyrus (left IFG; ROI C7) and many other ROIs. The left IFG is known to play a major role in the execution of inhibitory processes, such as those observed within stop signal paradigms (e.g., refs. 42 and 43). 
Here, a large negative coactivation of left IFG with many other brain areas suggests the inhibition or cancellation of the report of motion coherence, possibly with the intention of making momentary adjustments in the response.

## Discussion

In this article, we extended an existing framework for connecting mind, brain, and behavior by specifying Gaussian process linking functions for the dynamics of both latent and manifest variables. This extension allows a few key advantages that will be important in the field of model-based cognitive neuroscience (4, 7, 8, 44). First, our approach inherits all of the advantages that a hierarchical structure provides, such as statistical reciprocity through conditionally independent variables within a global model and flexibly accounting for missing observations (9, 45, 46). Second, the model’s representation of the mind is a latent dynamical system. Because the mind is a latent construct, we believe it is best specified as a set of conditionally independent variables that can be estimated, rather than as a transformation of either neural or behavioral data (see ref. 8, for discussions). Third, the representations of all variables within the GPJM are dynamic, allowing us to infer periodic changes at both local and global time scales. Fourth, the GPJM can be specified to allow for both spatial and temporal structure through the Kronecker product. The fusion of space and time in the latent dynamics allows us to appreciate the spatiotemporal constraints that any biological system must obey.

We have only begun to investigate the advantages provided by GPJM, and so at present there remain a number of important aspects of this innovation that merit further investigation. First, strategies for interpreting the representation inferred from the GPJM’s fit to data will be important for understanding the cognitive dynamics that emerge from the model. In the simulation study, although the latent dynamics shown in Fig. 3*B* separated two cognitive states from one another, there were other estimated dynamics that were not an essential part of the ground truth. Although complex and ambiguous representations often emerge in many other machine-learning applications, the field will require some type of guideline for interpretation of these cognitive dynamics if GPJMs are to advance our understanding of the connections between mind, brain, and behavior.

Another limitation is the resolution of the data in constraining the GPJM. Although our experimental application used a continuous motion tracking task, most experiments in psychology and cognitive science use discrete, “event-related” designs rather than block-type manipulations. Although GPJM could be used in a similar way to that of earlier joint modeling work (12) to exploit trial-to-trial changes in cognitive states (e.g., attention), additional theoretical innovation is needed to account for changes in cognitive states between two stimulus presentations.

From our perspective, the biggest challenge facing GPJMs is computation. To fully capitalize on all of the complexities that GPJMs offer, computational solutions will be needed to increase the scalability of our approach to full-brain, spatiotemporal investigations. Although there have been attempts to increase the scalability of joint models (11, 13) and Gaussian processes (47), many interesting opportunities are just out of reach. For example, hierarchical extensions for voxel-wise or group-level modeling are obvious next steps, but they are currently computationally infeasible. Alternatively, two promising directions are sparse GP approximations and variational inference (e.g., refs. 25, 47, and 48), but the tradeoffs of these approximations have yet to be well studied. For now, we propose the GPJM for its theoretical advantages and save computational innovations for future investigations.

## Conclusions

In this article, we have extended the basic joint modeling framework to incorporate nonlinear dynamics by Gaussian processes. At their core, Gaussian process joint models rely on GPs that fuse temporal and spatial information, allowing complex latent dynamics to emerge from the intrinsic biological dynamics of the brain. We view the GP structure as one way to alleviate the burden of specifying the particular shape of the linking function; instead, by specifying a flexible, nonparametric functional form, complex shapes of the linking function can emerge directly from complex patterns in the data.

## Materials and Methods

### Model Implementation and Training.

The model was implemented using GPflow (49), a Gaussian process modeling library based on TensorFlow (50). The linking function and neural submodel used RBF kernels, whereas the behavioral submodel used a Matérn 1/2 kernel.

For efficient training, the model initialized a k-dimensional latent variable using the first k principal components estimated from the neural data, following the suggestion from ref. 24. To match the number of time points between the principal components and behavioral data, we performed principal component analysis after interpolating the neural data at all missing observation points.
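The PCA-based initialization of the latent variable can be sketched via a singular value decomposition; this is a generic illustration of the idea, not the authors' code:

```python
import numpy as np

def pca_init(Y, k):
    # Initialize a k-dimensional latent trajectory from the first k
    # principal-component scores of the (time x feature) data matrix Y
    Yc = Y - Y.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Yc, full_matrices=False)
    return U[:, :k] * S[:k]                       # component scores

# e.g., 50 time points of 16 ROI-averaged signals, reduced to 3 latent dims
rng = np.random.default_rng(1)
Y = rng.normal(size=(50, 16))
X0 = pca_init(Y, 3)
```

The resulting scores are mutually orthogonal across time, which gives the optimizer a well-spread starting point for the latent trajectory before the GP prior reshapes it.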

For the simulation, we used L-BFGS-B (51) to obtain maximum a posteriori estimates of the model parameters. Following the default setting of GPflow, optimization steps were limited to 1,000 iterations at maximum. For the fMRI experiment, we used the stochastic optimization algorithm Adam (52) with the default learning rate recommended by the authors (0.001).

### Participant.

The data presented here were acquired, after obtaining informed consent, from a 19-y-old male as part of a larger study that is ongoing. All participants in the dataset have normal or corrected-to-normal vision and are self-reported to be right-handed. The study was approved by The Ohio State University’s Institutional Review Board.

### Task and Stimuli.

The participant completed four blocks of a random dot motion task in which he was instructed to continuously track the direction (left or right) and the magnitude of coherence of the dots throughout each trial. Each block contained 20 trials separated by intertrial intervals ranging from 6 to 8 s. Trials began with a 2-s fixation cross at the center of the screen followed by 30 s of the randomly moving dots stimulus centered on the screen. There were 150 dots that were each 3 × 3 pixels and the diameter of the entire dot cloud subtended 5.4° of visual angle. The proportion of coherently moving dots ranged from −0.35 to 0.35 (possible values were −0.35, −0.25, −0.15, −0.05, 0, 0.05, 0.15, 0.25, 0.35), where negative coherence indicates leftward motion, positive coherence indicates rightward motion, and zero indicates uniform circular noise. Coherence for each trial was initialized by randomly sampling one coherence level from the set of all possible values. This first state variable was also initialized to have an “age” of zero seconds. At each new second, a shift in the value of the coherence level was stochastically considered. Specifically, we sampled a value

### Behavioral Data Processing.

SMILE recorded the cursor position at an adaptively changing sampling rate, ranging from 1 Hz (when the cursor did not move) to 332 Hz. For modeling purposes, we interpolated and down-sampled the data to 2 Hz for the neural GLM analyses and the GPJM.
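The interpolate-and-down-sample step can be sketched with linear interpolation; this is a simplified stand-in for the actual preprocessing pipeline:

```python
import numpy as np

def resample_uniform(t, y, rate_hz, duration):
    # Linearly interpolate irregularly sampled cursor data onto a uniform grid
    t_new = np.arange(0.0, duration, 1.0 / rate_hz)
    return t_new, np.interp(t_new, t, y)

# Irregular samples: the cursor moved at 0.5 s, then stayed put until 2 s
t = np.array([0.0, 0.5, 2.0])
y = np.array([0.0, 1.0, 1.0])
t_new, y_new = resample_uniform(t, y, rate_hz=2.0, duration=2.0)
```

Because the adaptive recorder only emits samples when the cursor moves, interpolation fills the stationary stretches before the uniform 2-Hz grid is taken.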

The horizontal cursor position is not appropriate to model under the assumption of a Gaussian likelihood because the cursor position is bounded to a fixed range.

### MRI Data Analysis.

fMRI data preprocessing and analysis were performed primarily with the fMRI Expert Analysis Tool (FEAT) (53) Version 6.00 in FMRIB’s Software Library (FSL) (https://fsl.fmrib.ox.ac.uk/fsl/). A typical preprocessing pipeline including motion correction, fieldmap-based echo planar imaging distortion correction, removal of nonbrain structures, spatial smoothing, high-pass filtering, and normalization via grand-mean scaling was used to clean each block of the four-dimensional task data. Additionally, the T1 image was segmented into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). EPI data were registered first to the T1-weighted image and then transformed to standard space using the same transformation matrix acquired from registering the T1 image to standard space. See *SI Appendix* for complete preprocessing details.

After preprocessing, FSL’s general linear model tool (FILM; ref. 54) was used to calculate activity estimates for the preprocessed functional data from each block. Three primary predictors were included in the model: participant responses, nonzero stimulus coherence, and zero stimulus coherence. Participant responses and stimulus coherence were rescaled such that they ranged from −1 to 1. The predictors were convolved with a double-gamma HRF to create the main regressors. Nuisance regressors including mean WM and CSF signal as well as 24 realignment estimates (X, Y, Z, pitch, yaw, roll, and their first- and second-order temporal derivatives) were also added to the model to account for motion and other variance of no interest. The temporal derivatives of all regressors in the model were also added as confounds of no interest. Finally, the time series was prewhitened within FILM to correct for autocorrelations in the BOLD signal. Contrasts for each of the three predictors of interest were calculated as the effect of interest versus no activity (i.e., zero). Additional contrasts included nonzero coherence greater than response, nonzero coherence less than response, nonzero coherence greater than zero coherence, and nonzero coherence less than zero coherence.

Following the block-wise analyses, a fixed-effects analysis was conducted in FEAT to assess the average effects of each contrast collapsed across blocks. The resulting clusters of the fixed-effects analysis were thresholded at
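A fixed-effects combination across blocks treats each block’s contrast estimate as sharing one true effect, weighting it by the inverse of its variance. This sketch shows that standard inverse-variance weighting with illustrative numbers; it is not necessarily FEAT’s exact implementation:

```python
import numpy as np

# Per-block contrast estimates and their variances (illustrative values).
block_copes = np.array([0.9, 1.1, 0.8, 1.2])
block_vars = np.array([0.04, 0.05, 0.03, 0.06])

# Fixed effects: inverse-variance-weighted mean and its variance.
w = 1.0 / block_vars
fe_cope = np.sum(w * block_copes) / np.sum(w)
fe_var = 1.0 / np.sum(w)

z = fe_cope / np.sqrt(fe_var)  # z-statistic used for cluster thresholding
print(fe_cope, z)
```

Precise blocks (small variance) dominate the average, and the combined variance is smaller than any single block’s, which is why averaging across blocks sharpens the resulting statistical maps.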

## Data Availability

All code and data that accompany this article can be downloaded from GitHub: http://github.com/MbCN-lab/gpjm and http://github.com/GiwonBahg/gpjm. The fMRI dataset is available on the Open Science Framework (https://osf.io/vabrh/?view_only=b47ff7bedfa34f3e8ea78bd04bcb28d0).

## Acknowledgments

We thank Michael Shvartsman for helpful discussions that motivated the present research. This research was supported by a Faculty Early Career Development (CAREER) award from the National Science Foundation (to B.M.T.).

## Footnotes

^{1}To whom correspondence may be addressed. Email: turner.826{at}gmail.com.

Author contributions: G.B. and B.M.T. designed research; D.G.E. collected data; G.B. and B.M.T. contributed new reagents/analytic tools; G.B., D.G.E., and M.G. analyzed data; and G.B., D.G.E., M.G., and B.M.T. wrote the paper.

The authors declare no competing interest.

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “Brain Produces Mind by Modeling,” held May 1–3, 2019, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. NAS colloquia began in 1991 and have been published in PNAS since 1995. From February 2001 through May 2019, colloquia were supported by a generous gift from The Dame Jillian and Dr. Arthur M. Sackler Foundation for the Arts, Sciences, & Humanities, in memory of Dame Sackler’s husband, Arthur M. Sackler. The complete program and video recordings of most presentations are available on the NAS website at http://www.nasonline.org/brain-produces-mind-by.

This article is a PNAS Direct Submission. N.K. is a guest editor invited by the Editorial Board.

^{†}To avoid confusion between the kernel variance and the variance of the noisy observations, we refer to the latter as “observation noise” and the former as “variance.”

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1912342117/-/DCSupplemental.

Published under the PNAS license.

## References

1. G. S. Brindley
2.
3.
4. B. M. Turner, J. J. Palestro, S. Miletić, B. U. Forstmann
5. B. U. Forstmann et al.
6. B. U. Forstmann et al.
7.
8.
9.
10.
11. B. M. Turner, C. A. Rodriguez, T. M. Norcia, S. M. McClure, M. Steyvers
12.
13. B. M. Turner, T. Wang, E. Merkel
14.
15.
16.
17.
18. C. K. I. Williams, C. E. Rasmussen
19. E. Schulz, M. Speekenbrink, A. Krause
20. R. M. Neal
21. M. Shvartsman, N. Sundaram, M. C. Aoi, A. Charles, T. C. Wilke, J. D. Cohen
22. N. D. Lawrence
23. N. D. Lawrence
24. N. Lawrence
25. M. Titsias, N. D. Lawrence
26. A. Shon, K. Grochow, A. Hertzmann, R. Rao
27. C. H. Ek
28. A. C. Damianou, C. H. Ek, M. K. Titsias, N. D. Lawrence
29. G. Song, S. Wang, Q. Huang, Q. Tian
30.
31. A. Wu, N. A. Roy, S. Keeley, J. W. Pillow
32.
33.
34. D. Higdon
35. M. Álvarez, N. D. Lawrence
36. M. A. Álvarez, N. D. Lawrence
37.
38. E. V. Bonilla, K. M. Chai, C. Williams
39. S. Flaxman, A. Wilson, D. Neill, H. Nickisch, A. Smola
40. J. Wilzen, A. Eklund, M. Villani
41.
42.
43.
44. B. U. Forstmann, E.-J. Wagenmakers
45. B. M. Turner
46. B. M. Turner, B. U. Forstmann, M. Steyvers
47. S. Ahmed, M. Rattray, A. Boukouvalas
48.
49. D. G. Matthews et al.
50. M. Abadi et al.
51.
52. D. P. Kingma, J. Ba
53.
54.

## Article Classifications

- Social Sciences
- Psychological and Cognitive Sciences

- Biological Sciences
- Neuroscience