Endogenous fluctuations in the dopaminergic midbrain drive behavioral choice variability

Significance
Humans are surprisingly inconsistent in their behavior, often making different choices under identical conditions. Previous research suggests that intrinsic fluctuations in brain activity can influence low-level processes, such as the amount of force applied in a motor response. Here, we show that intrinsic prestimulus brain activity in the dopaminergic midbrain influences how we choose between risky and safe options. Using computational modeling, we demonstrate that endogenous fluctuations alter phasic responses in a decision network and thereby modulate risk taking. Our findings demonstrate that higher-order cognition is influenced by fluctuations in internal brain states, providing a physiological basis for variability in complex human behavior.


Real-time fMRI
Physiological Noise. To remove physiological noise arising from breathing and pulsatile artifacts, subjects were fitted with a pneumatic respiratory belt and a pulse oximeter. Physiological measurements from these devices were modeled using a Fourier expansion of physiological phases based on the RETROICOR model 1 and respiratory volume 2 . These were incrementally regressed out in real time from the exported time courses using a custom-made MATLAB (MathWorks, Natick, USA) toolbox. The ensuing filtered time courses were then used in the main experiment.
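As an illustration of the Fourier expansion of physiological phases, the following minimal sketch builds cosine and sine regressors from a phase series. It assumes that the cardiac or respiratory phase of each volume has already been computed in radians; the function name, the expansion order of 2, and the example phases are hypothetical and are not taken from the toolbox used in the study.

```python
import math

def fourier_phase_regressors(phases, order=2):
    """Return one row per volume: [cos(1*p), sin(1*p), ..., cos(order*p), sin(order*p)].

    `phases` holds the physiological phase (radians) at each volume.
    The default order of 2 is an assumption, not a value from the source.
    """
    rows = []
    for p in phases:
        row = []
        for k in range(1, order + 1):
            row.append(math.cos(k * p))
            row.append(math.sin(k * p))
        rows.append(row)
    return rows

# Hypothetical cardiac phases for four volumes.
cardiac_phase = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
design = fourier_phase_regressors(cardiac_phase, order=2)
```

Each row of `design` would then enter the nuisance part of a regression model alongside the motion and respiratory-volume regressors.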

fMRI Image Acquisition
MRI was acquired at the Wellcome Centre for Human Neuroimaging, UCL, using a Siemens Trio 3-Tesla scanner equipped with a 32-channel head coil. A partial-volume 2D echo-planar imaging (EPI) sequence that was optimized for striatal, medial prefrontal, and brainstem regions was selected for the functional images. Each volume consisted of 25 slices with 2.5mm isotropic voxels (repetition time (TR): 1.75s; echo time (TE): 30ms; slice tilt: -30°). At the beginning of each functional session, 10 EPI volumes were acquired, with the 10th volume selected as the template used to co-register the ROI. In addition, field maps with 3mm isotropic voxels (whole brain coverage) were acquired to correct the EPIs for any inhomogeneity in magnetic field strength. Subsequently, the first 6 volumes of each run were discarded to allow for T1 saturation effects. Structural images consisted of 3 spoiled multiecho 3D fast low angle shot (FLASH) acquisitions at 0.8mm isotropic resolution with T1 (TR: 18.7ms; flip angle: 20°), proton density (PD) (TR: 23.7ms; flip angle: 6°), and magnetization transfer (MT) (TR: 23.7ms; flip angle: 6°; excitation preceded by a 2kHz off-resonance Gaussian radiofrequency (RF) pulse with 4ms duration and 200° nominal flip angle) weightings. Additional B1 mapping and field maps were acquired to obtain calibration data measuring the spatial distribution of the B1+ transmit field in order to detect the spatial variation in flip angle. Sequence settings were identical across subjects (e.g., no variation in tilt angle) and no slices were discarded. Overlapping coverage across all subjects is indicated in Figure 1.

fMRI Offline Analyses
Images were preprocessed using standard procedures in SPM12 (Wellcome Centre for Human Neuroimaging, UCL). This consisted of unwarping EPIs using field maps, motion correction, spatial transformation to the MNI template, and spatial smoothing with a 6-mm full-width at half-maximum Gaussian kernel.
Multilevel mediation analysis was carried out using the Mediation Toolbox (http://wagerlab.colorado.edu/tools) 3. For the mediation analysis, evoked responses in the SN/VTA, VS, and vmPFC were determined as the maximum percentage change in BOLD signal within a 10s epoch following trial onset, while baseline SN/VTA was determined as the percentile at which each trial was triggered (see previous section). Distributions of path coefficients were estimated by drawing 10,000 random samples, and significance estimates were computed through bootstrapping.
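The evoked-response measure can be sketched as follows. This is a minimal illustration, not the analysis code: it assumes that percentage change is expressed relative to the signal at the onset volume (the reference is not specified in the text), and the function name and example time course are hypothetical.

```python
def peak_percent_change(bold, onset_idx, tr=1.75, epoch_s=10.0):
    """Maximum percentage BOLD change within `epoch_s` seconds after trial onset.

    Assumption: change is measured relative to the onset volume itself.
    """
    n_vols = int(epoch_s // tr)          # volumes that fit fully inside the epoch
    reference = bold[onset_idx]
    window = bold[onset_idx + 1 : onset_idx + 1 + n_vols]
    return max(100.0 * (v - reference) / reference for v in window)

# Hypothetical time course (arbitrary units), with trial onset at index 0.
bold = [100.0, 101.0, 103.0, 102.0, 100.0, 99.0, 98.0]
evoked = peak_percent_change(bold, onset_idx=0)
```

With a TR of 1.75 s, five volumes fall fully within the 10 s epoch, so the maximum is taken over those five values.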

Computational Modeling
Parametric Approach-Avoidance Decision Model. The approach-avoidance model 4 was recently developed to account for value-independent tendencies to choose gambles. It allows choice probabilities under a softmax rule to approach limits other than 0 or 1. Expected utilities were determined using the equations of the prospect theory model described earlier.
The main difference lies in the softmax rule, where the probability of gambling depended on a new parameter, β, determined by the following equations: If β is positive, choice probabilities are mapped to the range (β, 1). If β is negative, choice probabilities are mapped to the range (0, 1+β). This model provided a good fit of behavior, with an average pseudo-R² of 0.47 (SD: 0.14).
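One way to realize the two ranges stated above is to rescale a standard logistic softmax linearly, as in the sketch below. The linear rescaling and the inverse-temperature symbol `mu` are assumptions chosen to be consistent with the ranges (β, 1) and (0, 1+β); the function and argument names are hypothetical.

```python
import math

def p_gamble_approach_avoidance(u_gamble, u_certain, mu, beta):
    """Probability of gambling under a softmax rescaled by the bias term beta.

    beta >= 0 maps probabilities into (beta, 1); beta < 0 maps them into (0, 1 + beta).
    `mu` is an assumed inverse-temperature parameter.
    """
    softmax = 1.0 / (1.0 + math.exp(-mu * (u_gamble - u_certain)))
    if beta >= 0:
        return beta + (1.0 - beta) * softmax
    return (1.0 + beta) * softmax

# With equal utilities the plain softmax gives 0.5; beta shifts it up or down.
p_approach = p_gamble_approach_avoidance(1.0, 1.0, mu=2.0, beta=0.2)
p_avoid = p_gamble_approach_avoidance(1.0, 1.0, mu=2.0, beta=-0.2)
```

A positive β therefore acts as a floor on the probability of gambling, and a negative β as a ceiling, independently of the utilities on offer.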

Parametric Decision Model Using Expected Values. The final model tested was one that used the expected values of the gamble (Egamble) and certain gain (Ecertain) and passed through the following softmax with the same gambling bias term β as before: This model had the lowest fit with a pseudo-R² of 0.36 (SD: 0.17), which suggests that more of the variance could be accounted for by the inclusion of a risk aversion parameter to convert objective values into subjective values.
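The fit statistic reported for each model can be illustrated with the sketch below. It assumes a common definition of pseudo-R², namely one minus the ratio of the model log-likelihood to the log-likelihood of a chance model that gambles with probability 0.5; the text does not state which definition was used, and the function name and example data are hypothetical.

```python
import math

def pseudo_r2(choice_probs, choices):
    """Pseudo-R² relative to a chance model (assumed definition).

    choice_probs: model probability of gambling on each trial.
    choices: 1 if the participant gambled, 0 if they chose the certain option.
    """
    ll_model = sum(
        math.log(p if c == 1 else 1.0 - p) for p, c in zip(choice_probs, choices)
    )
    ll_chance = len(choices) * math.log(0.5)
    return 1.0 - ll_model / ll_chance

# Hypothetical predictions and choices for four trials.
r2 = pseudo_r2([0.9, 0.9, 0.1, 0.1], [1, 1, 0, 0])
```

Under this definition a model that predicts 0.5 on every trial scores 0, and a model that predicts every choice with certainty approaches 1.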

Control Analysis
To validate the results obtained from our online procedure and to examine whether the effect of endogenous BOLD activity on risky choice behavior was a general property across the brain, we sampled activity from multiple regions. The ROI for vmPFC was derived from www.neurosynth.org, the VS ROI consisted of bilateral 8-mm spheres at MNI coordinates derived from a previous study 5, a group anatomical mask from a previous study 6 was used for the SN/VTA ROI, and the primary auditory cortex (A1) ROI was Brodmann area 41 from the Wake Forest University PickAtlas toolbox for SPM 7.
BOLD time courses for these ROIs were extracted and filtered using an incremental GLM with the same motion and physiological regressors as in the real-time fMRI experiment. Based on our real-time procedure, BOLD activity for each region was averaged for the 2 most recent TRs prior to trial presentation and compared against each preceding baseline window of 2 minutes. As our design was optimized to detect activity fluctuations in the SN/VTA, the threshold used to categorize trials as low or high activity in the SN/VTA would be overly conservative when applied to other brain regions. This would lead to many trials being left uncategorized. To ensure that all trials were categorized, we relaxed the threshold and categorized each trial as low or high depending on whether pre-trial BOLD activity for each of these regions was lower or higher than the mean of the preceding baseline period.
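The relaxed categorization rule described above can be sketched as follows. The sketch assumes that the 2-minute baseline window ends immediately before the 2 pre-trial volumes and that ties are labeled low; both choices, along with the function name and example data, are hypothetical.

```python
from statistics import mean

def categorize_trial(bold, trial_idx, tr=1.75, baseline_s=120.0, n_pre=2):
    """Label a trial 'low' or 'high' from pre-trial BOLD relative to its baseline mean.

    Returns None when there is not enough data for a full baseline window.
    """
    n_base = int(baseline_s // tr)       # number of volumes in the baseline window
    pre_start = trial_idx - n_pre
    base_start = pre_start - n_base
    if base_start < 0:
        return None
    pre = mean(bold[pre_start:trial_idx])
    base = mean(bold[base_start:pre_start])
    # Assumption: a tie is labeled 'low'.
    return "high" if pre > base else "low"

# Hypothetical filtered time course: flat baseline followed by two elevated volumes.
bold = [0.0] * 68 + [1.0, 1.0]
label = categorize_trial(bold, trial_idx=70)
```

With a TR of 1.75 s, the 2-minute baseline spans 68 volumes.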
To test whether our main effect of risk preference change is specific to SN/VTA BOLD activity, we investigated the relationship between endogenous fluctuations of BOLD activity in other brain regions and risky choice. We conducted offline analyses on A1 as a control area, as well as VS and vmPFC, which are regions strongly implicated in value-based decision making 8 . We used independent ROIs for all areas including SN/VTA and recategorized trials based on endogenous activity in each of these ROIs.
To further verify that the effects we observe are driven by local rather than global fluctuations, we tested whether SN/VTA activity was still predictive of risk taking even after controlling for activity in control area A1 (t42 = 2.34, P = 0.02). These findings suggest that the effect is not a general effect of low and high BOLD activity modes across the brain, but specific to local fluctuations in the dopaminergic midbrain that explain variability in risk taking.
A caveat of the above analysis is that the absence of any effect in a control area could be due to reduced endogenous signal variability. To rule out this alternative explanation, we calculated the signal change of epochs used to trigger each trial relative to their preceding baselines. Differences in signal change between low and high activity conditions were largest in vmPFC and smallest in VS, suggesting that activity used to trigger trials in SN/VTA was no more extreme than that observed in other regions, supporting our finding of a specific effect of SN/VTA endogenous fluctuations on risk taking (Fig. S2B).
As the VS results may be affected by partial volume effects due to its location and the image acquisition parameters, we re-ran the preprocessing steps and re-analyzed the data after discarding the top and bottom slices of the partial volumes. We found that risk taking was still similar for low and high baseline activity in VS (low: 58.7 ± 1.7%, high: 57.0 ± 1.9%, t42 = 1.02, P = 0.32), suggesting that the absence of an association between VS BOLD activity and risk taking was not due to partial volume effects.
As SN/VTA BOLD signals recorded in real-time may be contaminated by signals from surrounding structures due to smoothing, we also performed offline analyses on unsmoothed functional images using the same algorithm to reclassify pre-stimulus activity and found consistent results in unsmoothed data. Risk taking was higher for trials presented against a background of low compared to high SN/VTA BOLD activity (low activity: 59.9 ± 1.8%, high activity: 55.6 ± 2.1%, t42 = 3.2, P = 0.003).
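The paired comparisons reported in this section follow the standard paired-sample t statistic, sketched below with only the Python standard library. The function name and the example percentages are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev

def paired_t(low, high):
    """Paired-sample t statistic and degrees of freedom for per-subject values."""
    diffs = [l - h for l, h in zip(low, high)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical percentage of risky choices for four subjects.
low_activity = [60.0, 62.0, 58.0, 61.0]
high_activity = [55.0, 58.0, 57.0, 56.0]
t_stat, df = paired_t(low_activity, high_activity)
```

With 43 participants the degrees of freedom are 42, which corresponds to the t42 notation used in the text.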
To test how sensitive the effect we observe is to the timing of pre-stimulus activity, we reanalyzed the data, reclassifying activity levels as high or low based on volumes t-2 and t-3 before trial onset (instead of t-1 and t-2). Discarding the final volume of SN/VTA signal before trial onset did not affect the relationship between pre-stimulus activity and risk taking (t42 = 2.95, P = 0.005), suggesting that the effect we observe does not depend on the precise timing of option presentation.

Statistical Analysis
Descriptive and inferential statistics were carried out in MATLAB (MathWorks) with in-house scripts and functions in SPSS Statistics (IBM Corp). All behavioral analyses were conducted on trials that were matched for value between low and high activity modes. In other words, if a participant missed a trial in the low or high baseline activity condition, the corresponding trial in the other baseline condition was excluded from analyses to match the number of trials in each condition (final number of matched trials: 95 ± 6%, mean ± SD).
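The trial-matching rule can be sketched as below. The sketch assumes that each value-matched pair of trials shares an identifier and that a missed trial is stored as `None`; the data layout and names are hypothetical.

```python
def match_trials(low_choices, high_choices):
    """Keep only trial pairs answered in both the low and high activity conditions.

    Each input maps a hypothetical gamble identifier to a choice, or None if missed.
    """
    kept = [
        g for g in low_choices
        if low_choices[g] is not None and high_choices.get(g) is not None
    ]
    return (
        {g: low_choices[g] for g in kept},
        {g: high_choices[g] for g in kept},
    )

# Hypothetical data: pair 2 was missed under low activity, pair 3 under high activity.
low = {1: "gamble", 2: None, 3: "safe"}
high = {1: "safe", 2: "gamble", 3: None}
low_matched, high_matched = match_trials(low, high)
```

Only pair 1 survives in this example, so both conditions contribute the same number of trials to the analysis.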
Precision measures (e.g., SD, SEM) are indicated in brackets where appropriate. Paired sample t-tests were used to compare reaction times and the number of risky choices between low and high activity conditions across the different brain regions tested. The main effect of gamble value bins and risk, as well as possible interactions between gamble value and endogenous activity, were assessed using a repeated-measures ANOVA with Greenhouse-Geisser correction (5 gamble value bins × 2 activity conditions). Parameters in the computational models were fit separately for each condition using the fmincon function in MATLAB to minimize their negative log-likelihoods.
[Figure legend fragment] … were not more extreme than other regions. *P < 0.05, **P < 0.01, ***P < 0.001. Data are mean ± SEM.
[Table legend] Coefficients, standard errors, and p-values for the different paths in the mediation analyses (n = 43). *P < 0.05, **P < 0.01, ***P < 0.001.