Formation of global self-beliefs in the human brain

Significance
Momentary feelings of confidence accompany many of our actions and decisions. In addition to such “local” feelings of confidence, we also construct “global” confidence estimates about our skills and abilities (global self-performance estimates or SPEs). Distorted SPEs may have a pervasive impact on motivation and self-evaluation, for instance affecting estimates of our competitiveness at work or in a sports team. Here, we found that components of a brain network previously implicated in the tracking of local confidence were additionally modulated by SPE level, whereas ventral striatum tracked SPEs irrespective of confidence. Our findings of a neurocognitive basis for global SPEs lay the groundwork for understanding how distorted SPEs arise in educational and clinical settings.

Together these results indicate that subjects gave reliable confidence ratings as a function of our experimental parameters.

Regression models predicting confidence from the metacognition task
Our goal was to predict local confidence in the fMRI experiment from a model fitted to the metacognition task data collected outside of the scanner. We adopted this approach, rather than eliciting confidence ratings in the scanner, for several reasons. First, we were keen to avoid interrupting the process of learning global self-performance estimates (SPEs) by requiring trial-by-trial explicit confidence ratings. Second, it has become increasingly clear that the neural and computational processes involved in forming explicit ratings are likely to be at least partly distinct from those involved in the formation of latent or "implicit" confidence estimates. The former appears to engage anterior-lateral PFC (1, 2), which we interpret as supporting a mapping between a "latent" and an "explicit" estimate of confidence (3). It remains an open question whether such a transformation would impact the formation of global SPEs, and future work may wish to specifically investigate the extent to which implicit and explicit confidence representations differentially contribute to global SPEs. A final benefit of omitting explicit ratings in the scanner was the ability to obtain more trials per scanning session. Had we introduced trial-by-trial confidence ratings, the time needed for an individual trial would have been substantially increased.
Accordingly, we sought to build a regression model to infer trial-by-trial variations in local confidence in the absence of explicit reports in the fMRI experiment. First, we compared a set of regression models representing the various parameters of our experimental design that contributed to confidence reports in the metacognition task (Fig. S2A) (see also Methods). As expected, all models included accuracy (all β>0.27, all p<4.7e-11) and RTs (all β<-0.61, all p<4.4e-12) as significant predictors. We omitted lagged terms for the influence of previous confidence ratings as these could not be estimated in the absence of explicit reports in the fMRI experiment, despite finding that they improved the fit to the metacognition task data, in line with previous findings (4).
Quantitatively, as a criterion for arbitrating between models, we used deviance, a goodness-of-fit statistic reflecting model evidence for each regression model (as returned by Matlab's mnrfit function, with lower deviance indicating a better model). We found that the best model was Model 3 (lowest cumulated deviance=22487; in all other models, cumulated deviance>22548), which included additional predictors for difficulty (β=.11, p=.043) and the interaction between accuracy and difficulty (β=.14, p=.19), consistent with the results of our model-free ANOVA. Note that here we were not interested in penalising for model complexity or arbitrating between different models, but only in predicting confidence as accurately as possible in the fMRI experiment. To establish whether this approach was able to identify idiosyncratic fluctuations in local confidence, for each subject s, we computed the predicted confidence under Model 3, either when fitted to the subject's own data (s = i), or when fitted to each other subject i's data (s ≠ i). We then computed the sum of squared error between predicted and reported confidence for each pairing of s and i. We found that each subject was consistently better predicted by the confidence model fitted on their own data (s = i) than on any other subject's data (s ≠ i) (mean rank = 1). We also visually inspected the correspondence between observed and predicted confidence from our selected Model 3, confirming that the predictions were meaningful (Fig. S2E). We note that predicted confidence fluctuations were of smaller amplitude than those measured in the metacognition task (Fig. S2E).
Importantly, however, the direction and amplitude of confidence fluctuations were properly captured, such that even if baseline confidence were altered by factors such as noise or stress inside the scanner, we were able to meaningfully identify confidence-related activations in a linear model of the BOLD response, in which parametric modulations were z-scored.
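The own-versus-other comparison described above can be sketched as follows. This is a minimal illustration and not the analysis code used for the reported results: it substitutes an ordinary least-squares line with a single predictor for the regression fitted with mnrfit, and the names fit_line, sse, own_model_ranks and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy format: one (predictor value, reported confidence) pair per trial.
Trial = Tuple[float, float]


def fit_line(trials: List[Trial]) -> Tuple[float, float]:
    """Least-squares fit of confidence = a + b * x (a stand-in for the real model)."""
    n = len(trials)
    mean_x = sum(x for x, _ in trials) / n
    mean_y = sum(y for _, y in trials) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in trials)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in trials)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sse(params: Tuple[float, float], trials: List[Trial]) -> float:
    """Sum of squared error between predicted and reported confidence."""
    intercept, slope = params
    return sum((y - (intercept + slope * x)) ** 2 for x, y in trials)


def own_model_ranks(data: Dict[str, List[Trial]]) -> Dict[str, int]:
    """For each subject s, rank of the model fitted on s's own data (1 = lowest SSE)."""
    models = {i: fit_line(trials) for i, trials in data.items()}
    ranks = {}
    for s, trials in data.items():
        errors = {i: sse(params, trials) for i, params in models.items()}
        # Rank = 1 + number of other subjects' models that predict s strictly better.
        ranks[s] = 1 + sum(1 for i, e in errors.items() if i != s and e < errors[s])
    return ranks


# Toy data for three hypothetical subjects with different confidence profiles.
toy = {
    "s1": [(0.0, 1.0), (1.0, 2.0), (2.0, 3.1)],
    "s2": [(0.0, 3.0), (1.0, 2.5), (2.0, 2.1)],
    "s3": [(0.0, 2.0), (1.0, 2.0), (2.0, 2.2)],
}
ranks = own_model_ranks(toy)
mean_rank = sum(ranks.values()) / len(ranks)
print(ranks, mean_rank)
```

In this toy version the own-data model is fitted by minimising the same squared error that is used for scoring, so a rank of 1 follows by construction; the sketch is meant only to show the bookkeeping over pairings of s and i.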
We also note that using confidence estimates extracted from Model 5 (second best-fitting model) gave very similar results. For instance, we found higher predicted confidence on trials associated with higher global SPEs (t(40)=6.5, p=9.7e-08). We again found more confidence-related activity on trials associated with lower SPEs in vmPFC (p=1.22e-15, FWE-corrected for multiple comparisons at the cluster level) and PRECU (p=8.54e-10, FWE-corrected for multiple comparisons at the cluster level), and a main effect of global SPEs in bilateral ventral striatum (both p<4.2e-04, FWE-corrected for multiple comparisons at the cluster level), in line with results obtained using Model 3 (Fig. 1D, 2 and 3). The local confidence network identified here also closely matches that reported in previous work using in-scanner explicit confidence reports, supporting the validity of our approach (5,6).

Metacognitive ability
Using the metacognition task, we quantified each subject's metacognitive efficiency (meta-d'/d') and type-2 Area Under the Receiver Operating Curve (AUROC2) metrics (see Methods) for each difficulty level. Using a hierarchical fit, we found a metacognitive efficiency of 0.53 for easy and 0.67 for difficult trials, a difference that was not significant (HDI of the difference between easy and difficult trials overlaps with zero: [-0.325, 0.053]) (Fig. S3A). Averaging over easy and difficult conditions indicated a metacognitive efficiency of 0.60 and an AUROC2 of 0.62 (0.64 for easy trials and 0.60 for difficult trials, Fig. S3B), in line with previous studies of metaperception (5).
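As an illustration of the AUROC2 metric, the sketch below computes a type-2 area under the ROC curve from trial-wise confidence ratings and accuracy. It relies on the standard equivalence between the area under an ROC curve and the probability that a randomly drawn correct trial receives higher confidence than a randomly drawn error trial (ties counted as one half). The exact procedure used here is described in Methods and may differ; the function name auroc2 and the toy ratings are hypothetical.

```python
from typing import Sequence


def auroc2(confidence: Sequence[float], correct: Sequence[bool]) -> float:
    """Type-2 AUROC: P(conf on a correct trial > conf on an error trial), ties = 0.5."""
    conf_correct = [c for c, ok in zip(confidence, correct) if ok]
    conf_error = [c for c, ok in zip(confidence, correct) if not ok]
    if not conf_correct or not conf_error:
        raise ValueError("need at least one correct and one error trial")
    wins = 0.0
    for cc in conf_correct:
        for ce in conf_error:
            if cc > ce:
                wins += 1.0   # confidence discriminates this pair correctly
            elif cc == ce:
                wins += 0.5   # tie: no information either way
    return wins / (len(conf_correct) * len(conf_error))


# Toy ratings on a 1-4 scale: three correct trials followed by three errors.
ratings = [4, 3, 2, 4, 1, 2]
accuracy = [True, True, True, False, False, False]
print(auroc2(ratings, accuracy))
```

A value of 0.5 indicates that confidence does not discriminate correct from incorrect trials, and a value of 1 indicates perfect discrimination.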
We found mixed relationships between local metacognitive ability estimated on the metacognition task and the accuracy of global task choices in the fMRI experiment. In the main text we define global SPE sensitivity with respect to objective difficulty level (i), but an alternative definition would be with respect to performance fluctuations (ii). These two definitions of global SPE sensitivity were strongly correlated across subjects (ρ=0.57, p=.0001), suggesting a general capacity for "global" metacognition. While we found no correlation between global SPE sensitivity under definition (i) and metacognitive efficiency (ρ=0.13, p=.43) or AUROC2 (ρ=0.17, p=.29), when using definition (ii) we identified a significant correlation between the frequency of choosing the best-performed task and both metacognitive efficiency (ρ=0.41, p=.007) and AUROC2 (ρ=0.32, p=.04).

Figure S1. Subjects' behavior (N=41). In the fMRI experiment, subjects performed better (A) and responded faster (B) on easy compared to difficult trials. Similarly, in the metacognition task, subjects performed better (C) and responded faster (D) on easy compared to difficult trials. Error bars represent S.E.M. over participants and black circles represent individual data points.

Figure S2. Subjects' behavior in the metacognition task outside of the scanner (N=41). A) Set of five regression models predicting confidence. Predictors are accuracy (ACC), reaction times (RT), difficulty level (DIF) and the interaction between accuracy and difficulty level (INT). B) Confidence distributions indicating that subjects gave higher confidence ratings for correct (red) than error (blue) trials. C) Confidence increased for easy as compared to difficult trials, and for correct as compared to error trials (see Results), both for subjects (red/blue) and Model 3 predictions (purple). D) Predicted confidence in the fMRI experiment according to Model 3 (purple) fitted on the metacognition task data, for correct (upper line) and incorrect (lower line) trials. Error bars represent S.E.M. across subjects. E) Predicted confidence from selected Model 3 on a subset of trials (purple) for an example subject, plotted together with their confidence responses in the metacognition task (black) (for illustration purposes).