# Case for fMRI data repositories

Functional magnetic resonance imaging (fMRI) technology for studying the human brain is the result of considerable efforts of distinguished physicists and engineers (1, 2). Despite concerns about some of its conclusions (3), it is now a common tool in neuroscience, psychology, and psychiatry. In these applications, statistical methods provide many of the tools for both the preprocessing of fMRI signals and the analysis of those processed data (4). These statistical procedures are implemented in software packages whose algorithms are rather involved, typically using parametric models for the error distribution and the spatial autocorrelation function. They are used to study both individual voxels and clusters of voxels. As a result, there are many thousands of fMRI-based publications. In PNAS, Eklund et al. (5) report on their studies of the statistical aspects of fMRI studies. In particular, they assess the performance of three software packages on datasets made available through sharing agreements. They tell a sobering tale: Family-wise error (FWE) rates that should be about 5% can be much higher, there was a bug in software that led to inflated error rates for 15 y, and a survey of 241 recent fMRI papers indicates that about 40% did not report using well-known methods for correcting for multiple testing. These findings cast a shadow of doubt on those earlier publications. However, their work also provides a template for a way forward; namely, encourage the sharing of data and use them to test statistical methods and software to improve fMRI-based research, for example, by reducing the number of nonreproducible findings (6).
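The cost of skipping multiple-testing correction is easy to see in miniature. The simulation below is illustrative only (the number of tests and the use of a simple Bonferroni correction are assumptions for the sketch, not the methods of ref. 5): with thousands of independent null tests, the chance of at least one false positive approaches certainty unless the threshold is adjusted.

```python
import numpy as np

rng = np.random.default_rng(0)

# m independent null hypotheses tested at level alpha in each of many
# simulated "experiments"; under the null, p-values are Uniform(0, 1).
m = 1000              # number of tests (think: voxels); illustrative
n_experiments = 2000
alpha = 0.05

p = rng.uniform(size=(n_experiments, m))

# Family-wise error rate: fraction of experiments with >= 1 false positive.
fwe_uncorrected = np.mean((p < alpha).any(axis=1))
fwe_bonferroni = np.mean((p < alpha / m).any(axis=1))  # Bonferroni threshold

print(fwe_uncorrected)  # near 1.0: almost every experiment has a false hit
print(fwe_bonferroni)   # near the nominal 0.05
```

With 1,000 tests, the uncorrected FWE is essentially 1, while the Bonferroni-corrected rate stays near the nominal 5%; the cluster-wise procedures studied by Eklund et al. (5) are more powerful refinements of this same goal.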

## An Assessment of Current fMRI Practice

The discipline of statistics is undergoing a period of rapid growth (7, 8). As a result, new methods for analyzing large, high-dimensional data are being developed rapidly, with fMRI being a standard source of high-dimensional data (consider the signal for each voxel as a dimension). The assessment of these methods to understand how they perform is done in many ways. A principal classical approach is asymptotics (9), in which the sample size increases. However, this approach is often intractable for high-dimensional data and is of limited relevance when datasets are not sufficiently large. There is also evidence that methods (e.g., bootstrapping) that work well in classical problems break down in high dimensions (10). Simulation studies are common in such cases (11), but as Eklund et al. (5) note, it is “hard to simulate the complex spatiotemporal noise that arises from a human subject in an MR scanner.” Thus, access to actual datasets that can be used as test beds to assess statistical methods and to develop code for their implementation is important. Eklund et al. (5) use data from the 1000 Functional Connectomes Project (12) and encourage scientists to share their data with the OpenfMRI project (13), which, to date, has 49 raw MRI datasets on 1,811 subjects. Such repositories can also facilitate the use of meta-analysis to reduce further the problem of false-positive results.

The motivation for the study by Eklund et al. (5) comes from their earlier work (14), in which they used SPM software on 1,484 resting-state scans analyzed as task-based, single-subject fMRI data, and found that the false-positive rates were as high as 70% rather than the expected 5%. To understand the sources of the higher error rates, they expanded their scope to include the three most common software packages, SPM, FSL, and AFNI, generally using each package’s default settings. They used resting state fMRI data from 499 healthy controls at three sites, obtained from the 1000 Functional Connectomes Project. They then mimicked several activity paradigms, and used one-sample and two-sample *t* tests, controlling the FWE rates for both voxel-wise and cluster-wise inference. Because resting state data should not have systematic changes in brain activity, the error rates should have been around 5%. Their main finding is that all packages were conservative for voxel-wise inference but not for cluster-wise inference. In particular, cluster-wise inference based on a parametric Gaussian spatial autocorrelation model gave much higher FWE rates, whereas a nonparametric approach using a permutation test gave valid inferences. They then did extensive exploratory data analysis to identify the reasons for the poor performance. They concluded that the main reason was misspecification of the spatial autocorrelation function; they also discuss many other aspects of their findings, such as possible reasons for the differences between voxel-wise and cluster-wise inference, as well as for the differences between the software packages (which presumably led to finding the long-standing bug).
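The logic of their design, using null data to check an error rate empirically, can be sketched with a toy version. Here both "groups" are drawn from the same distribution, so a two-sample *t* test should reject about 5% of the time; the sample sizes and the idealized Gaussian noise are assumptions of this sketch, and the point of ref. 5 is precisely that real fMRI noise violates such assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Null data: both groups come from the same distribution, mimicking the
# logic of resting-state scans where no task effect exists. The empirical
# rejection rate of the t test should then sit near the nominal 5%.
n_repeats, n_per_group = 5000, 20
alpha = 0.05

rejections = 0
for _ in range(n_repeats):
    a = rng.normal(size=n_per_group)
    b = rng.normal(size=n_per_group)
    _, p = stats.ttest_ind(a, b)   # two-sample t test, equal variances
    rejections += p < alpha

rate = rejections / n_repeats
print(rate)  # near 0.05 when the model's assumptions hold exactly
```

When the empirical rate computed this way on real null data is far above the nominal level, as Eklund et al. (5) found for cluster-wise inference, the model assumptions, not the data, are the suspect.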

## Remedies Old and New

The nonparametric permutation test is an old method (15) that is based on the simple idea that if the null hypothesis holds (for, say, the two-sample problem), then the group labels are arbitrary. It assesses the significance of the observed difference by recomputing it under all permutations of the labels into two groups. The number of permutations is usually too large to enumerate; typically, a random sample of several thousand permutations suffices to get a good estimate of the significance level. Using the permutation test, Eklund et al. (5) are able to show that the actual data have much longer tails than the Gaussian model assumes. The permutation test is not a panacea, however, because it appears to be invalid for the one-sample test they studied; they attribute that finding to asymmetrical errors, but a more careful analysis of that case is needed.
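The two-sample version fits in a few lines. The synthetic data and sample sizes below are illustrative assumptions; the structure, though, pool the data, reshuffle the labels, and count how often the relabelled statistic is at least as extreme as the observed one, is exactly the classical method.

```python
import numpy as np

rng = np.random.default_rng(2)

def permutation_pvalue(x, y, n_perm=5000):
    """Two-sample permutation test on the difference in means.

    Under the null hypothesis the group labels are arbitrary, so we pool
    the data, reshuffle the labels n_perm times, and ask how often a
    relabelled difference is at least as extreme as the observed one.
    """
    observed = abs(x.mean() - y.mean())
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = abs(perm[:len(x)].mean() - perm[len(x):].mean())
        count += diff >= observed
    # Add-one smoothing keeps the estimate strictly positive and valid.
    return (count + 1) / (n_perm + 1)

# Illustration on synthetic data (not the paper's fMRI data):
null_p = permutation_pvalue(rng.normal(size=30), rng.normal(size=30))
shift_p = permutation_pvalue(rng.normal(size=30),
                             rng.normal(loc=1.5, size=30))
print(null_p, shift_p)  # null_p typically large; shift_p very small
```

Because the null distribution is built from the data themselves, no Gaussian tail or autocorrelation model is assumed, which is why this approach remained valid in the comparisons of ref. 5 when the parametric cluster-wise methods did not.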

Eklund et al. (5) focus on the simplest analyses, the one-sample and two-sample problems. In most studies, there are also many covariates that may help explain the variation in fMRI data. When the number of covariates is large, some sort of model selection method, such as the lasso (16), is used to whittle down the number of covariates. In such cases, another strain of recent statistical research that is relevant is called adaptive inference (17, 18). In practice, the same data are typically used to decide which covariates and their interactions enter a regression or classification model, and also to assess the significance of the regression coefficients or error rate, respectively (for example, see refs. 19 and 20). This approach violates the often implicit assumption that the selection of the model is made before the data are seen, making inferences about the regression coefficients in the model difficult. The adaptive inference approach aims to produce estimates and confidence intervals that are more honest by accounting for the selection of the covariates from the same data. It would be useful to use the adaptive inference framework to assess the magnitude of improvement in the false-positive rates.
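One simple, if statistically inefficient, alternative to formal adaptive inference is data splitting: select covariates on one half of the data and estimate their effects on the untouched other half, so the selection step cannot bias the second-stage estimates. A minimal sketch using scikit-learn's `Lasso` for the selection step; the data-generating model, the sparsity pattern, and the tuning parameter are all illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)

# Simulated regression with many covariates, only three of which matter.
n, p_dim = 200, 50
X = rng.normal(size=(n, p_dim))
beta = np.zeros(p_dim)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + rng.normal(size=n)

# Stage 1: lasso selection on the first half of the data only.
half = n // 2
sel = Lasso(alpha=0.1).fit(X[:half], y[:half])
selected = np.flatnonzero(sel.coef_)    # indices chosen on the first half

# Stage 2: ordinary least squares on the held-out half, restricted to the
# selected columns; these estimates are unbiased given the selection.
coef, *_ = np.linalg.lstsq(X[half:][:, selected], y[half:], rcond=None)
print(selected, coef)
```

The adaptive inference methods of refs. 17 and 18 aim for the same honesty without sacrificing half the sample, by characterizing the distribution of the estimates conditional on the selection event.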

## Acknowledgments

The author’s research is supported by National Institute of Mental Health Grant R01 MH100041-01A1.

## Footnotes

Email: ssi{at}pitt.edu.

Author contributions: S.I. wrote the paper.

The author declares no conflict of interest.

See companion article on page 7900.

## References

1. Uludag K, et al.
2. Ogawa S, et al.
3. Vul E, Harris C, Winkielman P, Pashler H.
4. Ashby G.
5. Eklund A, Nichols TE, Knutsson H.
6.
7. Lin X, et al.
8. Committee on the Analysis of Massive Data, et al. (2013) *Frontiers in Massive Data Analysis* (National Academy Press, Washington, DC).
9. van der Vaart AW.
10. El Karoui N, Purdom E.
11.
12. Biswal B, et al.
13.
14.
15. Fisher RA.
16. Hastie T, Tibshirani R, Friedman JH.
17. Taylor J, et al.
18.
19.
20.
