Circular analysis in systems neuroscience: the dangers of double dipping

Review

Nat Neurosci. 2009 May;12(5):535-40.

Nikolaus Kriegeskorte¹, W Kyle Simmons, Patrick S F Bellgowan, Chris I Baker

Affiliation

¹ Laboratory of Brain and Cognition, US National Institute of Mental Health, Bethesda, Maryland, USA.

PMID: 19396166
PMCID: PMC2841687
DOI: 10.1038/nn.2303

Abstract

A neuroscientific experiment typically generates a large amount of data, of which only a small fraction is analyzed in detail and presented in a publication. However, selection among noisy measurements can render circular an otherwise appropriate analysis and invalidate results. Here we argue that systems neuroscience needs to adjust some widespread practices to avoid the circularity that can arise from selection. In particular, 'double dipping', the use of the same dataset for selection and selective analysis, will give distorted descriptive statistics and invalid statistical inference whenever the results statistics are not inherently independent of the selection criteria under the null hypothesis. To demonstrate the problem, we apply widely used analyses to noise data known to not contain the experimental effects in question. Spurious effects can appear in the context of both univariate activation analysis and multivariate pattern-information analysis. We suggest a policy for avoiding circularity.


Figures

**Fig. 1. Intuitive diagrams for understanding circular analysis**
(a) The top row serves to remind us that our results reflect our data indirectly: through the lens of an often complicated analysis, whose assumptions are not always fully explicit. The bottom row illustrates how the assumptions (and hypotheses) can interact with the data to shape the results. Ideally (bottom left), the results reflect some aspect of the data (blue) without distortion (although the assumptions will determine what aspect of the data is reflected in the results). But sometimes (bottom center) a close inspection of the analysis reveals that the data get lost in the process and the assumptions (red) predetermine the results. In that case the analysis is completely circular (red dotted line). More frequently in practice (bottom right), the assumptions tinge the results (magenta). The results are then distorted by circularity, but still reflect the data to some degree (magenta dotted lines). (b) Three diagrams illustrate the three most common causes of circularity: selection (left), weighting (center), and sorting (right). Selection, weighting, and sorting criteria reflect assumptions and hypotheses (red). Each of the three can tinge the results, distorting the estimates presented and invalidating statistical tests, if the results statistics are not independent of the criteria for selection, weighting, or sorting.
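
The sorting case in panel (b) can be illustrated with a minimal sketch. It is not taken from the paper: the function name `sorting_demo`, the number of units, and the median split are assumptions chosen for illustration. Units that contain only Gaussian noise are sorted by one measurement, and the high and low groups are then compared either on that same measurement or on an independent one.

```python
import random


def sorting_demo(n_units=200, seed=0):
    """Sort pure-noise units by one measurement, then compare the groups."""
    rng = random.Random(seed)
    # Two independent noise measurements per unit; there is no true effect.
    data_1 = [rng.gauss(0.0, 1.0) for _ in range(n_units)]
    data_2 = [rng.gauss(0.0, 1.0) for _ in range(n_units)]

    # Sorting criterion: rank units by their response in data_1.
    order = sorted(range(n_units), key=lambda u: data_1[u])
    low, high = order[: n_units // 2], order[n_units // 2:]

    def group_difference(values):
        mean_high = sum(values[u] for u in high) / len(high)
        mean_low = sum(values[u] for u in low) / len(low)
        return mean_high - mean_low

    circular = group_difference(data_1)     # same data used for sorting
    independent = group_difference(data_2)  # data not used for sorting
    return circular, independent


if __name__ == "__main__":
    circular, independent = sorting_demo()
    print(f"high - low, same data:        {circular:.2f}")
    print(f"high - low, independent data: {independent:.2f}")
```

In this sketch the group difference measured on the sorting data is large although the units contain only noise, whereas the difference measured on the independent data should stay close to zero.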

**Fig. 2. Example 1: Data selection can bias pattern-information analysis**
(a) In order to assess to what extent human inferior-temporal activity patterns reflect bottom-up sensory signals and top-down task constraints, we measured activity patterns with fMRI while subjects viewed object images of different categories and judged either whether the object shown was “animate” (task 1) or whether it was “pleasant” (task 2). (b) We selected all inferior-temporal voxels for which any two-sided t test contrasting two conditions was significant at p<0.001 (uncorrected for multiple tests). We then cleanly divided the data by using odd runs for training and even runs for testing. We used a linear classifier to determine whether the activity pattern would allow us to decode the stimulus category (light gray bars) and the judgment task (dark gray bars). Results (top left) suggested that both stimulus and task can be decoded with high accuracy, significantly above chance. However, application of the same analysis to Gaussian random data (top right), also suggested high decoding accuracies significantly above chance. This shows that spurious effects can appear when data from the test set is used in the initial data-selection process. Such spurious effects can be avoided by performing selection using data independent of the test data (bottom row). Error bars indicate +/−1 across-subject standard error of the mean. For details on experiment and analysis, see *Example 1: Pattern-information analysis*.
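
The effect described in this caption can be reproduced in spirit with a small simulation on Gaussian noise. The sketch below is a simplification under stated assumptions, not the authors' pipeline: it uses two conditions only, selects the voxels with the largest absolute mean difference instead of thresholding t tests at p<0.001, and uses a nearest-centroid classifier as a stand-in for the linear classifier. All function names and parameter values are hypothetical.

```python
import random


def voxel_means(trials):
    """Per-voxel mean over trials (each trial is a list of voxel values)."""
    return [sum(column) / len(trials) for column in zip(*trials)]


def make_noise(rng, n_voxels, n_trials):
    """Gaussian noise for two conditions in a training and a test half."""
    return {
        half: {
            cond: [[rng.gauss(0.0, 1.0) for _ in range(n_voxels)]
                   for _ in range(n_trials)]
            for cond in ("A", "B")
        }
        for half in ("train", "test")
    }


def decode(data, selection_halves, n_select=20):
    """Select voxels, train on 'train', and return accuracy on 'test'."""
    # Voxel selection: largest absolute A-B difference in the chosen halves.
    pooled = {cond: [t for h in selection_halves for t in data[h][cond]]
              for cond in ("A", "B")}
    mean_a, mean_b = voxel_means(pooled["A"]), voxel_means(pooled["B"])
    contrast = [abs(a - b) for a, b in zip(mean_a, mean_b)]
    selected = sorted(range(len(contrast)), key=contrast.__getitem__,
                      reverse=True)[:n_select]

    # Nearest-centroid (linear) classifier fitted on the training half only.
    train_a = voxel_means(data["train"]["A"])
    train_b = voxel_means(data["train"]["B"])
    weights = [train_a[v] - train_b[v] for v in selected]
    midpoint = [(train_a[v] + train_b[v]) / 2 for v in selected]

    correct = total = 0
    for cond in ("A", "B"):
        for trial in data["test"][cond]:
            score = sum(w * (trial[v] - m)
                        for w, v, m in zip(weights, selected, midpoint))
            correct += (score > 0) == (cond == "A")
            total += 1
    return correct / total


def run_decoding_demo(n_subjects=12, n_voxels=1000, n_trials=10, seed=0):
    """Mean test accuracy with circular and with independent voxel selection."""
    rng = random.Random(seed)
    circular, independent = [], []
    for _ in range(n_subjects):
        data = make_noise(rng, n_voxels, n_trials)
        circular.append(decode(data, ("train", "test")))  # test data leaks in
        independent.append(decode(data, ("train",)))      # training data only
    return sum(circular) / n_subjects, sum(independent) / n_subjects


if __name__ == "__main__":
    circular, independent = run_decoding_demo()
    print(f"selection on all data:      accuracy {circular:.2f}")
    print(f"selection on training data: accuracy {independent:.2f}")
```

Although training and testing are cleanly separated in both variants, only the variant that also keeps the test half out of voxel selection should stay near the chance level of 0.5.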

**Fig. 3. Example 2: ROI definition can bias activation analysis**
A simulated fMRI block-design experiment demonstrates that nonindependent ROI definition can distort effects and produce spuriously significant results, even when the ROI is defined by rigorous mapping procedures (accounting for multiple tests) and highlights a truly activated region. Error bars indicate +/− 1 standard error of the mean. (a) The layout of this panel matches the intuitive diagrams of Fig. 1a: The data in Fig. 1a correspond to the true effects (left); the assumptions to the contrast hypothesis (top), and the results to ROI-average activation analyses (right). A 100-voxel region (blue contour in central slice map) was simulated to be active during conditions A and B, but not during conditions C and D (left). The t map for contrast A-D is shown for the central slice through the region (center). When thresholded at p<0.05 (corrected for multiple tests by a cluster threshold criterion), a cluster appears (magenta contour), which highlights the true activated region (blue contour). The ROI is somewhat affected by the noise in the data (difference between blue and magenta contours). The noise pushes some truly activated voxels below the threshold and lifts some nonactivated voxels above the threshold (white arrows). This can be interpreted as overfitting. The bar graph for the overfitted ROI (bottom right, same data as used for mapping) reflects the activation of the region during conditions A and B as well as the absence of activation during conditions C and D. However, in comparison to the true effects (left) it is substantially distorted by the selection contrast A-D (top). In particular, the contrast A-B (simulated to be zero) exhibits spurious significance (p<0.01). When we use independent data to define the ROI (green contour), no such distortion is observed (top right). For details on the simulation and analysis, see *Example 2: Regional activation analysis* in the text. 
(b) The simulation illustrates how data selection blends truth (left) and hypothesis (right) by distorting results (top) so as to better conform to the selection criterion.
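
A stripped-down version of this simulation is sketched below. It keeps the 100-voxel region that is active in conditions A and B only, but everything else is an assumption made for illustration: voxels have no spatial layout, the ROI is defined by a plain threshold on the A-D contrast instead of a cluster-corrected t map, and the effect size, noise level, and threshold are arbitrary.

```python
import random

# True responses of active voxels; inactive voxels respond with 0 throughout.
TRUE_EFFECT = {"A": 2.0, "B": 2.0, "C": 0.0, "D": 0.0}


def simulate_estimates(rng, n_active, n_inactive, noise_sd):
    """Noisy per-voxel response estimates for conditions A to D."""
    voxels = []
    for index in range(n_active + n_inactive):
        active = index < n_active
        voxels.append({
            cond: (effect if active else 0.0) + rng.gauss(0.0, noise_sd)
            for cond, effect in TRUE_EFFECT.items()
        })
    return voxels


def define_roi(voxels, threshold):
    """Indices of voxels whose A-D contrast exceeds the threshold."""
    return [v for v, est in enumerate(voxels) if est["A"] - est["D"] > threshold]


def roi_average(voxels, roi):
    """ROI-average response per condition."""
    if not roi:
        raise ValueError("empty ROI: lower the threshold or add voxels")
    return {cond: sum(voxels[v][cond] for v in roi) / len(roi)
            for cond in TRUE_EFFECT}


def run_roi_demo(n_sims=20, n_active=100, n_inactive=400,
                 noise_sd=1.0, threshold=2.0, seed=0):
    """Mean ROI contrast A-B (true value 0) for circular and independent ROIs."""
    rng = random.Random(seed)
    circular, independent = [], []
    for _ in range(n_sims):
        # Two independent datasets for the same voxels, e.g. two sets of runs.
        analysis_data = simulate_estimates(rng, n_active, n_inactive, noise_sd)
        localizer_data = simulate_estimates(rng, n_active, n_inactive, noise_sd)

        same = roi_average(analysis_data, define_roi(analysis_data, threshold))
        held_out = roi_average(analysis_data, define_roi(localizer_data, threshold))
        circular.append(same["A"] - same["B"])
        independent.append(held_out["A"] - held_out["B"])
    return sum(circular) / n_sims, sum(independent) / n_sims


if __name__ == "__main__":
    circular, independent = run_roi_demo()
    print(f"A-B in ROI defined on the same data:   {circular:.2f}")
    print(f"A-B in ROI defined on independent data: {independent:.2f}")
```

Because the ROI is defined by the contrast A-D, the noise in condition A is positively biased among the selected voxels while the noise in condition B is not, so the nonindependent ROI shows a spurious A-B difference even though A and B are simulated to be equal.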

**Fig. 4. A policy for noncircular analysis**
This flow diagram suggests a procedure for choosing an appropriate analysis that avoids the pitfalls of circularity. Considering the most common errors (bottom left, red letter references) can help recognize circularity in assessing a given analysis. The gist of the policy is as follows: We first consider performing a nonselective analysis only. If selective analysis is needed and we can demonstrate that the results are independent of the selection criterion under the null hypothesis, then all data are used for selective analysis. If we cannot demonstrate this, then a split-data analysis can serve to ensure independence. (For details, see Supplementary Information, *A policy for noncircular analysis*.)
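
The gist of the policy can be written down as a small decision function. This is only a sketch of the three branches summarized in the caption; the function names, the return strings, and the odd/even split of runs are hypothetical and do not cover the details given in the Supplementary Information.

```python
def choose_analysis(selection_needed, independent_under_null):
    """Return the analysis type suggested by the gist of the policy."""
    if not selection_needed:
        return "nonselective analysis"
    if independent_under_null:
        # Results are demonstrably independent of the selection criterion
        # under the null hypothesis, so all data can be used.
        return "selective analysis on all data"
    return "split-data analysis"


def split_runs(run_ids):
    """One possible split: odd runs for selection, even runs for analysis."""
    selection_runs = [r for r in run_ids if r % 2 == 1]
    analysis_runs = [r for r in run_ids if r % 2 == 0]
    return selection_runs, analysis_runs


if __name__ == "__main__":
    print(choose_analysis(selection_needed=True, independent_under_null=False))
    print(split_runs([1, 2, 3, 4, 5, 6]))
```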


Comment in

Double-dipping revisited.
Button KS. Nat Neurosci. 2019 May;22(5):688-690. doi: 10.1038/s41593-019-0398-z. PMID: 31011228. No abstract available.

Cited by

Neural mechanisms of goal-directed behavior: outcome-based response selection is associated with increased functional coupling of the angular gyrus.
Zwosta K, Ruge H, Wolfensteller U. Front Hum Neurosci. 2015 Apr 10;9:180. doi: 10.3389/fnhum.2015.00180. eCollection 2015. PMID: 25914635. Free PMC article.
Arithmetic skills are associated with left fronto-temporal gray matter volume in 536 children and adolescents.
Viesel-Nordmeyer N, Prado J. NPJ Sci Learn. 2023 Dec 8;8(1):56. doi: 10.1038/s41539-023-00201-x. PMID: 38065992. Free PMC article.
Statistical inferences under the Null hypothesis: common mistakes and pitfalls in neuroimaging studies.
Hupé JM. Front Neurosci. 2015 Feb 19;9:18. doi: 10.3389/fnins.2015.00018. eCollection 2015. PMID: 25745383. Free PMC article.
Learning Cognitive Flexibility: Neural Substrates of Adapting Switch-Readiness to Time-varying Demands.
Sali AW, Bejjani C, Egner T. J Cogn Neurosci. 2024 Feb 1;36(2):377-393. doi: 10.1162/jocn_a_02091. PMID: 38010299. Free PMC article.
The Neural Substrates Underlying the Implementation of Phonological Rule in Lexical Tone Production: An fMRI Study of the Tone 3 Sandhi Phenomenon in Mandarin Chinese.
Chang CH, Kuo WJ. PLoS One. 2016 Jul 25;11(7):e0159835. doi: 10.1371/journal.pone.0159835. eCollection 2016. PMID: 27455078. Free PMC article.




Grants and funding

ZIA MH002909/ImNIH/Intramural NIH HHS/United States


