Systematic evaluation and validation of reference and library selection methods for deconvolution of cord blood DNA methylation data
- PMID: 31455416
- PMCID: PMC6712867
- DOI: 10.1186/s13148-019-0717-y
Abstract
Background: Umbilical cord blood (UCB) is commonly used in epigenome-wide association studies of prenatal exposures. Accounting for cell type composition is critical in such studies because it reduces confounding due to the cell specificity of DNA methylation (DNAm). In the absence of cell sorting information, statistical methods can be applied to deconvolve heterogeneous cell mixtures. Among these methods, reference-based approaches leverage age-appropriate cell-specific DNAm profiles to estimate cellular composition. In UCB, four reference datasets comprising DNAm signatures profiled in purified cell populations have been published using the Illumina 450K and EPIC arrays. These datasets are biologically and technically different, and there is currently no consensus on how best to apply them. Here, we systematically evaluate and compare these datasets and provide recommendations for reference-based UCB deconvolution.
Results: We first evaluated the four reference datasets to ascertain both the purity of the samples and the potential cell cross-contamination. We filtered samples and combined datasets to obtain a joint UCB reference. We selected deconvolution libraries using two different approaches: automatic selection using the top differentially methylated probes from the function pickCompProbes in minfi, and a standardized library selected using the IDOL (Identifying Optimal Libraries) iterative algorithm. We compared the performance of each reference separately and in combination, using the two approaches for reference library selection, and validated the results in an independent cohort (Generation R Study, n = 191) with matched cell counts measured by fluorescence-activated cell sorting (FACS). Strict filtering and combination of the references significantly improved the accuracy and efficiency of cell type estimates. Ultimately, the IDOL library outperformed the library from the automatic selection method implemented in pickCompProbes.
Conclusion: These results have important implications for epigenetic studies in UCB as implementing this method will optimally reduce confounding due to cellular heterogeneity. This work provides guidelines for future reference-based UCB deconvolution and establishes a framework for combining reference datasets in other tissues.
Keywords: Cell type heterogeneity; DNAm; Deconvolution; IDOL; Reference dataset; Umbilical cord blood; minfi; pickCompProbes.
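The reference-based deconvolution described above can be illustrated with a minimal sketch. It assumes a small reference matrix of mean DNAm beta values (probes by cell types) and estimates cell proportions for a mixed sample by non-negative least squares, solved here with a simple projected gradient descent. The cell type labels, probe values, and function names are hypothetical and chosen for illustration only; this is not the minfi or IDOL implementation, and no real reference data are used.

```python
# Toy reference library: rows are CpG probes, columns are cell types.
# All beta values below are made up for illustration (assumption).
CELL_TYPES = ["CD4T", "Bcell", "Gran"]
REFERENCE = [
    [0.90, 0.10, 0.15],
    [0.10, 0.85, 0.20],
    [0.20, 0.15, 0.95],
    [0.80, 0.80, 0.10],
    [0.15, 0.90, 0.85],
]


def mix(reference, proportions):
    """Simulate a noise-free mixed sample as a weighted sum of cell type profiles."""
    return [sum(r * p for r, p in zip(row, proportions)) for row in reference]


def estimate_proportions(reference, mixture, steps=2000, lr=0.05):
    """Estimate cell type proportions by non-negative least squares.

    Minimises 0.5 * ||reference @ w - mixture||^2 subject to w >= 0 using
    projected gradient descent. The step size suits this toy matrix only.
    """
    n_types = len(reference[0])
    w = [1.0 / n_types] * n_types  # start from equal proportions
    for _ in range(steps):
        # Residual between the predicted and observed methylation per probe.
        residual = [
            sum(r * x for r, x in zip(row, w)) - m
            for row, m in zip(reference, mixture)
        ]
        # Gradient of the squared error with respect to each proportion.
        grad = [
            sum(row[k] * res for row, res in zip(reference, residual))
            for k in range(n_types)
        ]
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, x - lr * g) for x, g in zip(w, grad)]
    return w


if __name__ == "__main__":
    true_proportions = [0.5, 0.2, 0.3]
    sample = mix(REFERENCE, true_proportions)
    estimated = estimate_proportions(REFERENCE, sample)
    for name, value in zip(CELL_TYPES, estimated):
        print(f"{name}: {value:.3f}")
```

In this sketch the choice of probes is fixed by hand. In practice the rows of the reference matrix would be the selected library, for example the probes returned by pickCompProbes or by IDOL, and the estimates would be compared against measured cell counts such as those from FACS.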
Conflict of interest statement
KTK and JKW are founders of Celintec, which provided no funding and had no role in this work. The other authors declare that they have no competing interests.