Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions
- PMID: 18782434
- PMCID: PMC2553421
- DOI: 10.1186/1471-2105-9-365
Abstract
Background: Epigenetics is the study of heritable changes in gene function that cannot be explained by changes in DNA sequence. One of the most commonly studied epigenetic alterations is cytosine methylation, a well-recognized mechanism of epigenetic gene silencing that often occurs at tumor suppressor gene loci in human cancer. Arrays are now used to study DNA methylation at a large number of loci; for example, the Illumina GoldenGate platform assesses DNA methylation at 1505 loci associated with over 800 cancer-related genes. Model-based cluster analysis is often used to identify DNA methylation subgroups in such data, but it is unclear how to cluster DNA methylation array data in a scalable and reliable manner.
Results: We propose a novel model-based recursive-partitioning algorithm to navigate clusters in a beta mixture model. We present simulations showing that the method is more reliable than competing nonparametric clustering approaches and at least as reliable as conventional mixture model methods. We also show that the proposed method is more computationally efficient than conventional mixture model approaches. We demonstrate the method on normal tissue samples and show that the resulting clusters are associated with tissue type as well as age.
Conclusion: Our proposed recursively partitioned mixture model is an effective and computationally efficient method for clustering DNA methylation data.
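The abstract does not spell out the algorithm, so the following is only a minimal sketch of what recursive partitioning over a beta mixture could look like, not the authors' implementation. It assumes that methylation values lie in (0, 1), that loci are independent within a class, that each node is split by a weighted two-class EM fit, and that a split is kept only when the two-class BIC is lower than the one-class BIC. The method-of-moments parameter update, the mean-based initialization, and all function names and defaults (`rpmm`, `split_node`, `EPS`, `max_depth`, `min_weight`) are illustrative assumptions.

```python
import math

EPS = 1e-3  # assumed clipping bound that keeps values strictly inside (0, 1)

def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, with x clipped away from 0 and 1."""
    x = min(max(x, EPS), 1 - EPS)
    return ((a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

def fit_beta(xs, ws):
    """Weighted method-of-moments estimate of (a, b) for one locus."""
    total = sum(ws)
    mean = sum(w * x for x, w in zip(xs, ws)) / total
    mean = min(max(mean, EPS), 1 - EPS)
    var = sum(w * (x - mean) ** 2 for x, w in zip(xs, ws)) / total
    var = min(max(var, 1e-6), 0.99 * mean * (1 - mean))  # keeps a, b > 0
    scale = mean * (1 - mean) / var - 1
    return mean * scale, (1 - mean) * scale

def fit_class(data, ws):
    """Fit one beta distribution per locus (column) with sample weights ws."""
    return [fit_beta(col, ws) for col in zip(*data)]

def sample_loglik(row, params):
    """Log-likelihood of one sample, treating loci as independent."""
    return sum(beta_logpdf(x, a, b) for x, (a, b) in zip(row, params))

def split_node(data, ws, n_iter=25):
    """Weighted EM for a 2-class beta mixture at one node.

    Returns (left weights, right weights, BIC of 1 class, BIC of 2 classes).
    """
    n_eff, n_loci = sum(ws), len(data[0])
    one = fit_class(data, ws)
    ll1 = sum(w * sample_loglik(r, one) for r, w in zip(data, ws))
    bic1 = -2 * ll1 + 2 * n_loci * math.log(n_eff)

    # Crude start: samples above the weighted average methylation lean left.
    means = [sum(r) / len(r) for r in data]
    cut = sum(w * m for w, m in zip(ws, means)) / n_eff
    resp = [0.9 if m > cut else 0.1 for m in means]
    ll2 = 0.0
    for _ in range(n_iter):
        # M-step: class proportion and per-locus beta parameters.
        wl = [w * r for w, r in zip(ws, resp)]
        wr = [w * (1 - r) for w, r in zip(ws, resp)]
        pi = min(max(sum(wl) / n_eff, 1e-6), 1 - 1e-6)
        left, right = fit_class(data, wl), fit_class(data, wr)
        # E-step: posterior probability of the left class for each sample.
        ll2 = 0.0
        for i, row in enumerate(data):
            la = math.log(pi) + sample_loglik(row, left)
            lb = math.log(1 - pi) + sample_loglik(row, right)
            top = max(la, lb)
            lse = top + math.log(math.exp(la - top) + math.exp(lb - top))
            resp[i] = min(max(math.exp(la - lse), 1e-6), 1 - 1e-6)
            ll2 += ws[i] * lse
    bic2 = -2 * ll2 + (4 * n_loci + 1) * math.log(n_eff)
    wl = [w * r for w, r in zip(ws, resp)]
    wr = [w * (1 - r) for w, r in zip(ws, resp)]
    return wl, wr, bic1, bic2

def rpmm(data, ws=None, depth=0, max_depth=4, min_weight=5.0):
    """Recursively split; returns one sample-weight vector per terminal class."""
    if ws is None:
        ws = [1.0] * len(data)
    if depth >= max_depth or sum(ws) < min_weight:
        return [ws]
    wl, wr, bic1, bic2 = split_node(data, ws)
    if bic1 <= bic2:  # assumed stopping rule: keep the node whole
        return [ws]
    return (rpmm(data, wl, depth + 1, max_depth, min_weight)
            + rpmm(data, wr, depth + 1, max_depth, min_weight))

def hard_labels(leaves):
    """Assign each sample to the terminal class where its weight is largest."""
    n = len(leaves[0])
    return [max(range(len(leaves)), key=lambda k: leaves[k][i]) for i in range(n)]
```

Calling `rpmm(data)` on a list of samples, each a list of per-locus values in (0, 1), returns one weight vector per terminal class, and `hard_labels` turns those into cluster labels. Because every node fits only a two-class model, the number of classes never has to be fixed in advance, which is one plausible reading of the efficiency claim in the Results.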
Grants and funding
- R01 CA048061/CA/NCI NIH HHS/United States
- R01 CA078609/CA/NCI NIH HHS/United States
- R01 CA100679/CA/NCI NIH HHS/United States
- R01 CA121147/CA/NCI NIH HHS/United States
- R01 CA126939/CA/NCI NIH HHS/United States
- P42 ES005947/ES/NIEHS NIH HHS/United States
- T32 ES007155/ES/NIEHS NIH HHS/United States