Sci Rep. 2024 Jul 30;14(1):17603.
doi: 10.1038/s41598-024-68198-w.

Identification and immune landscape of sarcopenia-related molecular clusters in inflammatory bowel disease by machine learning and integrated bioinformatics

Chongkang Yue et al. Sci Rep. 2024.

Abstract

Sarcopenia, a prevalent comorbidity of inflammatory bowel disease (IBD), is characterized by diminished skeletal muscle mass and strength. Nevertheless, the underlying interconnected mechanisms remain elusive. This study identified distinct expression patterns of sarcopenia-associated genes (SRGs) across individuals with IBD and in samples of normal tissue. By analyzing SRG expression profiles, we effectively segregated 541 IBD samples into three distinct clusters, each marked by its unique immune landscape. To unravel the transcriptional disruptions underlying these clusters, the Weighted Gene Co-expression Network Analysis (WGCNA) algorithm was employed to spotlight key genes linked to each cluster. A diagnostic model based on four key genes (TIMP1, PLAU, PHLDA1, TGFBI) was established using Random Forest and LASSO (least absolute shrinkage and selection operator) algorithms, and validated with the GSE179285 dataset. Moreover, the GSE112366 dataset facilitated the exploration of gene expression dynamics within the ileum mucosa of CD patients pre- and post-Ustekinumab treatment. Additionally, insights into the intricate relationship between immune cells and these pivotal genes were gleaned from the single-cell RNA dataset GSE162335. In conclusion, our findings collectively underscored the pivotal role of sarcopenia-related genes in the pathogenesis of IBD. Their potential as robust biomarkers for future diagnostic and therapeutic strategies is particularly promising, opening avenues for a deeper understanding and improved management of these interconnected conditions.

Keywords: Bioinformatics analysis; Inflammatory bowel disease; Machine learning; Sarcopenia; Single-cell RNA analysis.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Overview of the research procedure.
Figure 2
Identification of DESRGs in IBD. (A and B) Using p < 0.01 as the criterion, 87 DESRGs were selected. Boxplots were used to visualize the expression pattern of DESRGs in normal and IBD tissues. (C) The location of 87 DESRGs on chromosomes. (D) Correlation analysis of 87 DESRGs. Red and green colors represented positive and negative correlations, respectively. The correlation coefficients were marked with the area of the pie chart.
Figure 3
Enrichment analyses of DESRGs and assessment of immune landscape in patients with IBD. Dot plots showed the results of GO enrichment analysis (A) and KEGG pathway enrichment analysis (B) of DESRGs. (C) DO enrichment analysis of DESRGs. (D) The correlation between infiltrating immune cells and DESRGs. Red to blue indicated the change from high to low correlation. The asterisks represented the statistical p value (*P < 0.05; **P < 0.01; ***P < 0.001). (E) Boxplot showed the immune cell infiltration difference between the normal and IBD groups. Blue represented the normal group; red represented the IBD group.
Figure 4
Identification of IBD clusters mediated by DESRGs. (A) NMF clustering analysis of 541 IBD samples based on 87 DESRGs. PCA (B) and t-SNE (C) analyses for the transcriptome profiles of IBD clusters, showing marked transcriptomic differences among the clusters. (D) Heatmap showed the expression of DESRGs in diverse clusters. Red represented high expression of related genes; blue represented low expression of related genes. (E) Boxplot showed the immune cell infiltration difference among different clusters. Blue represented C1; red represented C2; yellow represented C3. The asterisks represented the statistical p value (*P < 0.05; **P < 0.01; ***P < 0.001). (F–H) GSVA enrichment among different IBD clusters. The heatmap was used to visualize these biological processes. Red represented activated pathways and blue represented inhibited pathways. (F) IBD cluster 1 versus IBD cluster 2. (G) IBD cluster 1 versus IBD cluster 3. (H) IBD cluster 2 versus IBD cluster 3.
Figure 5
Screening key gene modules by WGCNA. (A) Module–trait relationships in UC. Each row represented a module; each column represented a clinical status. (B) Scatter plot between module membership in green module and the gene significance for UC. 71 key genes were selected in brown module with GS > 0.4 and MM > 0.8 as criteria. (C) Module–trait relationships in CD. Each cell contained the corresponding correlation and p-value. (D) Scatter plot between module membership in green module and the gene significance for CD. 143 key genes were selected in green module with GS > 0.4 and MM > 0.8 as criteria. (E) Module–trait relationships in IBD clusters mediated by DESRGs. Each cell contained the corresponding correlation and p-value. (F) Scatter plot between module membership in blue module and the gene significance for C2. (G) Venn diagram was used to visualize 11 common key genes.
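The selection rule described in this caption (keep genes in a module whose gene significance and module membership both exceed fixed thresholds, then intersect the resulting lists as in the Venn diagram) can be sketched as below. This is a minimal illustration and not the authors' code: the gene names and scores are invented placeholders, the function names are hypothetical, and comparing absolute values of the two scores is an assumption rather than something the caption states.

```python
from typing import Dict, Set, Tuple

# Maps gene -> (gene significance, module membership). All values are invented.
Scores = Dict[str, Tuple[float, float]]


def select_key_genes(scores: Scores, gs_min: float = 0.4, mm_min: float = 0.8) -> Set[str]:
    """Keep genes whose |GS| and |MM| both exceed the thresholds (assumed rule)."""
    return {
        gene
        for gene, (gs, mm) in scores.items()
        if abs(gs) > gs_min and abs(mm) > mm_min
    }


def common_genes(*gene_sets: Set[str]) -> Set[str]:
    """Intersection of several key-gene sets, as a Venn diagram would show."""
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical module tables for the three traits (UC, CD, one IBD cluster).
uc_module = {"GENE_A": (0.55, 0.91), "GENE_B": (0.35, 0.95), "GENE_C": (0.62, 0.85)}
cd_module = {"GENE_A": (0.48, 0.88), "GENE_C": (0.41, 0.79), "GENE_D": (0.70, 0.90)}
cluster_module = {"GENE_A": (-0.60, 0.93), "GENE_D": (0.52, 0.84)}

uc_keys = select_key_genes(uc_module)
cd_keys = select_key_genes(cd_module)
cluster_keys = select_key_genes(cluster_module)
shared = common_genes(uc_keys, cd_keys, cluster_keys)
print(sorted(shared))
```

In this toy input only one placeholder gene passes both thresholds in all three tables; in the study the same kind of intersection yields the 11 common key genes.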
Figure 6
Screening characteristic genes by machine learning algorithms. (A) Identification of the optimal penalization coefficient lambda (λ) in the LASSO model with tenfold cross-validation. (B) LASSO coefficient profiles of 11 key genes. (C) The impact of the quantity of decision trees on the rate of inaccuracies. The horizontal axis displayed the number of decision trees, while the vertical axis denoted the inaccuracy rate. (D) The importance of 11 key genes identified by RF. (E) The Venn diagram was used to visualize the four common genes obtained through machine learning algorithms and differential analysis.
Figure 7
Construction and validation of the diagnostic model. (A) ROC curve for diagnosing IBD in the training cohort. (B) ROC curve for diagnosing IBD in the internal validation cohort. (C) ROC curve for diagnosing IBD in the external validation cohort. (D) Nomogram to predict the diagnostic value of the 4 model genes for IBD. (E) Calibration curves in the training set. The x-axis represented the predicted probability from the nomogram, and the y-axis was the actual probability of IBD. (F) DCA in the training set. The y-axis represented net benefits, calculated by subtracting the relative harm (false positives) from the benefits (true positives). The x-axis represented the threshold probability. (G) Clinical impact curves of nomogram. The y-axis represented the number of people with high risk. The x-axis represented the threshold probability. The red lines represented the number of individuals identified as high risk (IBD) by the model at the corresponding probability threshold. The blue lines represented the number of individuals who, at that same probability threshold, were classified by the model as high risk and actually experienced an outcome event (IBD).
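The net benefit plotted in panel (F) can be illustrated with the standard decision-curve definition, in which false positives are weighted by the odds of the threshold probability: NB = TP/n - (FP/n) * pt/(1 - pt). The sketch below assumes that definition; the caption does not give the exact formula or software used, and the function name and toy data are hypothetical.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit at one threshold probability (standard DCA definition, assumed here)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    n = len(y_true)
    # Samples with predicted probability at or above the threshold are called high risk.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Benefit (true positives) minus harm (false positives weighted by threshold odds).
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Invented labels (1 = IBD) and predicted probabilities, for illustration only.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1, 0.7, 0.3]
nb_half = net_benefit(labels, probs, 0.5)
print(nb_half)
```

Evaluating the function over a grid of thresholds and plotting the results against the "treat all" and "treat none" strategies gives a decision curve of the kind the caption describes.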
Figure 8
Ustekinumab may treat CD by regulating the expression of sarcopenia-related key genes. (A–D) The relative expression levels of 4 model genes in the terminal ileum mucosa of normal, control, pre-Ust (CD patients before 8 weeks of Ustekinumab induction therapy) and after-Ust (CD patients after 8 weeks of Ustekinumab induction therapy). (B) The relative expression levels of 4 model genes in the terminal ileum mucosa of normal, control, pre-Ust (CD patients before 44 weeks of Ustekinumab maintenance therapy), after-Ust (CD patients after 44 weeks of Ustekinumab maintenance therapy). The asterisks represented the statistical p value (*P < 0.05; **P < 0.01; ***P < 0.001).
Figure 9
scRNA analysis of colon lamina propria immune cells in patients with active UC. (A) The violin diagram was used to visualize the gene features, counts, and mitochondrial gene percentages of 11 inflamed UC samples. (B) The gene scatter plot with the top 10 highly variable genes. The red dots represented highly variable genes, and the black dots represented other genes. (C) Cell annotation results for different immune cell types. (D) The dot plot was used to visualize the relative expression of 4 key model genes in different immune cells.
