Genome Res. 2007 Oct;17(10):1537-45.
doi: 10.1101/gr.6202607. Epub 2007 Sep 4.

A systems biology approach for pathway level analysis

Sorin Draghici et al. Genome Res. 2007 Oct.

Abstract

A common challenge in the analysis of genomics data is trying to understand the underlying phenomenon in the context of all complex interactions taking place on various signaling pathways. A statistical approach using various models is universally used to identify the most relevant pathways in a given experiment. Here, we show that the existing pathway analysis methods fail to take into consideration important biological aspects and may provide incorrect results in certain situations. By using a systems biology approach, we developed an impact analysis that includes the classical statistics but also considers other crucial factors such as the magnitude of each gene's expression change, their type and position in the given pathways, their interactions, etc. The impact analysis is an attempt at a deeper level of statistical analysis, informed by more pathway-specific biology than the existing techniques. On several illustrative data sets, the classical analysis produces both false positives and false negatives, while the impact analysis provides biologically meaningful results. This analysis method has been implemented as a Web-based tool, Pathway-Express, freely available as part of the Onto-Tools (http://vortex.cs.wayne.edu).

Figures

Figure 1.
The complement and coagulation cascade as affected by treatment with palmitate in a hepatic cell line. There are seven differentially expressed genes (red, up-regulated; blue, down-regulated) out of 69 total genes. All classical ORA models would give any other pathway with the same proportion of genes a similar P-value, disregarding the fact that six out of these seven genes are involved in the same region of the pathway, closely interacting with each other. Both ORA and GSEA would yield exactly the same significance value to this pathway even if the diagram were to be completely redesigned by future discoveries. In contrast, the impact factor can distinguish between this pathway and any other pathway with the same proportion of differentially expressed genes, as well as take into account any future changes to the topology of the pathway.
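The point about ORA ignoring topology can be illustrated with a minimal sketch of a hypergeometric over-representation test. The 7-of-69 counts come from the caption above; the array size (10,000 genes), the total number of differentially expressed genes (300), and all gene names are hypothetical values chosen only for illustration, and this is not the code used by Pathway-Express.

```python
from math import comb


def ora_p_value(pathway_genes, de_genes, n_total):
    """Upper-tail hypergeometric P(X >= observed overlap).

    Only set sizes enter the calculation; no pathway edges are used.
    """
    n_pathway = len(pathway_genes)
    n_de = len(de_genes)
    observed = len(pathway_genes & de_genes)
    upper = min(n_pathway, n_de)
    numerator = sum(
        comb(n_pathway, k) * comb(n_total - n_pathway, n_de - k)
        for k in range(observed, upper + 1)
    )
    return numerator / comb(n_total, n_de)


N_TOTAL = 10_000  # hypothetical number of genes on the array

# Two hypothetical 69-gene pathways with 7 differentially expressed genes each.
pathway_a = {f"g{i}" for i in range(0, 69)}
pathway_b = {f"g{i}" for i in range(100, 169)}
de_in_a = {f"g{i}" for i in range(0, 7)}           # adjacent genes, one region
de_in_b = {f"g{i}" for i in range(100, 170, 10)}   # genes spread across pathway
de_elsewhere = {f"g{i}" for i in range(1000, 1286)}  # pads the DE list to 300
de_genes = de_in_a | de_in_b | de_elsewhere

p_a = ora_p_value(pathway_a, de_genes, N_TOTAL)
p_b = ora_p_value(pathway_b, de_genes, N_TOTAL)
print(p_a == p_b)  # the test cannot tell the two pathways apart
```

Because the function receives only gene sets, any rearrangement of the interactions inside a pathway leaves its P-value unchanged, which is the limitation the caption describes.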
Figure 2.
A comparison between the results of the classical probabilistic approaches (A, hypergeometric; B, GSEA) and the results of the pathway impact analysis (C) for a set of genes differentially expressed in lung adenocarcinoma. The pathways marked with green are considered most likely to be linked to this condition in this experiment. The ones in red are unlikely to be related. The ranking of the pathways produced by the classical approaches is very misleading. According to the hypergeometric model, the most significant pathways in this condition are: prion disease, focal adhesion, and Parkinson’s disease. Two out of these three are likely to be incorrect. GSEA yields cell cycle as the most enriched pathway in cancer, but three out of the four subsequent pathways are clearly incorrect. In contrast, all three top pathways identified by the impact analysis are relevant to the given condition. The impact analysis is also superior from a statistical perspective. According to both hypergeometric and GSEA, no pathway is significant at the usual 1% or 5% levels on corrected P-values. In contrast, according to the impact analysis, the cell cycle is significant at 1%, and focal adhesion and Wnt signaling are significant at 5% and 10%, respectively.
Figure 3.
The focal adhesion pathway as impacted in lung adenocarcinoma vs. normal. In this condition, both ITG and RTK receptors are perturbed, as well as the VEGF ligand. Because these three genes appear at the very beginning and affect both entry points controlling this pathway, their perturbations are widely propagated throughout the pathway and this pathway appears as highly impacted. All classical approaches completely ignore the positions of the genes on the given pathways and fail to identify this pathway as significant.
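Why position matters can be shown with a toy propagation sketch. The abstract does not state the impact-analysis formula, so the rule below is an assumption made for illustration: each gene's perturbation is its own expression change plus the perturbation of each upstream gene divided by that gene's number of downstream targets. The four-gene pathway and its gene names are hypothetical.

```python
def propagate(edges, fold_change):
    """Accumulate perturbations along an acyclic pathway (toy rule)."""
    genes = set(fold_change)
    for src, dst in edges:
        genes.update((src, dst))
    upstream = {g: [] for g in genes}
    downstream = {g: [] for g in genes}
    for src, dst in edges:
        downstream[src].append(dst)
        upstream[dst].append(src)

    cache = {}

    def perturbation(gene):
        # Own change plus the share inherited from each upstream gene.
        if gene not in cache:
            inherited = sum(
                perturbation(u) / len(downstream[u]) for u in upstream[gene]
            )
            cache[gene] = fold_change.get(gene, 0.0) + inherited
        return cache[gene]

    return {g: perturbation(g) for g in genes}


# Hypothetical pathway: one entry point feeding two kinases and a shared target.
edges = [
    ("receptor", "kinase_a"),
    ("receptor", "kinase_b"),
    ("kinase_a", "tf"),
    ("kinase_b", "tf"),
]

entry_hit = propagate(edges, {"receptor": 2.0})  # change at the entry point
leaf_hit = propagate(edges, {"tf": 2.0})         # same change at the last gene

total_entry = sum(abs(v) for v in entry_hit.values())
total_leaf = sum(abs(v) for v in leaf_hit.values())
print(total_entry, total_leaf)
```

Under this toy rule the same expression change yields a larger accumulated perturbation when it occurs at the entry point than at the final target, mirroring the caption's argument that perturbed receptors propagate widely through the pathway.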
Figure 4.
A comparison between the results of the classical (ORA) probabilistic approach (A), GSEA (B), and the results of the pathway impact analysis (C) for a set of genes associated with poor prognosis in breast cancer. The pathways marked with green are well supported by the existing literature. The ones in red are unlikely to be related. After correcting for multiple comparisons, GSEA fails to identify any pathway as significantly impacted in this condition at any of the usual significance levels (1%, 5%, or 10%). The hypergeometric model pinpoints cell cycle as the only significant pathway. Relevant pathways such as focal adhesion, TGF-beta signaling, and MAPK do not appear as significant from a hypergeometric point of view. While agreeing on the cell cycle, the impact analysis also identifies the three other relevant pathways as significant at the 5% level.
Figure 5.
A comparison between the results of the classical probabilistic approach (A) and the results of the impact analysis (B) for a set of genes found to be differentially expressed in a hepatic cell line treated with palmitate. Green pathways are well supported by literature evidence, while red pathways are unlikely to be relevant. The classical statistical analysis yields three pathways significant at the 5% level: complement and coagulation cascades, focal adhesion, and MAPK. The impact analysis agrees on these three pathways but also identifies several additional pathways. Among these, tight junction is well supported by the literature.
