Nucleic Acids Res. 2017 Jan 4;45(D1):D200-D203.
doi: 10.1093/nar/gkw1129. Epub 2016 Nov 29.

CDD/SPARCLE: functional classification of proteins via subfamily domain architectures

Aron Marchler-Bauer et al. Nucleic Acids Res.

Abstract

NCBI's Conserved Domain Database (CDD) aims at annotating biomolecular sequences with the location of evolutionarily conserved protein domain footprints, and functional sites inferred from such footprints. An archive of pre-computed domain annotation is maintained for proteins tracked by NCBI's Entrez database, and live search services are offered as well. CDD curation staff supplements a comprehensive collection of protein domain and protein family models, which have been imported from external providers, with representations of selected domain families that are curated in-house and organized into hierarchical classifications of functionally distinct families and sub-families. CDD also supports comparative analyses of protein families via conserved domain architectures, and a recent curation effort focuses on providing functional characterizations of distinct subfamily architectures using SPARCLE: Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

Figures

Figure 1.
CD-Search reporting pre-computed domain annotation for the protein with GenBank accession KUG45846, a hypothetical protein from Pseudomonas savastanoi pv. fraxini. The section circled in red provides the functional label that has been assigned to the subfamily domain architecture characterized by the string ‘cd00714 cd05008 cd05009’, which is shared by over 70 000 sequences in Entrez/protein.
Figure 2.
Subfamily domain architecture summary page. The summary pages include a browser that provides options for retrieving sub-sets of the sequences sharing the same subfamily architecture, such as those from particular sources, a particular organism, or those that are linked to papers in PubMed.
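
The Figure 1 caption identifies a subfamily domain architecture by a string of domain accessions, ‘cd00714 cd05008 cd05009’. The sketch below illustrates one way such a string could be built and used to group proteins that share an architecture. It is not the SPARCLE implementation: it assumes the string simply lists domain accessions in N- to C-terminal order, and the function names, protein identifiers, and start coordinates are all hypothetical.

```python
from collections import defaultdict


def architecture_key(domain_hits):
    """Build an architecture string from (start_position, accession) pairs.

    Assumption (not confirmed by the source): the string lists domain
    accessions in N- to C-terminal order, separated by single spaces.
    """
    return " ".join(accession for _, accession in sorted(domain_hits))


def group_by_architecture(proteins):
    """Map each architecture string to the proteins that share it."""
    groups = defaultdict(list)
    for protein_id, hits in proteins.items():
        groups[architecture_key(hits)].append(protein_id)
    return dict(groups)


# Hypothetical input: identifiers and coordinates are invented for illustration.
proteins = {
    "protA": [(250, "cd05008"), (2, "cd00714"), (420, "cd05009")],
    "protB": [(5, "cd00714"), (260, "cd05008"), (430, "cd05009")],
    "protC": [(10, "cd05008")],
}

groups = group_by_architecture(proteins)
for architecture, members in groups.items():
    print(architecture, "->", members)
```

Keying on the ordered accession string means two proteins fall into the same group only when they carry the same domains in the same order, which is the kind of grouping the summary page in Figure 2 lets users browse.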
