  • Analysis

Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences

Abstract

Profiling phylogenetic marker genes, such as the 16S rRNA gene, is a key tool for studies of microbial communities but does not provide direct evidence of a community's functional capabilities. Here we describe PICRUSt (phylogenetic investigation of communities by reconstruction of unobserved states), a computational approach to predict the functional composition of a metagenome using marker gene data and a database of reference genomes. PICRUSt uses an extended ancestral-state reconstruction algorithm to predict which gene families are present and then combines gene families to estimate the composite metagenome. Using 16S information, PICRUSt recaptures key findings from the Human Microbiome Project and accurately predicts the abundance of gene families in host-associated and environmental communities, with quantifiable uncertainty. Our results demonstrate that phylogeny and function are sufficiently linked that this 'predictive metagenomic' approach should provide useful insights into the thousands of uncultivated microbial communities for which only marker gene surveys are currently available.
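The final step described above, combining per-organism gene family predictions into a composite metagenome, can be illustrated with a minimal sketch. It is not the PICRUSt implementation: the function name, the dictionary layout, the example OTU and gene family identifiers, and all numbers are hypothetical. Dividing marker gene counts by a 16S copy number is an assumption added here for illustration and is not stated in the abstract. The gene content table is taken as given, standing in for the output of the ancestral-state reconstruction step.

```python
def predict_metagenome(otu_counts, gene_content, copy_number=None):
    """Estimate gene family abundances for one sample.

    otu_counts:   {otu_id: 16S read count} for the sample (hypothetical layout)
    gene_content: {otu_id: {gene_family: predicted copies per genome}}
    copy_number:  optional {otu_id: predicted 16S copies per genome};
                  an assumption of this sketch, defaults to 1 copy per genome
    """
    metagenome = {}
    for otu, count in otu_counts.items():
        if otu not in gene_content:
            # No gene content prediction for this OTU: skip it.
            continue
        copies_16s = copy_number.get(otu, 1.0) if copy_number else 1.0
        # Convert marker gene counts into an estimate of organism abundance.
        organisms = count / copies_16s
        for family, copies in gene_content[otu].items():
            # Each organism contributes its predicted copies of the family.
            metagenome[family] = metagenome.get(family, 0.0) + organisms * copies
    return metagenome


# Hypothetical example values, chosen only to make the arithmetic easy to follow.
otu_counts = {"otu_a": 10, "otu_b": 4}
gene_content = {
    "otu_a": {"K00001": 1, "K00002": 2},
    "otu_b": {"K00002": 1.5},
}
copy_number = {"otu_a": 2, "otu_b": 4}

result = predict_metagenome(otu_counts, gene_content, copy_number)
print(result)  # {'K00001': 5.0, 'K00002': 11.5}
```

In this toy sample, otu_a contributes 10 / 2 = 5 organisms and otu_b contributes 4 / 4 = 1, so the second gene family totals 5 × 2 + 1 × 1.5 = 11.5. Fractional predicted copies are allowed because a reconstruction step may return non-integer estimates.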

Figure 1: The PICRUSt workflow.
Figure 2: PICRUSt recapitulates biological findings from the Human Microbiome Project.
Figure 3: PICRUSt accuracy across various environmental microbiomes.
Figure 4: Accuracy of PICRUSt prediction compared with shotgun metagenomic sequencing at shallow sequencing depths.
Figure 5: PICRUSt prediction accuracy across the tree of bacterial and archaeal genomes.
Figure 6: Variation in inference accuracy across functional modules within single genomes.

Acknowledgements

We would like to thank A. Robbins-Pianka and N. Segata, along with all members of the Knight, Beiko, Vega Thurber, Caporaso and Huttenhower laboratories, for their assistance during PICRUSt conception and development. This work was supported in part by the Canadian Institutes of Health Research (M.G.I.L., R.G.B.), the Canada Research Chairs program (R.G.B.), US National Science Foundation (NSF) OCE #1130786 (R.V.T., D.B.), the Howard Hughes Medical Institute (R.K.), US National Institutes of Health (NIH) P01DK078669, U01HG004866, R01HG004872 (R.K.), the Crohn's and Colitis Foundation of America (R.K.), the Sloan Foundation (R.K.), NIH 1R01HG005969 (C.H.), NSF CAREER DBI-1053486 (C.H.) and ARO W911NF-11-1-0473 (C.H.).

Author information

Contributions

The teams of M.G.I.L. and R.G.B.; J.A.R. and C.H.; and J.Z., D.K. and R.K. each conceived versions of the gene content prediction algorithm and implemented prototype software. J.Z., M.G.I.L., J.G.C., D.M., D.K., J.C.C., R.K., R.G.B. and C.H. designed the final PICRUSt algorithm and software. J.Z., M.G.I.L., J.G.C. and D.M. wrote the PICRUSt software package. M.G.I.L., J.G.C., D.M. and J.C.C. generated precalculated PICRUSt gene content predictions. D.M. and J.G.C. added functionality to the BIOM software package and the Greengenes resource in support of PICRUSt. M.G.I.L., J.Z., J.G.C., D.M., D.K., J.C.C., J.A.R., R.K., R.G.B. and C.H. applied PICRUSt to control datasets and analyzed the benchmarking data. M.G.I.L., J.Z., J.G.C., D.M., R.K., R.G.B. and C.H. wrote the manuscript. D.E.B. and R.L.V.T. collected and analyzed coral-algal data. All authors edited the manuscript.

Corresponding author

Correspondence to Curtis Huttenhower.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Results and Supplementary Figures 1–17 (PDF 2107 kb)

Supplementary Data (ZIP 95469 kb)

About this article

Cite this article

Langille, M., Zaneveld, J., Caporaso, J. et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol 31, 814–821 (2013). https://doi.org/10.1038/nbt.2676

  • DOI: https://doi.org/10.1038/nbt.2676
