TY - JOUR
T1 - Community-wide analysis of microbial genome sequence signatures
AU - Dick, Gregory J.
AU - Andersson, Anders F.
AU - Baker, Brett J.
AU - Simmons, Sheri L.
AU - Thomas, Brian C.
AU - Yelton, A. Pepper
AU - Banfield, Jillian F.
N1 - Funding Information:
We thank Ms D Aliaga Goltsman, Dr V Denef, Ms C Sun, Dr R Hettich, Dr N VerBerkmoes, and Mr M Shah for data and bioinformatic assistance. We are grateful to Mrs M Kelly for guiding kernel density analysis, to Mr Rudy Carver for sampling assistance and Mr TW Arman, President, Iron Mountain Mines, and Dr R Sugarek for site access. The manuscript was significantly improved thanks to critical revisions from Mr D Soergel and Dr S Brenner and three anonymous reviewers. This work was supported by DOE Genomics:GTL project Grant No. DE-FG02-05ER64134 (Office of Science) and sequencing was done at the DOE Joint Genome Institute. AFA was supported by grants from the Swedish Research Council and Carl Tryggers Foundation.
PY - 2009/8/21
Y1 - 2009/8/21
N2 - Background: Analyses of DNA sequences from cultivated microorganisms have revealed genome-wide, taxa-specific nucleotide compositional characteristics, referred to as genome signatures. These signatures have far-reaching implications for understanding genome evolution and potential application in classification of metagenomic sequence fragments. However, little is known regarding the distribution of genome signatures in natural microbial communities or the extent to which environmental factors shape them. Results: We analyzed metagenomic sequence data from two acidophilic biofilm communities, including composite genomes reconstructed for nine archaea, three bacteria, and numerous associated viruses, as well as thousands of unassigned fragments from strain variants and low-abundance organisms. Genome signatures, in the form of tetranucleotide frequencies analyzed by emergent self-organizing maps, segregated sequences from all known populations sharing < 50 to 60% average amino acid identity and revealed previously unknown genomic clusters corresponding to low-abundance organisms and a putative plasmid. Signatures were pervasive genome-wide. Clusters were resolved because intra-genome differences resulting from translational selection or protein adaptation to the intracellular (pH ~5) versus extracellular (pH ~1) environment were small relative to inter-genome differences. We found that these genome signatures stem from multiple influences but are primarily manifested through codon composition, which we propose is the result of genome-specific mutational biases. Conclusions: An important conclusion is that shared environmental pressures and interactions among coevolving organisms do not obscure genome signatures in acid mine drainage communities. Thus, genome signatures can be used to assign sequence fragments to populations, an essential prerequisite if metagenomics is to provide ecological and biochemical insights into the functioning of microbial communities.
AB - Background: Analyses of DNA sequences from cultivated microorganisms have revealed genome-wide, taxa-specific nucleotide compositional characteristics, referred to as genome signatures. These signatures have far-reaching implications for understanding genome evolution and potential application in classification of metagenomic sequence fragments. However, little is known regarding the distribution of genome signatures in natural microbial communities or the extent to which environmental factors shape them. Results: We analyzed metagenomic sequence data from two acidophilic biofilm communities, including composite genomes reconstructed for nine archaea, three bacteria, and numerous associated viruses, as well as thousands of unassigned fragments from strain variants and low-abundance organisms. Genome signatures, in the form of tetranucleotide frequencies analyzed by emergent self-organizing maps, segregated sequences from all known populations sharing < 50 to 60% average amino acid identity and revealed previously unknown genomic clusters corresponding to low-abundance organisms and a putative plasmid. Signatures were pervasive genome-wide. Clusters were resolved because intra-genome differences resulting from translational selection or protein adaptation to the intracellular (pH ~5) versus extracellular (pH ~1) environment were small relative to inter-genome differences. We found that these genome signatures stem from multiple influences but are primarily manifested through codon composition, which we propose is the result of genome-specific mutational biases. Conclusions: An important conclusion is that shared environmental pressures and interactions among coevolving organisms do not obscure genome signatures in acid mine drainage communities. Thus, genome signatures can be used to assign sequence fragments to populations, an essential prerequisite if metagenomics is to provide ecological and biochemical insights into the functioning of microbial communities.
UR - http://www.scopus.com/inward/record.url?scp=70350015324&partnerID=8YFLogxK
U2 - 10.1186/gb-2009-10-8-r85
DO - 10.1186/gb-2009-10-8-r85
M3 - Article
C2 - 19698104
AN - SCOPUS:70350015324
SN - 1474-7596
VL - 10
JO - Genome Biology
JF - Genome Biology
IS - 8
M1 - R85
ER -