TY - JOUR
T1 - Delineating slowly and rapidly evolving fractions of the Drosophila genome
AU - Keith, Jonathan
AU - Adams, Peter
AU - Stephen, Stuart
AU - Mattick, John
PY - 2008
Y1 - 2008
N2 - Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted ~62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including ~56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification.
AB - Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted ~62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including ~56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification.
UR - http://www.liebertonline.com/doi/abs/10.1089/cmb.2007.0173
U2 - 10.1089/cmb.2007.0173
DO - 10.1089/cmb.2007.0173
M3 - Article
SN - 1066-5277
VL - 15
SP - 407
EP - 430
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 4
ER -