Sequence segmentation

Research output: Chapter in Book/Report/Conference proceeding, Chapter (Book), Research, peer-review

6 Citations (Scopus)

Abstract

Whole-genome comparisons among mammalian and other eukaryotic organisms have revealed that they contain large quantities of conserved non-protein-coding sequence. Although some of the functions of this non-coding DNA have been identified, there remains a large quantity of conserved genomic sequence that is of no known function. Moreover, the task of delineating the conserved sequences is non-trivial, particularly when some sequences are conserved in only a small number of lineages. Sequence segmentation is a statistical technique for identifying putative functional elements in genomes based on atypical sequence characteristics, such as conservation levels relative to other genomes, GC content, SNP frequency, and potentially many others. The publicly available program changept and associated programs use Bayesian multiple change-point analysis to delineate classes of genomic segments with similar characteristics, potentially representing new classes of non-coding RNAs (contact web site: http://silmaril.math.sci.qut.edu.au/~keith/ ).
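As a rough illustration of the change-point idea behind segmentation, the sketch below computes the exact posterior over the position of a single change-point in a binary sequence (1 = alignment match, 0 = mismatch). It is not the changept program and does not reproduce its algorithm: the restriction to one change-point, the Beta(1, 1) prior, the uniform prior over positions, the toy data, and all function names are assumptions made for this example.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal(ones, zeros, a=1.0, b=1.0):
    """Log marginal likelihood of a Bernoulli segment under a Beta(a, b) prior
    (the prior parameters are an assumption of this sketch)."""
    return log_beta(a + ones, b + zeros) - log_beta(a, b)

def change_point_posterior(seq):
    """Posterior over a single change-point position k (1 <= k < n), assuming a
    uniform prior on k. The segments are seq[:k] and seq[k:]."""
    n = len(seq)
    total_ones = sum(seq)
    log_post = []
    ones_left = 0
    for k in range(1, n):
        ones_left += seq[k - 1]
        zeros_left = k - ones_left
        ones_right = total_ones - ones_left
        zeros_right = (n - k) - ones_right
        log_post.append(
            log_marginal(ones_left, zeros_left)
            + log_marginal(ones_right, zeros_right)
        )
    # Normalise in a numerically stable way.
    m = max(log_post)
    weights = [exp(v - m) for v in log_post]
    z = sum(weights)
    return [w / z for w in weights]  # entry i corresponds to k = i + 1

# Toy data: a fully conserved stretch followed by a poorly conserved one.
seq = [1] * 20 + [0, 1, 0, 0, 1, 0, 0, 0, 1, 0] * 2
posterior = change_point_posterior(seq)
best_k = max(range(len(posterior)), key=posterior.__getitem__) + 1
print("most probable change-point after position", best_k)
```

Delineating many segments, as the abstract describes, requires a prior over the number and positions of change-points and, per the keywords, sampling by Markov chain Monte Carlo rather than the exhaustive enumeration used here.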

Original language: English
Title of host publication: Bioinformatics
Subtitle of host publication: Data, Sequence Analysis and Evolution
Publisher: Humana Press
Pages: 207-229
Number of pages: 23
ISBN (Print): 9781588297075
DOIs
Publication status: Published - 1 Jan 2008
Externally published: Yes

Publication series

Name: Methods in Molecular Biology
Volume: 452
ISSN (Print): 1064-3745

Keywords

  • Bayesian modeling
  • Change-points
  • Comparative genomics
  • Conservation
  • Markov chain Monte Carlo
  • Non-coding RNAs
  • Segmentation
  • Sliding window analysis
