Algorithms for sequence analysis via mutagenesis

Jonathan M. Keith, Peter Adams, Darryn Bryant, Duncan A E Cochran, Gita H. Lala, Keith R. Mitchelson

Research output: Contribution to journal › Article › Research › peer-review

7 Citations (Scopus)

Abstract

Motivation: Despite many successes of conventional DNA sequencing methods, some DNAs remain difficult or impossible to sequence. Unsequenceable regions occur in the genomes of many biologically important organisms, including the human genome. Such regions range in length from tens to millions of bases, and may contain valuable information such as the sequences of important genes. The authors have recently developed a technique that renders a wide range of problematic DNAs amenable to sequencing. The technique is known as sequence analysis via mutagenesis (SAM). This paper presents a number of algorithms for analysing and interpreting data generated by this technique.

Results: The essential idea of SAM is to infer the target sequence using the sequences of mutants derived from the target. We describe three algorithms used in this process. The first algorithm predicts the number of mutants that will be required to infer the target sequence with a desired level of accuracy. The second algorithm infers the target sequence itself, using the mutant sequences. The third algorithm assigns quality values to each inferred base. The algorithms are illustrated using mutant sequences generated in the laboratory.

Original language: English
Pages (from-to): 2401-2410
Number of pages: 10
Journal: Bioinformatics
Volume: 20
Issue number: 15
Publication status: Published - 12 Oct 2004
