A rescue strategy for multimapping short sequence tags refines surveys of transcriptional activity by CAGE

Geoffrey J Faulkner, Alistair R R Forrest, Alistair M Chalk, Kate Schroder, Yoshihide Hayashizaki, Piero Carninci, David A Hume, Sean M Grimmond

Research output: Contribution to journal › Article › Research › peer-review

75 Citations (Scopus)

Abstract

Cap analysis gene expression (CAGE) is a high-throughput, tag-based method designed to survey the 5′ end of capped full-length cDNAs. CAGE has previously been used to define global transcription start site usage and monitor gene activity in mammals. A drawback of the CAGE approach thus far has been the removal of as many as 40% of CAGE sequence tags due to their mapping to multiple genomic locations. Here, we address the origins of multimap tags and present a novel strategy to assign CAGE tags to their most likely source promoter region. When this approach was applied to the FANTOM3 CAGE libraries, the percentage of protein-coding mouse transcriptional frameworks detected by CAGE improved from 42.9% to 57.8% (an increase of 5516 frameworks) with no reduction in CAGE-to-microarray correlation. These results suggest that the multimap tags produced by high-throughput, short sequence tag-based approaches can be rescued to augment greatly the transcriptome coverage provided by single-map tags alone.
Original language: English
Pages (from-to): 281-288
Number of pages: 8
Journal: Genomics
Volume: 91
Issue number: 3
Publication status: Published - 2008
Externally published: Yes
