Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study

Louise Teixeira Cerdeira, Adriana Ribeiro Carneiro, Rommel Thiago Jucá Ramos, Sintia Silva de Almeida, Vivian D'Afonseca, Maria Paula Cruz Schneider, Jan Baumbach, Andreas Tauch, John Anthony McCulloch, Vasco Ariston Carvalho Azevedo, Artur Silva

Research output: Contribution to journal › Article › Research › peer-review

41 Citations (Scopus)

Abstract

Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources required for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can either be assembled by anchoring them directly onto an available reference genome (classical reference assembly) or be concatenated by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence. However, depending on the NGS platform used, the shortness of the reads causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address this problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium of immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads, and a reference genome was used only to orient them by anchoring. Remaining gaps were closed by iteratively anchoring short reads to the gap flanks (craning). Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that, with the availability of a reference genome, it pays off to use the hybrid de novo strategy rather than a classical reference assembly, because more genome sequences are preserved using the former.
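
The gap-closing step mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name `close_gap`, its parameters, and the example sequences are all hypothetical, and the sketch assumes error-free reads on a single strand with exact-match overlaps. It only shows the general idea of repeatedly anchoring reads to the growing edge of one gap flank until the opposite flank is reached.

```python
def close_gap(left_flank, right_flank, reads, min_overlap=4, max_rounds=100):
    """Toy gap closure (illustrative assumption, not the published method).

    Grows left_flank to the right with reads whose prefix exactly matches
    the current end of the sequence, until the start of right_flank
    appears in the newly added bases. Returns the joined sequence, or
    None if the gap cannot be closed.
    """
    seq = left_flank
    anchor = right_flank[:min_overlap]  # start of the opposite flank
    for _ in range(max_rounds):
        # Closed once the extension contains the start of the right flank.
        pos = seq.find(anchor, len(left_flank))
        if pos != -1:
            return seq[:pos] + right_flank
        best = ""
        for read in reads:
            # Longest exact overlap between the end of seq and the read start.
            for k in range(min(len(read) - 1, len(seq)), min_overlap - 1, -1):
                if seq.endswith(read[:k]):
                    extension = read[k:]
                    if len(extension) > len(best):
                        best = extension
                    break
        if not best:
            return None  # no read anchors to the current flank
        seq += best
    return None


# Made-up example: a 32-base "genome" with a 12-base gap in the middle.
genome = "ATGCGTACGTTAGCCTAGGCTAACGGTCAATG"
left, right = genome[:10], genome[22:]
reads = [genome[i:i + 10] for i in range(4, 23, 3)]
closed = close_gap(left, right, reads)
print(closed == genome)
```

A real pipeline would also have to handle sequencing errors, reverse-complement reads, and repeats, none of which this sketch attempts.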

Original language: English
Pages (from-to): 218-223
Number of pages: 6
Journal: Journal of Microbiological Methods
Volume: 86
Issue number: 2
Publication status: Published - 1 Aug 2011
Externally published: Yes

Keywords

  • Assembly
  • Corynebacterium
  • De novo
  • Next generation sequencing
  • Short read
  • SOLiD
