TY - JOUR
T1 - Rapid hybrid de novo assembly of a microbial genome using only short reads
T2 - Corynebacterium pseudotuberculosis I19 as a case study
AU - Cerdeira, Louise Teixeira
AU - Carneiro, Adriana Ribeiro
AU - Ramos, Rommel Thiago Jucá
AU - de Almeida, Sintia Silva
AU - D'Afonseca, Vivian
AU - Schneider, Maria Paula Cruz
AU - Baumbach, Jan
AU - Tauch, Andreas
AU - McCulloch, John Anthony
AU - Azevedo, Vasco Ariston Carvalho
AU - Silva, Artur
PY - 2011/8/1
Y1 - 2011/8/1
N2 - Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources required for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly) or by concatenating them by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence; however, depending on the NGS platform used, the shortness of read lengths causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium with immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads, and a reference genome was used only to orient them by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that, when a reference genome is available, it pays off to use the hybrid de novo strategy rather than a classical reference assembly, because more genome sequences are preserved using the former.
AB - Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources required for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly) or by concatenating them by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence; however, depending on the NGS platform used, the shortness of read lengths causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium with immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads, and a reference genome was used only to orient them by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that, when a reference genome is available, it pays off to use the hybrid de novo strategy rather than a classical reference assembly, because more genome sequences are preserved using the former.
KW - Assembly
KW - Corynebacterium
KW - De novo
KW - Next generation sequencing
KW - Short read
KW - SOLiD
UR - http://www.scopus.com/inward/record.url?scp=79959723260&partnerID=8YFLogxK
U2 - 10.1016/j.mimet.2011.05.008
DO - 10.1016/j.mimet.2011.05.008
M3 - Article
C2 - 21620904
AN - SCOPUS:79959723260
SN - 0167-7012
VL - 86
SP - 218
EP - 223
JO - Journal of Microbiological Methods
JF - Journal of Microbiological Methods
IS - 2
ER -