Evaluation of microRNA alignment techniques

Mark Ziemann, Antony Kaspi, Assam El-Osta

Research output: Contribution to journal, Article, Research, peer-review

Abstract

Genomic alignment of small RNA (smRNA) sequences such as microRNAs poses considerable challenges due to their short length (∼21 nucleotides [nt]) as well as the large size and complexity of plant and animal genomes. While several tools have been developed for high-throughput mapping of longer mRNA-seq reads (>30 nt), there are few that are specifically designed for mapping of smRNA reads including microRNAs. The accuracy of these mappers has not been systematically determined in the case of smRNA-seq. In addition, it is unknown whether these aligners accurately map smRNA reads containing sequence errors and polymorphisms. By using simulated read sets, we determine the alignment sensitivity and accuracy of 16 short-read mappers and quantify their robustness to mismatches, indels, and nontemplated nucleotide additions. These were explored in the context of a plant genome (Oryza sativa, ∼500 Mbp) and a mammalian genome (Homo sapiens, ∼3.1 Gbp). Analysis of simulated and real smRNA-seq data demonstrates that mapper selection impacts differential expression results and interpretation. These results will inform on best practice for smRNA mapping and enable more accurate smRNA detection and quantification of expression and RNA editing.
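The evaluation idea in the abstract (simulate reads with known origins, map them, then score the result) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' pipeline: the random reference sequence, the brute-force Hamming-distance mapper `toy_map`, and the working definitions of sensitivity (fraction of reads that map at all) and accuracy (fraction of mapped reads placed at their true origin). None of the 16 benchmarked mappers is used.

```python
import random


def make_genome(length, seed=0):
    """Build a random reference sequence (hypothetical stand-in for a real genome)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def simulate_reads(genome, n_reads, read_len=21, n_mismatches=0, seed=0):
    """Sample reads from known positions and substitute n_mismatches bases in each."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        pos = rng.randrange(len(genome) - read_len + 1)
        seq = list(genome[pos:pos + read_len])
        for i in rng.sample(range(read_len), n_mismatches):
            # Replace the base with a different one so the mismatch is real.
            seq[i] = rng.choice([b for b in "ACGT" if b != seq[i]])
        reads.append((pos, "".join(seq)))
    return reads


def toy_map(genome, read, max_mismatches):
    """Return the first position with the fewest mismatches, or None if all exceed the limit."""
    best_pos, best_dist = None, max_mismatches + 1
    for pos in range(len(genome) - len(read) + 1):
        window = genome[pos:pos + len(read)]
        dist = sum(a != b for a, b in zip(window, read))
        if dist < best_dist:
            best_pos, best_dist = pos, dist
    return best_pos


def evaluate(genome, reads, max_mismatches):
    """Score the toy mapper: (sensitivity, accuracy) under the assumed definitions."""
    mapped = correct = 0
    for true_pos, seq in reads:
        hit = toy_map(genome, seq, max_mismatches)
        if hit is not None:
            mapped += 1
            correct += hit == true_pos
    sensitivity = mapped / len(reads)
    accuracy = correct / mapped if mapped else 0.0
    return sensitivity, accuracy


if __name__ == "__main__":
    genome = make_genome(2000, seed=1)
    reads = simulate_reads(genome, 20, n_mismatches=1, seed=2)
    for allowed in (0, 1):
        print(allowed, evaluate(genome, reads, allowed))
```

Because the true origin of every simulated read is recorded, the same read set can be scored under different mismatch tolerances, which is the kind of robustness comparison the abstract describes.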

Original language: English
Pages (from-to): 1120-1138
Number of pages: 19
Journal: RNA: A Publication of the RNA Society
Volume: 22
Issue number: 8
DOI: 10.1261/rna.055509.115
Publication status: Published - 1 Aug 2016
Externally published: Yes

Keywords

  • Gene expression
  • MicroRNA
  • Next-generation sequencing
  • Short-read aligners
  • Small RNA sequencing

Cite this

Ziemann, Mark; Kaspi, Antony; El-Osta, Assam. / Evaluation of microRNA alignment techniques. In: RNA: A Publication of the RNA Society. 2016; Vol. 22, No. 8. pp. 1120-1138.
@article{f67fe3a5b8d949d4ab3354b4961af119,
title = "Evaluation of microRNA alignment techniques",
abstract = "Genomic alignment of small RNA (smRNA) sequences such as microRNAs poses considerable challenges due to their short length (∼21 nucleotides [nt]) as well as the large size and complexity of plant and animal genomes. While several tools have been developed for high-throughput mapping of longer mRNA-seq reads (>30 nt), there are few that are specifically designed for mapping of smRNA reads including microRNAs. The accuracy of these mappers has not been systematically determined in the case of smRNA-seq. In addition, it is unknown whether these aligners accurately map smRNA reads containing sequence errors and polymorphisms. By using simulated read sets, we determine the alignment sensitivity and accuracy of 16 short-read mappers and quantify their robustness to mismatches, indels, and nontemplated nucleotide additions. These were explored in the context of a plant genome (Oryza sativa, ∼500 Mbp) and a mammalian genome (Homo sapiens, ∼3.1 Gbp). Analysis of simulated and real smRNA-seq data demonstrates that mapper selection impacts differential expression results and interpretation. These results will inform on best practice for smRNA mapping and enable more accurate smRNA detection and quantification of expression and RNA editing.",
keywords = "Gene expression, MicroRNA, Next-generation sequencing, Short-read aligners, Small RNA sequencing",
author = "Mark Ziemann and Antony Kaspi and Assam El-Osta",
year = "2016",
month = "8",
day = "1",
doi = "10.1261/rna.055509.115",
language = "English",
volume = "22",
pages = "1120--1138",
journal = "RNA: A Publication of the RNA Society",
issn = "1355-8382",
publisher = "Cold Spring Harbor Laboratory Press",
number = "8",
}

TY - JOUR

T1 - Evaluation of microRNA alignment techniques

AU - Ziemann, Mark

AU - Kaspi, Antony

AU - El-Osta, Assam

PY - 2016/8/1

Y1 - 2016/8/1

AB - Genomic alignment of small RNA (smRNA) sequences such as microRNAs poses considerable challenges due to their short length (∼21 nucleotides [nt]) as well as the large size and complexity of plant and animal genomes. While several tools have been developed for high-throughput mapping of longer mRNA-seq reads (>30 nt), there are few that are specifically designed for mapping of smRNA reads including microRNAs. The accuracy of these mappers has not been systematically determined in the case of smRNA-seq. In addition, it is unknown whether these aligners accurately map smRNA reads containing sequence errors and polymorphisms. By using simulated read sets, we determine the alignment sensitivity and accuracy of 16 short-read mappers and quantify their robustness to mismatches, indels, and nontemplated nucleotide additions. These were explored in the context of a plant genome (Oryza sativa, ∼500 Mbp) and a mammalian genome (Homo sapiens, ∼3.1 Gbp). Analysis of simulated and real smRNA-seq data demonstrates that mapper selection impacts differential expression results and interpretation. These results will inform on best practice for smRNA mapping and enable more accurate smRNA detection and quantification of expression and RNA editing.

KW - Gene expression

KW - MicroRNA

KW - Next-generation sequencing

KW - Short-read aligners

KW - Small RNA sequencing

UR - http://www.scopus.com/inward/record.url?scp=84979648921&partnerID=8YFLogxK

U2 - 10.1261/rna.055509.115

DO - 10.1261/rna.055509.115

M3 - Article

VL - 22

SP - 1120

EP - 1138

JO - RNA: A Publication of the RNA Society

JF - RNA: A Publication of the RNA Society

SN - 1355-8382

IS - 8

ER -