plyranges: a grammar of genomic data transformation

Stuart Lee, Dianne Cook, Michael Lawrence

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Bioconductor is a widely used R-based platform for genomics, but its host of complex genomic data structures places a cognitive burden on the user. For most tasks, the GRanges object would suffice, but there are gaps in the API that prevent its general use. By recognizing that the GRanges class follows "tidy" data principles, we create a grammar of genomic data transformation, defining verbs for performing actions on and between genomic interval data and providing a way of performing common data analysis tasks through a coherent interface to existing Bioconductor infrastructure. We implement this grammar as a Bioconductor/R package called plyranges.
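To make the idea of "verbs" acting on and between interval data concrete, here is a minimal Python sketch. It is not the plyranges API (plyranges is an R package built on GRanges); the `Range` record and the functions `filter_ranges`, `shift_ranges`, and `join_overlap_inner` are hypothetical names chosen for illustration, and coordinates are assumed to be 1-based and closed.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Range:
    """One row of a tidy table of genomic intervals (assumed 1-based, closed)."""
    seqname: str
    start: int
    end: int
    name: str

    @property
    def width(self) -> int:
        # Closed interval, so both endpoints count.
        return self.end - self.start + 1


def filter_ranges(ranges: List[Range], keep: Callable[[Range], bool]) -> List[Range]:
    """Verb on one table: keep the rows that satisfy a predicate."""
    return [r for r in ranges if keep(r)]


def shift_ranges(ranges: List[Range], by: int) -> List[Range]:
    """Verb on one table: move every interval by a fixed offset."""
    return [Range(r.seqname, r.start + by, r.end + by, r.name) for r in ranges]


def join_overlap_inner(left: List[Range], right: List[Range]) -> List[Tuple[Range, Range]]:
    """Verb between two tables: pair rows whose intervals overlap on the same sequence."""
    return [
        (a, b)
        for a in left
        for b in right
        if a.seqname == b.seqname and a.start <= b.end and b.start <= a.end
    ]


# Illustrative data only; these are not real annotations.
genes = [
    Range("chr1", 100, 200, "geneA"),
    Range("chr1", 500, 650, "geneB"),
    Range("chr2", 100, 200, "geneC"),
]
peaks = [
    Range("chr1", 150, 160, "peak1"),
    Range("chr2", 300, 310, "peak2"),
]

long_genes = filter_ranges(genes, lambda r: r.width > 120)
shifted_peaks = shift_ranges(peaks, 10)
hits = join_overlap_inner(genes, peaks)
```

Each function takes a table of ranges and returns a new one, so the operations compose into pipelines; that composability is the property the abstract attributes to a grammar of verbs.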

Original language: English
Article number: 4
Number of pages: 10
Journal: Genome Biology
Volume: 20
DOI: 10.1186/s13059-018-1597-8
Publication status: Published - 4 Jan 2019

Keywords

  • Bioconductor
  • Data analysis
  • Genomes
  • Grammar

Cite this

Lee, Stuart; Cook, Dianne; Lawrence, Michael. plyranges: a grammar of genomic data transformation. In: Genome Biology. 2019; Vol. 20.
@article{1349ecf740484e17980439b1b95f6b6c,
title = "plyranges: a grammar of genomic data transformation",
abstract = "Bioconductor is a widely used R-based platform for genomics, but its host of complex genomic data structures places a cognitive burden on the user. For most tasks, the GRanges object would suffice, but there are gaps in the API that prevent its general use. By recognizing that the GRanges class follows {"}tidy{"} data principles, we create a grammar of genomic data transformation, defining verbs for performing actions on and between genomic interval data and providing a way of performing common data analysis tasks through a coherent interface to existing Bioconductor infrastructure. We implement this grammar as a Bioconductor/R package called plyranges.",
keywords = "Bioconductor, Data analysis, Genomes, Grammar",
author = "Stuart Lee and Dianne Cook and Michael Lawrence",
year = "2019",
month = "1",
day = "4",
doi = "10.1186/s13059-018-1597-8",
language = "English",
volume = "20",
journal = "Genome Biology",
issn = "1474-760X",
publisher = "BioMed Central",

}

TY - JOUR
T1 - plyranges
T2 - a grammar of genomic data transformation
AU - Lee, Stuart
AU - Cook, Dianne
AU - Lawrence, Michael
PY - 2019/1/4
Y1 - 2019/1/4
AB - Bioconductor is a widely used R-based platform for genomics, but its host of complex genomic data structures places a cognitive burden on the user. For most tasks, the GRanges object would suffice, but there are gaps in the API that prevent its general use. By recognizing that the GRanges class follows "tidy" data principles, we create a grammar of genomic data transformation, defining verbs for performing actions on and between genomic interval data and providing a way of performing common data analysis tasks through a coherent interface to existing Bioconductor infrastructure. We implement this grammar as a Bioconductor/R package called plyranges.
KW - Bioconductor
KW - Data analysis
KW - Genomes
KW - Grammar
UR - http://www.scopus.com/inward/record.url?scp=85059494692&partnerID=8YFLogxK
U2 - 10.1186/s13059-018-1597-8
DO - 10.1186/s13059-018-1597-8
M3 - Article
VL - 20
JO - Genome Biology
JF - Genome Biology
SN - 1474-760X
M1 - 4
ER -