plyranges: a grammar of genomic data transformation

Stuart Lee, Dianne Cook, Michael Lawrence

Research output: Contribution to journal, Article, Research, peer-review

54 Citations (Scopus)

Abstract

Bioconductor is a widely used R-based platform for genomics, but its host of complex genomic data structures places a cognitive burden on the user. For most tasks, the GRanges object would suffice, but there are gaps in the API that prevent its general use. By recognizing that the GRanges class follows "tidy" data principles, we create a grammar of genomic data transformation, defining verbs for performing actions on and between genomic interval data and providing a way of performing common data analysis tasks through a coherent interface to existing Bioconductor infrastructure. We implement this grammar as a Bioconductor/R package called plyranges.

Original language: English
Article number: 4
Number of pages: 10
Journal: Genome Biology
Volume: 20
Publication status: Published - 4 Jan 2019

Keywords

  • Bioconductor
  • Data analysis
  • Genomes
  • Grammar
