Best practice data life cycle approaches for the life sciences

Philippa C. Griffin, Jyoti Khadake, Kate S. LeMay, Suzanna E. Lewis, Sandra Orchard, Andrew Pask, Bernard Pope, Ute Roessner, Keith Russell, Torsten Seemann, Andrew Treloar, Sonika Tyagi, Jeffrey H. Christiansen, Saravanan Dayalan, Simon Gladman, Sandra B. Hangartner, Helen L. Hayden, William W.H. Ho, Gabriel Keeble-Gagnère, Pasi K. Korhonen, Peter Neish, Priscilla R. Prestes, Mark F. Richardson, Nathan S. Watson-Haigh, Kelly L. Wyres, Neil D. Young, Maria Victoria Schneider

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Throughout history, the life sciences have been revolutionised by technological advances; in our era this is manifested by advances in instrumentation for data generation, and consequently researchers now routinely handle large amounts of heterogeneous data in digital formats. The simultaneous transitions towards biology as a data science and towards a 'life cycle' view of research data pose new challenges. Researchers face a bewildering landscape of data management requirements, recommendations and regulations, without necessarily being able to access data management training or possessing a clear understanding of practical approaches that can assist in data management in their particular research domain. Here we provide an overview of best practice data life cycle approaches for researchers in the life sciences/bioinformatics space with a particular focus on 'omics' datasets and computer-based data processing and analysis. We discuss the different stages of the data life cycle and provide practical suggestions for useful tools and resources to improve data management practices.

Original language: English
Article number: 1618
Number of pages: 22
Journal: F1000Research
Volume: 6
DOIs: https://doi.org/10.12688/f1000research.12344.2
Publication status: Published - 4 Jun 2018

Keywords

  • Bioinformatics
  • Data management
  • Data sharing
  • Open science
  • Reproducibility

Cite this

Griffin, P. C., Khadake, J., LeMay, K. S., Lewis, S. E., Orchard, S., Pask, A., ... Schneider, M. V. (2018). Best practice data life cycle approaches for the life sciences. F1000Research, 6, [1618]. https://doi.org/10.12688/f1000research.12344.2
@article{d875188346504e308886bff614e63224,
title = "Best practice data life cycle approaches for the life sciences",
abstract = "Throughout history, the life sciences have been revolutionised by technological advances; in our era this is manifested by advances in instrumentation for data generation, and consequently researchers now routinely handle large amounts of heterogeneous data in digital formats. The simultaneous transitions towards biology as a data science and towards a 'life cycle' view of research data pose new challenges. Researchers face a bewildering landscape of data management requirements, recommendations and regulations, without necessarily being able to access data management training or possessing a clear understanding of practical approaches that can assist in data management in their particular research domain. Here we provide an overview of best practice data life cycle approaches for researchers in the life sciences/bioinformatics space with a particular focus on 'omics' datasets and computer-based data processing and analysis. We discuss the different stages of the data life cycle and provide practical suggestions for useful tools and resources to improve data management practices.",
keywords = "Bioinformatics, Data management, Data sharing, Open science, Reproducibility",
author = "Griffin, {Philippa C.} and Jyoti Khadake and LeMay, {Kate S.} and Lewis, {Suzanna E.} and Sandra Orchard and Andrew Pask and Bernard Pope and Ute Roessner and Keith Russell and Torsten Seemann and Andrew Treloar and Sonika Tyagi and Christiansen, {Jeffrey H.} and Saravanan Dayalan and Simon Gladman and Hangartner, {Sandra B.} and Hayden, {Helen L.} and Ho, {William W.H.} and Gabriel Keeble-Gagn{\`e}re and Korhonen, {Pasi K.} and Peter Neish and Prestes, {Priscilla R.} and Richardson, {Mark F.} and Watson-Haigh, {Nathan S.} and Wyres, {Kelly L.} and Young, {Neil D.} and Schneider, {Maria Victoria}",
year = "2018",
month = "6",
day = "4",
doi = "10.12688/f1000research.12344.2",
language = "English",
volume = "6",
journal = "F1000Research",
issn = "2046-1402",
publisher = "Faculty of 1000 Ltd",

}

TY - JOUR

T1 - Best practice data life cycle approaches for the life sciences

AU - Griffin, Philippa C.

AU - Khadake, Jyoti

AU - LeMay, Kate S.

AU - Lewis, Suzanna E.

AU - Orchard, Sandra

AU - Pask, Andrew

AU - Pope, Bernard

AU - Roessner, Ute

AU - Russell, Keith

AU - Seemann, Torsten

AU - Treloar, Andrew

AU - Tyagi, Sonika

AU - Christiansen, Jeffrey H.

AU - Dayalan, Saravanan

AU - Gladman, Simon

AU - Hangartner, Sandra B.

AU - Hayden, Helen L.

AU - Ho, William W.H.

AU - Keeble-Gagnère, Gabriel

AU - Korhonen, Pasi K.

AU - Neish, Peter

AU - Prestes, Priscilla R.

AU - Richardson, Mark F.

AU - Watson-Haigh, Nathan S.

AU - Wyres, Kelly L.

AU - Young, Neil D.

AU - Schneider, Maria Victoria

PY - 2018/6/4

Y1 - 2018/6/4

AB - Throughout history, the life sciences have been revolutionised by technological advances; in our era this is manifested by advances in instrumentation for data generation, and consequently researchers now routinely handle large amounts of heterogeneous data in digital formats. The simultaneous transitions towards biology as a data science and towards a 'life cycle' view of research data pose new challenges. Researchers face a bewildering landscape of data management requirements, recommendations and regulations, without necessarily being able to access data management training or possessing a clear understanding of practical approaches that can assist in data management in their particular research domain. Here we provide an overview of best practice data life cycle approaches for researchers in the life sciences/bioinformatics space with a particular focus on 'omics' datasets and computer-based data processing and analysis. We discuss the different stages of the data life cycle and provide practical suggestions for useful tools and resources to improve data management practices.

KW - Bioinformatics

KW - Data management

KW - Data sharing

KW - Open science

KW - Reproducibility

UR - http://www.scopus.com/inward/record.url?scp=85054930966&partnerID=8YFLogxK

U2 - 10.12688/f1000research.12344.2

DO - 10.12688/f1000research.12344.2

M3 - Article

VL - 6

JO - F1000Research

JF - F1000Research

SN - 2046-1402

M1 - 1618

ER -
