TY - JOUR
T1 - Developing a linked electronic health record derived data platform to support research into healthy ageing
AU - Andrew, Nadine E.
AU - Beare, Richard
AU - Ravipati, Tanya
AU - Parker, Emily
AU - Snowdon, David
AU - Naude, Kim
AU - Srikanth, Velandai
N1 - Funding Information:
In 2019, the National Centre for Healthy Ageing (NCHA) was funded by the Australian Federal Government to develop an EHR-derived data platform. The overarching purpose was to make better use of existing health data to support transformative research and translation in health service and aged care innovation, and to gain insights into the epidemiology of ageing. The aim of this paper is to provide an overview of the activities undertaken to establish the foundations of the NCHA Healthy Ageing Data Platform.
Funding Information:
We wish to acknowledge the members of the National Centre for Healthy Ageing Data Platform Working Group and Data Access Working Group. The National Centre for Healthy Ageing acknowledges the financial support of the Australian Government Department of Health to support the Centre establishment.
Publisher Copyright:
© The Authors.
PY - 2023
Y1 - 2023
N2 - Introduction: Digitalisation of Electronic Health Record (EHR) data has created unique opportunities for research. However, these data are routinely collected for operational purposes and so are not curated to the standard required for research. Harnessing such routine data at large scale allows efficient and long-term epidemiological and health services research. Objectives: To describe the establishment of a linked EHR-derived data platform in the National Centre for Healthy Ageing, Melbourne, Australia, aimed at enabling research targeting national health priority areas in ageing. Methods: Our approach incorporated data validation, curation and warehousing to ensure quality and completeness; end-user engagement and consensus on the platform content; implementation of an artificial intelligence (AI) pipeline for extraction of text-based data items; early consumer involvement; and implementation of routine collection of patient-reported outcome measures, in a multisite public health service. Results: Data for a cohort of >800,000 patients collected over a 10-year period have been curated within the platform’s research data warehouse. So far, 117 items have been identified as suitable for inclusion, from 11 research-relevant datasets held within the health service EHR systems. Data access, extraction and release processes, guided by the Five Safes Framework, are being tested through project use-cases. A natural language processing (NLP) pipeline has been implemented, and a framework for the routine collection and incorporation of patient-reported outcome measures has been developed. Conclusions: We highlight the importance of establishing comprehensive processes for the foundations of a data platform utilising routine data not collected for research purposes. These robust foundations will facilitate future expansion through linkages to other datasets for the efficient and cost-effective study of health related to ageing at a large scale.
AB - Introduction: Digitalisation of Electronic Health Record (EHR) data has created unique opportunities for research. However, these data are routinely collected for operational purposes and so are not curated to the standard required for research. Harnessing such routine data at large scale allows efficient and long-term epidemiological and health services research. Objectives: To describe the establishment of a linked EHR-derived data platform in the National Centre for Healthy Ageing, Melbourne, Australia, aimed at enabling research targeting national health priority areas in ageing. Methods: Our approach incorporated data validation, curation and warehousing to ensure quality and completeness; end-user engagement and consensus on the platform content; implementation of an artificial intelligence (AI) pipeline for extraction of text-based data items; early consumer involvement; and implementation of routine collection of patient-reported outcome measures, in a multisite public health service. Results: Data for a cohort of >800,000 patients collected over a 10-year period have been curated within the platform’s research data warehouse. So far, 117 items have been identified as suitable for inclusion, from 11 research-relevant datasets held within the health service EHR systems. Data access, extraction and release processes, guided by the Five Safes Framework, are being tested through project use-cases. A natural language processing (NLP) pipeline has been implemented, and a framework for the routine collection and incorporation of patient-reported outcome measures has been developed. Conclusions: We highlight the importance of establishing comprehensive processes for the foundations of a data platform utilising routine data not collected for research purposes. These robust foundations will facilitate future expansion through linkages to other datasets for the efficient and cost-effective study of health related to ageing at a large scale.
KW - ageing
KW - big data
KW - data linkage
KW - electronic health record
KW - longitudinal cohort
UR - http://www.scopus.com/inward/record.url?scp=85163624076&partnerID=8YFLogxK
U2 - 10.23889/ijpds.v8i1.2129
DO - 10.23889/ijpds.v8i1.2129
M3 - Article
AN - SCOPUS:85163624076
SN - 2399-4908
VL - 8
JO - International Journal of Population Data Science
JF - International Journal of Population Data Science
IS - 1
M1 - 13
ER -