Characterising risk of in-hospital mortality following cardiac arrest using machine learning: A retrospective international registry study

Shane Nanayakkara, Sam Fogarty, Michael Tremeer, Kelvin Ross, Brent Richards, Christoph Bergmeir, Sheng Xu, Dion Stub, Karen Smith, Mark Tacey, Danny Liew, David Pilcher, David M. Kaye

Research output: Contribution to journalArticleResearchpeer-review

Abstract

Background: Resuscitated cardiac arrest is associated with high mortality; however, the ability to estimate risk of adverse outcomes using existing illness severity scores is limited. Using in-hospital data available within the first 24 hours of admission, we aimed to develop more accurate models of risk prediction using both logistic regression (LR) and machine learning (ML) techniques, with a combination of demographic, physiologic, and biochemical information. Methods and findings: Patient-level data were extracted from the Australian and New Zealand Intensive Care Society (ANZICS) Adult Patient Database for patients who had experienced a cardiac arrest within 24 hours prior to admission to an intensive care unit (ICU) during the period January 2006 to December 2016. The primary outcome was in-hospital mortality. The models were trained and tested on a dataset (split 90:10) including age, lowest and highest physiologic variables during the first 24 hours, and key past medical history. LR and 5 ML approaches (gradient boosting machine [GBM], support vector classifier [SVC], random forest [RF], artificial neural network [ANN], and an ensemble) were compared to the APACHE III and Australian and New Zealand Risk of Death (ANZROD) predictions. In all, 39,566 patients from 186 ICUs were analysed. Mean (±SD) age was 61 ± 17 years; 65% were male. Overall in-hospital mortality was 45.5%. Models were evaluated in the test set. The APACHE III and ANZROD scores demonstrated good discrimination (area under the receiver operating characteristic curve [AUROC] = 0.80 [95% CI 0.79–0.82] and 0.81 [95% CI 0.8–0.82], respectively) and modest calibration (Brier score 0.19 for both), which was slightly improved by LR (AUROC = 0.82 [95% CI 0.81–0.83], DeLong test, p < 0.001). Discrimination was significantly improved using ML models (ensemble and GBM AUROCs = 0.87 [95% CI 0.86–0.88], DeLong test, p < 0.001), with an improvement in performance (Brier score reduction of 22%). Explainability models were created to assist in identifying the physiologic features that most contributed to an individual patient’s survival. Key limitations include the absence of pre-hospital data and absence of external validation. Conclusions: ML approaches significantly enhance predictive discrimination for mortality following cardiac arrest compared to existing illness severity scores and LR, without the use of pre-hospital data. The discriminative ability of these ML models requires validation in external cohorts to establish generalisability.

Original languageEnglish
Article numbere1002709
Number of pages16
JournalPLoS Medicine
Volume15
Issue number11
DOIs
Publication statusPublished - 1 Nov 2018

Cite this

@article{9ea7040c579f47099c87768872195303,
title = "Characterising risk of in-hospital mortality following cardiac arrest using machine learning: A retrospective international registry study",
abstract = "Background: Resuscitated cardiac arrest is associated with high mortality; however, the ability to estimate risk of adverse outcomes using existing illness severity scores is limited. Using in-hospital data available within the first 24 hours of admission, we aimed to develop more accurate models of risk prediction using both logistic regression (LR) and machine learning (ML) techniques, with a combination of demographic, physiologic, and biochemical information. Methods and findings: Patient-level data were extracted from the Australian and New Zealand Intensive Care Society (ANZICS) Adult Patient Database for patients who had experienced a cardiac arrest within 24 hours prior to admission to an intensive care unit (ICU) during the period January 2006 to December 2016. The primary outcome was in-hospital mortality. The models were trained and tested on a dataset (split 90:10) including age, lowest and highest physiologic variables during the first 24 hours, and key past medical history. LR and 5 ML approaches (gradient boosting machine [GBM], support vector classifier [SVC], random forest [RF], artificial neural network [ANN], and an ensemble) were compared to the APACHE III and Australian and New Zealand Risk of Death (ANZROD) predictions. In all, 39,566 patients from 186 ICUs were analysed. Mean (±SD) age was 61 ± 17 years; 65{\%} were male. Overall in-hospital mortality was 45.5{\%}. Models were evaluated in the test set. The APACHE III and ANZROD scores demonstrated good discrimination (area under the receiver operating characteristic curve [AUROC] = 0.80 [95{\%} CI 0.79–0.82] and 0.81 [95{\%} CI 0.8–0.82], respectively) and modest calibration (Brier score 0.19 for both), which was slightly improved by LR (AUROC = 0.82 [95{\%} CI 0.81–0.83], DeLong test, p < 0.001). Discrimination was significantly improved using ML models (ensemble and GBM AUROCs = 0.87 [95{\%} CI 0.86–0.88], DeLong test, p < 0.001), with an improvement in performance (Brier score reduction of 22{\%}). Explainability models were created to assist in identifying the physiologic features that most contributed to an individual patient’s survival. Key limitations include the absence of pre-hospital data and absence of external validation. Conclusions: ML approaches significantly enhance predictive discrimination for mortality following cardiac arrest compared to existing illness severity scores and LR, without the use of pre-hospital data. The discriminative ability of these ML models requires validation in external cohorts to establish generalisability.",
author = "Shane Nanayakkara and Sam Fogarty and Michael Tremeer and Kelvin Ross and Brent Richards and Christoph Bergmeir and Sheng Xu and Dion Stub and Karen Smith and Mark Tacey and Danny Liew and David Pilcher and Kaye, {David M.}",
year = "2018",
month = "11",
day = "1",
doi = "10.1371/journal.pmed.1002709",
language = "English",
volume = "15",
journal = "PLoS Medicine",
issn = "1549-1676",
publisher = "Public Library of Science",
number = "11",

}

Characterising risk of in-hospital mortality following cardiac arrest using machine learning : A retrospective international registry study. / Nanayakkara, Shane; Fogarty, Sam; Tremeer, Michael; Ross, Kelvin; Richards, Brent; Bergmeir, Christoph; Xu, Sheng; Stub, Dion; Smith, Karen; Tacey, Mark; Liew, Danny; Pilcher, David; Kaye, David M.

In: PLoS Medicine, Vol. 15, No. 11, e1002709, 01.11.2018.

Research output: Contribution to journalArticleResearchpeer-review

TY - JOUR

T1 - Characterising risk of in-hospital mortality following cardiac arrest using machine learning

T2 - A retrospective international registry study

AU - Nanayakkara, Shane

AU - Fogarty, Sam

AU - Tremeer, Michael

AU - Ross, Kelvin

AU - Richards, Brent

AU - Bergmeir, Christoph

AU - Xu, Sheng

AU - Stub, Dion

AU - Smith, Karen

AU - Tacey, Mark

AU - Liew, Danny

AU - Pilcher, David

AU - Kaye, David M.

PY - 2018/11/1

Y1 - 2018/11/1

N2 - Background: Resuscitated cardiac arrest is associated with high mortality; however, the ability to estimate risk of adverse outcomes using existing illness severity scores is limited. Using in-hospital data available within the first 24 hours of admission, we aimed to develop more accurate models of risk prediction using both logistic regression (LR) and machine learning (ML) techniques, with a combination of demographic, physiologic, and biochemical information. Methods and findings: Patient-level data were extracted from the Australian and New Zealand Intensive Care Society (ANZICS) Adult Patient Database for patients who had experienced a cardiac arrest within 24 hours prior to admission to an intensive care unit (ICU) during the period January 2006 to December 2016. The primary outcome was in-hospital mortality. The models were trained and tested on a dataset (split 90:10) including age, lowest and highest physiologic variables during the first 24 hours, and key past medical history. LR and 5 ML approaches (gradient boosting machine [GBM], support vector classifier [SVC], random forest [RF], artificial neural network [ANN], and an ensemble) were compared to the APACHE III and Australian and New Zealand Risk of Death (ANZROD) predictions. In all, 39,566 patients from 186 ICUs were analysed. Mean (±SD) age was 61 ± 17 years; 65% were male. Overall in-hospital mortality was 45.5%. Models were evaluated in the test set. The APACHE III and ANZROD scores demonstrated good discrimination (area under the receiver operating characteristic curve [AUROC] = 0.80 [95% CI 0.79–0.82] and 0.81 [95% CI 0.8–0.82], respectively) and modest calibration (Brier score 0.19 for both), which was slightly improved by LR (AUROC = 0.82 [95% CI 0.81–0.83], DeLong test, p < 0.001). Discrimination was significantly improved using ML models (ensemble and GBM AUROCs = 0.87 [95% CI 0.86–0.88], DeLong test, p < 0.001), with an improvement in performance (Brier score reduction of 22%). Explainability models were created to assist in identifying the physiologic features that most contributed to an individual patient’s survival. Key limitations include the absence of pre-hospital data and absence of external validation. Conclusions: ML approaches significantly enhance predictive discrimination for mortality following cardiac arrest compared to existing illness severity scores and LR, without the use of pre-hospital data. The discriminative ability of these ML models requires validation in external cohorts to establish generalisability.

AB - Background: Resuscitated cardiac arrest is associated with high mortality; however, the ability to estimate risk of adverse outcomes using existing illness severity scores is limited. Using in-hospital data available within the first 24 hours of admission, we aimed to develop more accurate models of risk prediction using both logistic regression (LR) and machine learning (ML) techniques, with a combination of demographic, physiologic, and biochemical information. Methods and findings: Patient-level data were extracted from the Australian and New Zealand Intensive Care Society (ANZICS) Adult Patient Database for patients who had experienced a cardiac arrest within 24 hours prior to admission to an intensive care unit (ICU) during the period January 2006 to December 2016. The primary outcome was in-hospital mortality. The models were trained and tested on a dataset (split 90:10) including age, lowest and highest physiologic variables during the first 24 hours, and key past medical history. LR and 5 ML approaches (gradient boosting machine [GBM], support vector classifier [SVC], random forest [RF], artificial neural network [ANN], and an ensemble) were compared to the APACHE III and Australian and New Zealand Risk of Death (ANZROD) predictions. In all, 39,566 patients from 186 ICUs were analysed. Mean (±SD) age was 61 ± 17 years; 65% were male. Overall in-hospital mortality was 45.5%. Models were evaluated in the test set. The APACHE III and ANZROD scores demonstrated good discrimination (area under the receiver operating characteristic curve [AUROC] = 0.80 [95% CI 0.79–0.82] and 0.81 [95% CI 0.8–0.82], respectively) and modest calibration (Brier score 0.19 for both), which was slightly improved by LR (AUROC = 0.82 [95% CI 0.81–0.83], DeLong test, p < 0.001). Discrimination was significantly improved using ML models (ensemble and GBM AUROCs = 0.87 [95% CI 0.86–0.88], DeLong test, p < 0.001), with an improvement in performance (Brier score reduction of 22%). Explainability models were created to assist in identifying the physiologic features that most contributed to an individual patient’s survival. Key limitations include the absence of pre-hospital data and absence of external validation. Conclusions: ML approaches significantly enhance predictive discrimination for mortality following cardiac arrest compared to existing illness severity scores and LR, without the use of pre-hospital data. The discriminative ability of these ML models requires validation in external cohorts to establish generalisability.

UR - http://www.scopus.com/inward/record.url?scp=85058030466&partnerID=8YFLogxK

U2 - 10.1371/journal.pmed.1002709

DO - 10.1371/journal.pmed.1002709

M3 - Article

VL - 15

JO - PLoS Medicine

JF - PLoS Medicine

SN - 1549-1676

IS - 11

M1 - e1002709

ER -