Toward Electronic Surveillance of Invasive Mold Diseases in Hematology-Oncology Patients: An Expert System Combining Natural Language Processing of Chest Computed Tomography Reports, Microbiology, and Antifungal Drug Data

Research output: Contribution to journalArticleResearch

Abstract

Purpose
Prospective epidemiologic surveillance of invasive mold disease (IMD) in hematology patients is hampered by the absence of a reliable laboratory prompt. This study develops an expert system for electronic surveillance of IMD that combines probabilities using natural language processing (NLP) of computed tomography (CT) reports with microbiology and antifungal drug data to improve prediction of IMD.

Methods
Microbiology indicators and antifungal drug–dispensing data were extracted from hospital information systems at three tertiary hospitals for 123 hematology-oncology patients. Of this group, 64 case patients had 26 probable/proven IMD according to international definitions, and 59 patients were uninfected controls. Derived probabilities from NLP combined with medical expertise identified patients at high likelihood of IMD, with remaining patients processed by a machine-learning classifier trained on all available features.

Results
Compared with the baseline text classifier, the expert system that incorporated the best performing algorithm (naïve Bayes) improved specificity from 50.8% (95% CI, 37.5% to 64.1%) to 74.6% (95% CI, 61.6% to 85.0%), reducing false positives by 48% from 29 to 15; improved sensitivity slightly from 96.9% (95% CI, 89.2% to 99.6%) to 98.4% (95% CI, 91.6% to 100%); and improved receiver operating characteristic area from 73.9% (95% CI, 67.1% to 80.6%) to 92.8% (95% CI, 88% to 97.5%).

Conclusion
An expert system that uses multiple sources of data (CT reports, microbiology, antifungal drug dispensing) is a promising approach to continuous prospective surveillance of IMD in the hospital, and demonstrates reduced false notifications (positives) compared with NLP of CT reports alone. Our expert system could provide decision support for IMD surveillance, which is critical to antifungal stewardship and improving supportive care in cancer.
Original languageEnglish
Pages (from-to)1-10
Number of pages10
JournalJCO Clinical Cancer Informatics
DOIs
Publication statusAccepted/In press - 30 Aug 2017

Cite this

@article{338dc17540334d33b74b5c94fb4928fb,
title = "Toward Electronic Surveillance of Invasive Mold Diseases in Hematology-Oncology Patients: An Expert System Combining Natural Language Processing of Chest Computed Tomography Reports, Microbiology, and Antifungal Drug Data",
abstract = "PurposeProspective epidemiologic surveillance of invasive mold disease (IMD) in hematology patients is hampered by the absence of a reliable laboratory prompt. This study develops an expert system for electronic surveillance of IMD that combines probabilities using natural language processing (NLP) of computed tomography (CT) reports with microbiology and antifungal drug data to improve prediction of IMD.MethodsMicrobiology indicators and antifungal drug–dispensing data were extracted from hospital information systems at three tertiary hospitals for 123 hematology-oncology patients. Of this group, 64 case patients had 26 probable/proven IMD according to international definitions, and 59 patients were uninfected controls. Derived probabilities from NLP combined with medical expertise identified patients at high likelihood of IMD, with remaining patients processed by a machine-learning classifier trained on all available features.ResultsCompared with the baseline text classifier, the expert system that incorporated the best performing algorithm (na{\"i}ve Bayes) improved specificity from 50.8{\%} (95{\%} CI, 37.5{\%} to 64.1{\%}) to 74.6{\%} (95{\%} CI, 61.6{\%} to 85.0{\%}), reducing false positives by 48{\%} from 29 to 15; improved sensitivity slightly from 96.9{\%} (95{\%} CI, 89.2{\%} to 99.6{\%}) to 98.4{\%} (95{\%} CI, 91.6{\%} to 100{\%}); and improved receiver operating characteristic area from 73.9{\%} (95{\%} CI, 67.1{\%} to 80.6{\%}) to 92.8{\%} (95{\%} CI, 88{\%} to 97.5{\%}).ConclusionAn expert system that uses multiple sources of data (CT reports, microbiology, antifungal drug dispensing) is a promising approach to continuous prospective surveillance of IMD in the hospital, and demonstrates reduced false notifications (positives) compared with NLP of CT reports alone. Our expert system could provide decision support for IMD surveillance, which is critical to antifungal stewardship and improving supportive care in cancer.",
author = "Michelle Ananda-Rajah and Bergmeir, {Christoph Norbert} and Francois Petitjean and Slavin, {Monica A.} and Thursky, {Karin A.} and Webb, {Geoffrey I.}",
year = "2017",
month = "8",
day = "30",
doi = "10.1200/CCI.17.00011",
language = "English",
pages = "1--10",
journal = "JCO Clinical Cancer Informatics",
issn = "2473-4276",
publisher = "American Society of Clinical Oncology",

}

TY - JOUR

T1 - Toward Electronic Surveillance of Invasive Mold Diseases in Hematology-Oncology Patients

T2 - An Expert System Combining Natural Language Processing of Chest Computed Tomography Reports, Microbiology, and Antifungal Drug Data

AU - Ananda-Rajah, Michelle

AU - Bergmeir, Christoph Norbert

AU - Petitjean, Francois

AU - Slavin, Monica A.

AU - Thursky, Karin A.

AU - Webb, Geoffrey I.

PY - 2017/8/30

Y1 - 2017/8/30

N2 - PurposeProspective epidemiologic surveillance of invasive mold disease (IMD) in hematology patients is hampered by the absence of a reliable laboratory prompt. This study develops an expert system for electronic surveillance of IMD that combines probabilities using natural language processing (NLP) of computed tomography (CT) reports with microbiology and antifungal drug data to improve prediction of IMD.MethodsMicrobiology indicators and antifungal drug–dispensing data were extracted from hospital information systems at three tertiary hospitals for 123 hematology-oncology patients. Of this group, 64 case patients had 26 probable/proven IMD according to international definitions, and 59 patients were uninfected controls. Derived probabilities from NLP combined with medical expertise identified patients at high likelihood of IMD, with remaining patients processed by a machine-learning classifier trained on all available features.ResultsCompared with the baseline text classifier, the expert system that incorporated the best performing algorithm (naïve Bayes) improved specificity from 50.8% (95% CI, 37.5% to 64.1%) to 74.6% (95% CI, 61.6% to 85.0%), reducing false positives by 48% from 29 to 15; improved sensitivity slightly from 96.9% (95% CI, 89.2% to 99.6%) to 98.4% (95% CI, 91.6% to 100%); and improved receiver operating characteristic area from 73.9% (95% CI, 67.1% to 80.6%) to 92.8% (95% CI, 88% to 97.5%).ConclusionAn expert system that uses multiple sources of data (CT reports, microbiology, antifungal drug dispensing) is a promising approach to continuous prospective surveillance of IMD in the hospital, and demonstrates reduced false notifications (positives) compared with NLP of CT reports alone. Our expert system could provide decision support for IMD surveillance, which is critical to antifungal stewardship and improving supportive care in cancer.

AB - PurposeProspective epidemiologic surveillance of invasive mold disease (IMD) in hematology patients is hampered by the absence of a reliable laboratory prompt. This study develops an expert system for electronic surveillance of IMD that combines probabilities using natural language processing (NLP) of computed tomography (CT) reports with microbiology and antifungal drug data to improve prediction of IMD.MethodsMicrobiology indicators and antifungal drug–dispensing data were extracted from hospital information systems at three tertiary hospitals for 123 hematology-oncology patients. Of this group, 64 case patients had 26 probable/proven IMD according to international definitions, and 59 patients were uninfected controls. Derived probabilities from NLP combined with medical expertise identified patients at high likelihood of IMD, with remaining patients processed by a machine-learning classifier trained on all available features.ResultsCompared with the baseline text classifier, the expert system that incorporated the best performing algorithm (naïve Bayes) improved specificity from 50.8% (95% CI, 37.5% to 64.1%) to 74.6% (95% CI, 61.6% to 85.0%), reducing false positives by 48% from 29 to 15; improved sensitivity slightly from 96.9% (95% CI, 89.2% to 99.6%) to 98.4% (95% CI, 91.6% to 100%); and improved receiver operating characteristic area from 73.9% (95% CI, 67.1% to 80.6%) to 92.8% (95% CI, 88% to 97.5%).ConclusionAn expert system that uses multiple sources of data (CT reports, microbiology, antifungal drug dispensing) is a promising approach to continuous prospective surveillance of IMD in the hospital, and demonstrates reduced false notifications (positives) compared with NLP of CT reports alone. Our expert system could provide decision support for IMD surveillance, which is critical to antifungal stewardship and improving supportive care in cancer.

U2 - 10.1200/CCI.17.00011

DO - 10.1200/CCI.17.00011

M3 - Article

SP - 1

EP - 10

JO - JCO Clinical Cancer Informatics

JF - JCO Clinical Cancer Informatics

SN - 2473-4276

ER -