TY - JOUR
T1 - Learning meaningful latent space representations for patient risk stratification: Model development and validation for dengue and other acute febrile illness
AU - Hernandez, Bernard
AU - Stiff, Oliver
AU - Ming, Damien K.
AU - Ho Quang, Chanh
AU - Nguyen Lam, Vuong
AU - Nguyen Minh, Tuan
AU - Nguyen Van Vinh, Chau
AU - Nguyen Minh, Nguyet
AU - Nguyen Quang, Huy
AU - Phung Khanh, Lam
AU - Dong Thi Hoai, Tam
AU - Dinh The, Trung
AU - Huynh Trung, Trieu
AU - Wills, Bridget
AU - Simmons, Cameron P.
AU - Holmes, Alison H.
AU - Yacoub, Sophie
AU - Georgiou, Pantelis
AU - on behalf of the Vietnam ICU Translational Applications Laboratory (VITAL) Investigators
N1 - Funding Information:
This work was supported by a Wellcome Trust grant (215010/Z/18/Z); DH and BH receive their salaries from and are supported by the grant. The funding source had no role in the design, data collection, analysis or writing of the manuscript.
Publisher Copyright:
© 2023 Hernandez, Stiff, Ming, Ho Quang, Nguyen Lam, Nguyen Minh, Nguyen Van Vinh, Nguyen Minh, Nguyen Quang, Phung Khanh, Dong Thi Hoai, Dinh The, Huynh Trung, Wills, Simmons, Holmes, Yacoub and Georgiou.
PY - 2023/2
Y1 - 2023/2
N2 - Background: Increased data availability has prompted the creation of clinical decision support systems. These systems utilise clinical information to enhance health care provision, either to predict the likelihood of specific clinical outcomes or to evaluate the risk of further complications. However, their adoption remains low due to concerns regarding the quality of recommendations and a lack of clarity on how results are best obtained and presented. Methods: We used autoencoders, which can reduce the dimensionality of complex datasets, to produce a 2D representation, denoted the latent space, that supports understanding of complex clinical data. In this output, meaningful representations of individual patient profiles are spatially mapped in an unsupervised manner according to their input clinical parameters. This technique was then applied to a large real-world clinical dataset of over 12,000 patients with an illness compatible with dengue infection in Ho Chi Minh City, Vietnam, between 1999 and 2021. Dengue is a systemic viral disease that imposes a significant health and economic burden worldwide, and up to 5% of hospitalised patients develop life-threatening complications. Results: The latent space produced by the selected autoencoder aligns with established clinical characteristics exhibited by patients with dengue infection, as well as with features of disease progression. Similar clinical phenotypes are represented close to each other in the latent space and clustered according to outcomes broadly described by the World Health Organisation dengue guidelines. Balancing distance metrics and density metrics produced results that cover most of the latent space and improved visualisation whilst preserving utility, with similar patients grouped closer together. In this case, the balance was achieved using the sigmoid activation function and one hidden layer with three neurons, in addition to the latent dimension layer, which produces the output (Pearson, 0.840; Spearman, 0.830; Procrustes, 0.301; GMM, 0.321). Conclusion: This study demonstrates that, when adequately configured, autoencoders can produce two-dimensional representations of a complex dataset that conserve the distance relationships between points. The output visualisation groups patients with clinically relevant features closely together and inherently supports user interpretability. Work is underway to incorporate these findings into an electronic clinical decision support system to guide individual patient management.
KW - autoencoder (AE) neural networks
KW - clinical decision support system (CDSS)
KW - dengue
KW - similarity retrieval
KW - unsupervised learning
KW - visualisation
UR - https://www.scopus.com/pages/publications/85149725331
U2 - 10.3389/fdgth.2023.1057467
DO - 10.3389/fdgth.2023.1057467
M3 - Article
C2 - 36910574
AN - SCOPUS:85149725331
SN - 2673-253X
VL - 5
JO - Frontiers in Digital Health
JF - Frontiers in Digital Health
M1 - 1057467
ER -