TY - JOUR
T1 - Predicting the 10-year risk of cataract surgery using machine learning techniques on questionnaire data
T2 - Findings from the 45 and Up Study
AU - Wang, Wei
AU - Han, Xiaotong
AU - Zhang, Jiaqing
AU - Shang, Xianwen
AU - Ha, Jason
AU - Liu, Zhenzhen
AU - Zhang, Lei
AU - Luo, Lixia
AU - He, Mingguang
N1 - Publisher Copyright:
© Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
PY - 2022/11
Y1 - 2022/11
N2 - Background/aims: To investigate the feasibility and accuracy of using machine learning (ML) techniques on self-reported questionnaire data to predict the 10-year risk of cataract surgery, and to identify meaningful predictors of cataract surgery in middle-aged and older Australians. Methods: Baseline information regarding demographic, socioeconomic, medical history and family history, lifestyle, dietary and self-rated health status were collected as risk factors. Cataract surgery events were confirmed by the Medicare Benefits Schedule Claims dataset. Three ML algorithms (random forests [RF], gradient boosting machine and deep learning) and one traditional regression algorithm (logistic model) were compared on the accuracy of their predictions for the risk of cataract surgery. The performance was assessed using 10-fold cross-validation. The main outcome measures were areas under the receiver operating characteristic curves (AUCs). Results: In total, 207 573 participants, aged 45 years and above without a history of cataract surgery at baseline, were recruited from the 45 and Up Study. The performance of gradient boosting machine (AUC 0.790, 95% CI 0.785 to 0.795), RF (AUC 0.785, 95% CI 0.780 to 0.790) and deep learning (AUC 0.781, 95% CI 0.775 to 0.786) were robust and outperformed the traditional logistic regression method (AUC 0.767, 95% CI 0.762 to 0.773, all p<0.05). Age, self-rated eye vision and health insurance were consistently identified as important predictors in all models. Conclusions: The study demonstrated that ML modelling was able to reasonably accurately predict the 10-year risk of cataract surgery based on questionnaire data alone and was marginally superior to the conventional logistic model.
AB - Background/aims: To investigate the feasibility and accuracy of using machine learning (ML) techniques on self-reported questionnaire data to predict the 10-year risk of cataract surgery, and to identify meaningful predictors of cataract surgery in middle-aged and older Australians. Methods: Baseline information regarding demographic, socioeconomic, medical history and family history, lifestyle, dietary and self-rated health status were collected as risk factors. Cataract surgery events were confirmed by the Medicare Benefits Schedule Claims dataset. Three ML algorithms (random forests [RF], gradient boosting machine and deep learning) and one traditional regression algorithm (logistic model) were compared on the accuracy of their predictions for the risk of cataract surgery. The performance was assessed using 10-fold cross-validation. The main outcome measures were areas under the receiver operating characteristic curves (AUCs). Results: In total, 207 573 participants, aged 45 years and above without a history of cataract surgery at baseline, were recruited from the 45 and Up Study. The performance of gradient boosting machine (AUC 0.790, 95% CI 0.785 to 0.795), RF (AUC 0.785, 95% CI 0.780 to 0.790) and deep learning (AUC 0.781, 95% CI 0.775 to 0.786) were robust and outperformed the traditional logistic regression method (AUC 0.767, 95% CI 0.762 to 0.773, all p<0.05). Age, self-rated eye vision and health insurance were consistently identified as important predictors in all models. Conclusions: The study demonstrated that ML modelling was able to reasonably accurately predict the 10-year risk of cataract surgery based on questionnaire data alone and was marginally superior to the conventional logistic model.
KW - diagnostic tests/investigation
KW - epidemiology
KW - public health
UR - http://www.scopus.com/inward/record.url?scp=85106958674&partnerID=8YFLogxK
U2 - 10.1136/bjophthalmol-2020-318609
DO - 10.1136/bjophthalmol-2020-318609
M3 - Article
C2 - 34039562
AN - SCOPUS:85106958674
SN - 0007-1161
VL - 106
SP - 1503
EP - 1507
JO - British Journal of Ophthalmology
JF - British Journal of Ophthalmology
IS - 11
ER -