TY - JOUR
T1 - Identification of Key Influencers for Secondary Distribution of HIV Self-Testing Kits Among Chinese Men Who Have Sex With Men
T2 - Development of an Ensemble Machine Learning Approach
AU - Jing, Fengshi
AU - Ye, Yang
AU - Zhou, Yi
AU - Ni, Yuxin
AU - Yan, Xumeng
AU - Lu, Ying
AU - Ong, Jason
AU - Tucker, Joseph D.
AU - Wu, Dan
AU - Xiong, Yuan
AU - Xu, Chen
AU - He, Xi
AU - Huang, Shanzi
AU - Li, Xiaofeng
AU - Jiang, Hongbo
AU - Wang, Cheng
AU - Dai, Wencan
AU - Huang, Liqun
AU - Mei, Wenhua
AU - Cheng, Weibin
AU - Zhang, Qingpeng
AU - Tang, Weiming
N1 - Publisher Copyright: ©Fengshi Jing, Yang Ye, Yi Zhou, Yuxin Ni, Xumeng Yan, Ying Lu, Jason Ong, Joseph D Tucker, Dan Wu, Yuan Xiong, Chen Xu, Xi He, Shanzi Huang, Xiaofeng Li, Hongbo Jiang, Cheng Wang, Wencan Dai, Liqun Huang, Wenhua Mei, Weibin Cheng, Qingpeng Zhang, Weiming Tang. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 23.11.2023.
PY - 2023/11/23
Y1 - 2023/11/23
N2 - BACKGROUND: HIV self-testing (HIVST) has been rapidly scaled up, and additional strategies can further expand testing uptake. Secondary distribution involves people (defined as "indexes") applying for multiple kits and subsequently sharing them with people (defined as "alters") in their social networks. However, identifying key influencers is difficult. OBJECTIVE: This study aimed to develop an innovative ensemble machine learning approach to identify key influencers among Chinese men who have sex with men (MSM) for secondary distribution of HIVST kits. METHODS: We defined three types of key influencers: (1) key distributors who can distribute more kits, (2) key promoters who can contribute to finding first-time testing alters, and (3) key detectors who can help to find positive alters. Four machine learning models (logistic regression, support vector machine, decision tree, and random forest) were trained to identify key influencers. An ensemble learning algorithm was adopted to combine these 4 models. For comparison with our machine learning models, self-evaluated leadership scales were used as the human identification approach. Four performance metrics (accuracy, precision, recall, and F1-score) were used to evaluate the machine learning models and the human identification approach. Simulation experiments were carried out to validate our approach. RESULTS: We included 309 indexes (our sample size) who were eligible and applied for multiple test kits; they distributed these kits to 269 alters. We compared the performance of the machine learning classification and ensemble learning models with that of the human identification approach based on self-evaluated leadership scales in terms of the 2 nearest cutoffs. Our approach outperformed human identification (based on the cutoff of the self-reported scales). It exceeded human identification in accuracy by an average of 11.0%, could distribute 18.2% (95% CI 9.9%-26.5%) more kits, and could find 13.6% (95% CI 1.9%-25.3%) more first-time testing alters and 12.0% (95% CI -14.7% to 38.7%) more positive-testing alters. Our approach could also increase the simulated intervention's efficiency by 17.7% (95% CI -3.5% to 38.8%) compared to that of human identification. CONCLUSIONS: We built machine learning models to identify key influencers among Chinese MSM who were more likely to engage in secondary distribution of HIVST kits. TRIAL REGISTRATION: Chinese Clinical Trial Registry (ChiCTR) ChiCTR1900025433; https://www.chictr.org.cn/showproj.html?proj=42001.
KW - HIV self-testing
KW - key influencers identification
KW - machine learning
KW - men who have sex with men
KW - MSM
KW - secondary distribution
UR - http://www.scopus.com/inward/record.url?scp=85177790186&partnerID=8YFLogxK
U2 - 10.2196/37719
DO - 10.2196/37719
M3 - Article
C2 - 37995110
AN - SCOPUS:85177790186
SN - 1439-4456
VL - 25
JO - Journal of Medical Internet Research
JF - Journal of Medical Internet Research
M1 - e37719
ER -