Using Baidu search index to monitor and predict newly diagnosed cases of HIV/AIDS, syphilis and gonorrhea in China: estimates from a vector autoregressive (VAR) model

Ruonan Huang, Ganfeng Luo, Qibin Duan, Lei Zhang, Qingpeng Zhang, Weiming Tang, M. Kumi Smith, Jinghua Li, Huachun Zou

Research output: Contribution to journal › Article › Research › peer-review

3 Citations (Scopus)

Abstract

OBJECTIVES: Internet search engine data have been widely used to monitor and predict infectious diseases, and existing studies have found correlations between search data and HIV/AIDS epidemics. We aimed to extend this literature by exploring the feasibility of using search data to monitor and predict the number of newly diagnosed cases of HIV/AIDS, syphilis and gonorrhoea in China.

METHODS: We used a vector autoregressive (VAR) model that combined the number of newly diagnosed cases with the Baidu search index to predict monthly newly diagnosed cases of HIV/AIDS, syphilis and gonorrhoea in China. The procedure comprised: (1) keyword selection and filtering; (2) construction of a composite search index; and (3) model fitting with training data from January 2011 to October 2016, followed by evaluation of prediction performance with validation data from November 2016 to October 2017.

RESULTS: There was a close correlation between the monthly number of newly diagnosed cases and the composite search index (Spearman's rank correlation coefficients: 0.777 for HIV/AIDS, 0.590 for syphilis and 0.633 for gonorrhoea; p<0.05 for all). All R² values exceeded 85% and all mean absolute percentage errors were below 11%, indicating that the VAR model fitted the data well and predicted well in this setting.

CONCLUSIONS: Our study indicates the potential feasibility of using Baidu search data to monitor and predict the number of newly diagnosed cases of HIV/AIDS, syphilis and gonorrhoea in China.
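The modelling and validation steps named in the abstract (fit a VAR on 70 training months, forecast the next 12, score with the mean absolute percentage error) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the lag order, the estimation software or the index construction, so the lag of 1, the ordinary least squares fit, the function names `fit_var`, `forecast_var` and `mape`, and the synthetic two-column series (cases, composite search index) are all assumptions.

```python
import numpy as np


def fit_var(series, lags=1):
    """Fit a VAR(lags) by ordinary least squares.

    series: array of shape (T, k), one column per variable.
    Returns coefficients of shape (1 + k * lags, k); row 0 is the intercept.
    """
    series = np.asarray(series, dtype=float)
    T = series.shape[0]
    # Each design row is [1, y_{t-1}, ..., y_{t-lags}].
    X = np.array([
        np.concatenate([[1.0]] + [series[t - j] for j in range(1, lags + 1)])
        for t in range(lags, T)
    ])
    Y = series[lags:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef


def forecast_var(coef, history, steps, lags=1):
    """Iterate the fitted VAR forward, feeding each forecast back in."""
    hist = [np.asarray(row, dtype=float) for row in history[-lags:]]
    out = []
    for _ in range(steps):
        x = np.concatenate([[1.0]] + [hist[-j] for j in range(1, lags + 1)])
        nxt = x @ coef
        out.append(nxt)
        hist.append(nxt)
    return np.array(out)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)


# Synthetic stand-in data (NOT the study data): 82 months, two variables.
rng = np.random.default_rng(0)
c = np.array([200.0, 20.0])                 # assumed intercepts
A = np.array([[0.6, 2.0], [0.0, 0.7]])      # assumed lag-1 coefficients
data = np.empty((82, 2))
data[0] = [830.0, 65.0]
for t in range(1, 82):
    data[t] = c + A @ data[t - 1] + rng.normal(scale=[20.0, 3.0])

# Jan 2011 to Oct 2016 is 70 months; Nov 2016 to Oct 2017 is 12 months.
train, valid = data[:70], data[70:]
coef = fit_var(train, lags=1)
forecasts = forecast_var(coef, train, steps=len(valid), lags=1)
val_mape = mape(valid[:, 0], forecasts[:, 0])  # error on the "cases" column
```

In practice the lag order would be chosen by an information criterion and the series checked for stationarity before fitting; a dedicated time-series library would normally replace the hand-written least squares step.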

Original language: English
Article number: e036098
Number of pages: 7
Journal: BMJ Open
Volume: 10
Issue number: 3
Publication status: Published - 24 Mar 2020

Keywords

  • epidemiology
  • infection control
  • statistics & research methods
