TY - JOUR
T1 - Clustering of countries for COVID-19 cases based on disease prevalence, health systems and environmental indicators
AU - Rizvi, Syeda Amna
AU - Umair, Muhammad
AU - Cheema, Muhammad Aamir
N1 - Publisher Copyright: © 2021 Elsevier Ltd
PY - 2021/10
Y1 - 2021/10
N2 - The coronavirus has a high basic reproduction number (R0) and has caused the global COVID-19 pandemic. Governments are implementing lockdowns that are leading to economic fallout in many countries. Policy makers can make better decisions if provided with indicators connected with the spread of the disease. This study aims to cluster countries using social, economic, health and environmental metrics that affect the spread of the disease, so that policies to control it can be implemented. Countries with similar factors can thus take proactive steps to fight the pandemic. Data are acquired for 79 countries, and 18 feature variables (factors associated with COVID-19 spread) are selected. Pearson product-moment correlation analysis is performed between each feature variable and, individually, cumulative deaths and cumulative confirmed cases, to gain insight into the relation of these factors to the spread of COVID-19. The unsupervised k-means algorithm is used, and the feature set includes economic and environmental indicators and disease prevalence along with COVID-19 variables. The learning model groups the countries into 4 clusters on the basis of all 18 feature variables. We also present an analysis of the correlation between the selected feature variables and COVID-19 confirmed cases and deaths. Prevalence of underlying diseases shows a strong correlation with COVID-19, whereas environmental health indicators are weakly correlated with COVID-19.
KW - Clustering methods
KW - COVID-19
KW - COVID-19 confirmed cases
KW - COVID-19 death cases
KW - Disease prevalence
KW - K-Means
KW - Pearson correlation
KW - Second wave
KW - Unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85111001315&partnerID=8YFLogxK
U2 - 10.1016/j.chaos.2021.111240
DO - 10.1016/j.chaos.2021.111240
M3 - Article
AN - SCOPUS:85111001315
VL - 151
JO - Chaos, Solitons and Fractals
JF - Chaos, Solitons and Fractals
SN - 0960-0779
M1 - 111240
ER -