TY - JOUR
T1 - Genomic mutations and changes in protein secondary structure and solvent accessibility of SARS-CoV-2 (COVID-19 virus)
AU - Nguyen, Thanh Thi
AU - Pathirana, Pubudu N.
AU - Nguyen, Thin
AU - Nguyen, Quoc Viet Hung
AU - Bhatti, Asim
AU - Nguyen, Dinh C.
AU - Nguyen, Dung Tien
AU - Nguyen, Ngoc Duy
AU - Creighton, Douglas
AU - Abdelrazek, Mohamed
PY - 2021/2/10
Y1 - 2021/2/10
N2 - Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly pathogenic virus that has caused the global COVID-19 pandemic. Tracing the evolution and transmission of the virus is crucial for responding to and controlling the pandemic through appropriate intervention strategies. This paper reports and analyses genomic mutations in the coding regions of SARS-CoV-2 and the probable resulting changes in protein secondary structure and solvent accessibility, which are predicted using deep learning models. Prediction results suggest that mutation D614G in the virus spike protein, which has attracted much attention from researchers, is unlikely to change protein secondary structure or relative solvent accessibility. Based on 6324 viral genome sequences, we create a spreadsheet dataset of point mutations that can facilitate the investigation of SARS-CoV-2 from many perspectives, especially in tracing the evolution and worldwide spread of the virus. Our analysis results also show that coding genes E, M, ORF6, ORF7a, ORF7b and ORF10 are the most stable and are therefore potentially suitable targets for vaccine and drug development.
AB - Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly pathogenic virus that has caused the global COVID-19 pandemic. Tracing the evolution and transmission of the virus is crucial for responding to and controlling the pandemic through appropriate intervention strategies. This paper reports and analyses genomic mutations in the coding regions of SARS-CoV-2 and the probable resulting changes in protein secondary structure and solvent accessibility, which are predicted using deep learning models. Prediction results suggest that mutation D614G in the virus spike protein, which has attracted much attention from researchers, is unlikely to change protein secondary structure or relative solvent accessibility. Based on 6324 viral genome sequences, we create a spreadsheet dataset of point mutations that can facilitate the investigation of SARS-CoV-2 from many perspectives, especially in tracing the evolution and worldwide spread of the virus. Our analysis results also show that coding genes E, M, ORF6, ORF7a, ORF7b and ORF10 are the most stable and are therefore potentially suitable targets for vaccine and drug development.
UR - http://www.scopus.com/inward/record.url?scp=85101468226&partnerID=8YFLogxK
U2 - 10.1038/s41598-021-83105-3
DO - 10.1038/s41598-021-83105-3
M3 - Article
C2 - 33568759
AN - SCOPUS:85101468226
SN - 2045-2322
VL - 11
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 3487
ER -