
Genomics of Indian SARS-CoV-2: Implications in Genetic Diversity, Possible Origin and Spread of Virus


Affiliations
1 Department of Microbiology and Cell Biology, Indian Institute of Science, Bengaluru 560 012, India
 

World Health Organization (WHO) declared COVID-19 a pandemic on 11 March 2020. Comparing genome sequences from diverse locations reveals the genetic diversity among viruses, which helps in ascertaining viral virulence, disease pathogenicity, and the origin and spread of SARS-CoV-2 between countries. The aim of this study is to determine the genetic diversity among Indian SARS-CoV-2 isolates. Initial examination of the phylogenetic data of SARS-CoV-2 genomes (n = 3123) from different continents deposited at GISAID (Global Initiative on Sharing All Influenza Data) revealed multiple origins for Indian isolates. An in-depth analysis of 558 viral genomes derived from samples representing the USA, Europe, China, East Asia, South Asia, Oceania, the Middle East and India revealed that most Indian samples fall into two clusters, A and B. Sub-cluster A1 showed more similarity to Oceania and Kuwait samples, while sub-cluster A2 grouped with South Asian samples. In contrast, cluster B grouped with samples from Europe, the Middle East and South Asia. Viral clade analysis of Indian samples revealed a high occurrence of the G clade (D614G in spike protein; 37%), which is a European clade, followed by the I clade (V378I in ORF1ab; 12%), which is an Oceania clade with samples having Iran connections. While sub-cluster A1 is enriched with the I clade, cluster B is enriched with the G clade. Thus, our study identifies that Indian SARS-CoV-2 viruses are enriched with the G and I clades, in addition to 50% of samples with unknown genetic variations. The potential origins are countries mainly from Europe, the Middle East, Oceania and South Asia, which strongly implies that the virus spread through the most travelled countries. The study also emphasizes the importance of pathogen genomics through phylogenetic analysis for discovering viral genetic diversity and understanding viral transmission dynamics, with an eventual grasp of viral virulence and disease pathogenesis.
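
The clade tally described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the marker table uses only the two substitutions named above (D614G in spike, V378I in ORF1ab), while the gene label "S", the function names, and the toy samples are assumptions made for illustration and do not reproduce the study's data.

```python
from collections import Counter

# Marker substitutions named in the abstract. The (gene, substitution)
# tuple layout and the gene label "S" for spike are assumptions.
CLADE_MARKERS = {
    "G": ("S", "D614G"),
    "I": ("ORF1ab", "V378I"),
}


def assign_clade(mutations):
    """Return the first clade whose marker is in `mutations`, else 'unknown'."""
    for clade, marker in CLADE_MARKERS.items():
        if marker in mutations:
            return clade
    return "unknown"


def clade_frequencies(samples):
    """Map each clade label to its percentage among the given samples."""
    if not samples:
        return {}
    counts = Counter(assign_clade(muts) for muts in samples.values())
    total = sum(counts.values())
    return {clade: 100 * n / total for clade, n in counts.items()}


# Made-up toy samples (hypothetical names and mutation sets).
toy_samples = {
    "sample_1": {("S", "D614G")},
    "sample_2": {("ORF1ab", "V378I")},
    "sample_3": set(),
    "sample_4": {("S", "D614G")},
}

print(clade_frequencies(toy_samples))
```

Samples carrying neither marker are labelled "unknown", which mirrors how the abstract reports samples with unknown genetic variations alongside the G and I clades.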

Keywords

COVID-19, Genetic Diversity, Pandemic, SARS-CoV-2, Severe Acute Respiratory Syndrome.

Authors

Mainak Mondal
Department of Microbiology and Cell Biology, Indian Institute of Science, Bengaluru 560 012, India
Ankita Lawarde
Department of Microbiology and Cell Biology, Indian Institute of Science, Bengaluru 560 012, India
Kumaravel Somasundaram
Department of Microbiology and Cell Biology, Indian Institute of Science, Bengaluru 560 012, India

DOI: https://doi.org/10.18520/cs%2Fv118%2Fi11%2F1786-1791