
Role of statistics in the era of data science


Affiliations
1 Chennai Mathematical Institute, Chennai 603 103, India
 

Statistics evolved as a science in an era when little data was available and the effort went into extracting the maximum information from it. Are the techniques developed in those times still relevant in the era of data science? We illustrate with examples that several statistical concepts developed over the last 150 years are as relevant in this era as they were then.

Keywords

Analytics, big data, bias, data-science, regression, statistics.



Authors

Rajeeva L. Karandikar
Chennai Mathematical Institute, Chennai 603 103, India


DOI: https://doi.org/10.18520/cs%2Fv121%2Fi8%2F1016-1021