Data Mining and Analysis by R Language for Business Research: A Case Study on Stress and its Influence on Health


Affiliations
1 Business Analytics, Dhruva College of Management, Hyderabad, India
     



R is not only a statistical suite but also efficient data mining software for data manipulation, calculation and graphical display. As a language, R also has effective data handling and storage facilities, along with a suite of operators for calculations on arrays, in particular matrices. R was developed from a simple and effective programming language (called "S") that includes conditionals, loops, user-defined recursive functions, and input and output facilities. Methods: In this paper, the data mining capabilities of R are explained with the help of a study based on secondary data obtained from authenticated sources. The study seeks to understand stress in relation to other factors such as heavy drinking, perceived health and life satisfaction. As mentioned, the data used are secondary in nature and, in their crude form, convey little to the user. However, the systematic application of data mining tools such as correlation and MANOVA revealed certain important relationships and ties. Conclusions: All variables are strongly correlated, with Karl Pearson correlation coefficients ranging from 0.73 to 0.99. In the significance tests, the results for all variables are consistent with the alternative hypothesis, which means that the association is not zero. In MANOVA, the null hypothesis is rejected because the p-value is less than 0.05. Most interestingly, the variables behave like cohorts, resulting in a cohort effect.
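
The two tools named in the abstract, Pearson correlation with a significance test and MANOVA, can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the paper's own script: the data are simulated, and the data frame `health`, its column names, and the grouping factor `region` are hypothetical stand-ins for the study's secondary data.

```r
# Illustrative sketch only: simulated data, NOT the data used in the study.
set.seed(42)
n <- 60

# Hypothetical grouping factor (the paper's actual factor is not stated here)
region <- factor(rep(c("A", "B", "C"), each = n / 3))

# Simulated variables named after the factors mentioned in the abstract
stress <- rnorm(n, mean = 22 + as.integer(region), sd = 2)
health <- data.frame(
  region            = region,
  stress            = stress,
  heavy_drinking    = 0.8 * stress + rnorm(n, sd = 1),
  perceived_health  = 60 - 1.1 * stress + rnorm(n, sd = 2),
  life_satisfaction = 95 - 0.5 * stress + rnorm(n, sd = 1)
)

# Pearson correlation matrix of the numeric variables
vars <- c("stress", "heavy_drinking", "perceived_health", "life_satisfaction")
print(round(cor(health[, vars], method = "pearson"), 2))

# Significance test for one pair; the null hypothesis is zero correlation
ct <- cor.test(health$stress, health$heavy_drinking, method = "pearson")
print(ct$estimate)
print(ct$p.value)

# One-way MANOVA: do the variable means differ jointly across groups?
fit <- manova(
  cbind(stress, heavy_drinking, perceived_health, life_satisfaction) ~ region,
  data = health
)
print(summary(fit, test = "Pillai"))
```

A p-value below 0.05 from `cor.test()` would lead to rejecting the hypothesis of zero correlation for that pair, and a p-value below 0.05 in the MANOVA summary would lead to rejecting the hypothesis of equal mean vectors across groups, which mirrors the decision rule described in the abstract.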

Keywords

R Language, RStudio, Secondary Data, Data Mining, Correlation, MANOVA

Authors

Kamakshaiah Musunuru
Business Analytics, Dhruva College of Management, Hyderabad, India
