
Improving the Lead Conversion Rate for Online Course Selection Using Binary Classification Approach


Affiliations
1 Department of Information Science, JSS Academy of Technical Education, Bangalore, India
     



This paper addresses a binary classification problem using a unique data set on online education, 'Lead Scoring X Online Education', taken from the Kaggle repository. A lead is a user who is likely to select a course, as identified by the company, named X, from its analysis of user activity on the website. The lead conversion rate is the percentage of registered users who end up selecting a course, and the company's current rate is low. To address this, a model is built that studies the activity of users who previously selected a course and identifies potential leads by predicting whether a new user will select a course. This lets the company target only the potential leads rather than contacting every registered user. The proposed methodology aims to show that the lead conversion rate can be improved using the logistic regression, SVM, and Random Forest classification algorithms. In the experiments, all three algorithms were found to perform accurately.
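
The idea of targeting only predicted leads can be illustrated with a minimal sketch. It is not the authors' code and does not use the Kaggle data: the two activity features (`time_on_site`, `pages_viewed`), the labelling rule, and the 0.5 threshold are assumptions made for illustration, and logistic regression is written from scratch with the standard library only. The sketch compares the conversion rate over all users with the rate among users the model flags as leads.

```python
import math
import random


def conversion_rate(labels):
    """Percentage of users in `labels` (1 = selected a course) who converted."""
    return 100.0 * sum(labels) / len(labels) if labels else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Predicted probability that a user with features `x` selects a course."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(features, labels, lr=1.0, epochs=500):
    """Fit logistic regression by batch gradient descent on the mean log loss."""
    weights, bias = [0.0] * len(features[0]), 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(features, labels):
            error = score(weights, bias, x) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def make_users(count, rng):
    """Synthetic users: two hypothetical activity features scaled to [0, 1]."""
    features = [[rng.random(), rng.random()] for _ in range(count)]
    # Assumed rule for the toy data: highly active users select a course.
    labels = [1 if time_on_site + pages_viewed > 1.2 else 0
              for time_on_site, pages_viewed in features]
    return features, labels


def run_demo(seed=0):
    """Return (overall rate, rate among predicted leads) on held-out users."""
    rng = random.Random(seed)
    train_x, train_y = make_users(300, rng)
    test_x, test_y = make_users(100, rng)
    weights, bias = train_logistic(train_x, train_y)
    # Contact only users whose predicted probability is at least 0.5.
    targeted = [y for x, y in zip(test_x, test_y)
                if score(weights, bias, x) >= 0.5]
    return conversion_rate(test_y), conversion_rate(targeted)


if __name__ == "__main__":
    overall, targeted = run_demo()
    print(f"conversion rate, all registered users: {overall:.1f}%")
    print(f"conversion rate, predicted leads only: {targeted:.1f}%")
```

On this toy data the rate among predicted leads comes out higher than the overall rate, which is the effect the abstract describes. The same comparison could be repeated with SVM or Random Forest classifiers from a library in place of `train_logistic`.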

Keywords

Data Preprocessing, Data Mining, Supervised Learning, Binary Classification



Authors

Venkatesh R Pai
Department of Information Science, JSS Academy of Technical Education, Bangalore, India
Malini M Patil
Department of Information Science, JSS Academy of Technical Education, Bangalore, India

