
Enrichment of Ensemble Learning using K-Modes Random Sampling


Affiliations
1 Department of Computer Applications, Madurai Kamaraj University, India
     



An ensemble of classifiers combines the prediction models of more than one classifier into a single model for classifying new instances. Unbiased samples can help ensemble classifiers build an efficient prediction model. Existing sampling techniques fail to give unbiased samples. To overcome this problem, the paper introduces a k-modes random sampling technique, which combines the k-modes clustering algorithm with simple random sampling to draw the sample from the dataset. The paper shows the impact of the random sampling technique on the ensemble learning algorithm. Because random selection is carried out with the k-modes random sampling technique, the sample reflects the characteristics of the entire dataset.
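The abstract does not spell out how the clustering and sampling steps are combined. The following is a minimal sketch of one plausible reading, assuming that categorical rows are first grouped with k-modes and that a simple random sample of the same fraction is then drawn from each cluster. The function names (`mismatches`, `k_modes`, `k_modes_random_sample`), the `fraction` parameter, the random initialisation of the modes, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter


def mismatches(a, b):
    """Simple matching dissimilarity: count of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(rows, k, max_iter=20, seed=0):
    """Minimal k-modes sketch: returns one cluster label per row."""
    rng = random.Random(seed)
    modes = rng.sample(rows, k)  # assumption: initial modes are k random rows
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row joins the cluster with the nearest mode.
        new_labels = [
            min(range(k), key=lambda c: mismatches(row, modes[c]))
            for row in rows
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: each mode takes the most frequent value per attribute.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # an empty cluster keeps its previous mode
                modes[c] = tuple(
                    Counter(col).most_common(1)[0][0] for col in zip(*members)
                )
    return labels


def k_modes_random_sample(rows, k, fraction, seed=0):
    """Simple random sample of roughly `fraction` of the rows in each cluster."""
    rng = random.Random(seed)
    labels = k_modes(rows, k, seed=seed)
    sample = []
    for c in range(k):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        if not members:
            continue
        n = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n))  # sampling without replacement
    return sample


# Toy categorical dataset (hypothetical): 6 rows of one kind, 4 of another.
data = [("red", "small", "round")] * 6 + [("blue", "large", "square")] * 4
sample = k_modes_random_sample(data, k=2, fraction=0.5)
```

Sampling the same fraction from every cluster is what would let the sample mirror the cluster structure of the full dataset. Other allocations, such as a fixed number of rows per cluster, are equally compatible with the abstract's wording.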

Keywords

Sampling, Ensemble Classifiers, Cluster Random Sample.



Authors

Balamurugan Mahalingam
Department of Computer Applications, Madurai Kamaraj University, India
S. Kannan
Department of Computer Applications, Madurai Kamaraj University, India
Vairaprakash Gurusamy
Department of Computer Applications, Madurai Kamaraj University, India

