Enrichment of Ensemble Learning using K-Modes Random Sampling
An ensemble of classifiers combines the prediction models of more than one classifier into a single model for classifying new instances. Unbiased samples can help ensemble classifiers build an efficient prediction model, but existing sampling techniques fail to give unbiased samples. To overcome this problem, the paper introduces a k-modes random sampling technique, which combines the k-modes clustering algorithm with simple random sampling to draw the sample from the dataset. The paper shows the impact of this random sampling technique on the ensemble learning algorithm. Because the random selection is carried out through k-modes random sampling, the sample reflects the characteristics of the entire dataset.
Keywords
Sampling, Ensemble Classifiers, Cluster Random Sample.
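The abstract names the two ingredients of the technique (k-modes clustering followed by simple random sampling) but not its details, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes purely categorical records stored as tuples, the simple matching (Hamming) dissimilarity usually paired with k-modes, and a sample drawn from every cluster in proportion to the cluster's size. The function names `k_modes` and `k_modes_random_sample`, the `fraction` parameter, and the random initialisation of the modes are all illustrative assumptions.

```python
import random
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each attribute (column) of rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(rows, k, rng, max_iter=20):
    """Cluster categorical rows around k modes; return one label per row."""
    modes = rng.sample(rows, k)  # assumption: initial modes are random records
    labels = []
    for _ in range(max_iter):
        # Assign every row to its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(row, modes[j])) for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Recompute each mode from its current members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old mode if a cluster is empty
                modes[j] = column_modes(members)
    return labels


def k_modes_random_sample(rows, k, fraction, seed=0):
    """Draw a simple random sample from each k-modes cluster.

    Assumption: each cluster contributes about `fraction` of its members
    (at least one), so the sample mirrors the cluster proportions.
    """
    rng = random.Random(seed)
    labels = k_modes(rows, k, rng)
    sample = []
    for j in range(k):
        members = [r for r, lab in zip(rows, labels) if lab == j]
        if not members:
            continue
        n = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n))  # sampling without replacement
    return sample
```

Under these assumptions, the returned sample would then serve as the training set for whichever ensemble learner is being evaluated. Sampling inside each cluster rather than over the whole dataset is what lets small groups of similar records stay represented.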