
Optimizing the Prediction of Bagging and Boosting


Affiliations
1 Department of Computer Science, Vijaya College, Jayanagar, Bangalore - 560011, Karnataka, India
2 Department of Computer Applications, DG Vaishnav College, Chennai - 600106, Tamil Nadu, India
 

Background/Objectives: For more than a decade, ensemble methods such as Bagging and Boosting have drawn considerable attention from researchers aiming to improve prediction accuracy over single classifiers. However, some recent studies have observed that Bagging and Boosting do not always improve accuracy; they help only when the base classifier is unstable. To overcome this problem, this paper proposes a Hybrid Ensemble Model with two phases of preprocessing and evaluates it using 9 classifiers on 3 benchmark data sets from the UCI Repository. Methods: In the first preprocessing phase, feature selection is performed using CFS to select the attributes most highly correlated with the class; in the second phase, the K-means clustering algorithm is applied to remove noisy instances. Finally, the instances remaining after both phases are used to train Bagging and Boosting ensembles, building the final Hybrid Ensemble Classifier Model (HECM) under 10-fold cross-validation. The results were evaluated using the confusion matrix and performance measures such as accuracy, kappa, mean absolute error, and time taken to build the model. Findings: The results showed that the proposed model is more efficient than the existing models and improved accuracy for both stable and unstable classifiers, by 2% to 30.14% over the traditional ensemble model, depending on the complexity of the algorithm.
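To make the first preprocessing phase concrete, the following is a minimal Python sketch of correlation-based feature selection. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which correlation measure or search strategy was used, so the sketch assumes absolute Pearson correlation and a greedy forward search over the standard CFS merit, k * r_cf / sqrt(k + k(k-1) * r_ff). The function names (`pearson`, `cfs_merit`, `cfs_forward_select`), the `min_gain` threshold, and the toy data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(columns, target, subset):
    """CFS merit of a feature subset: k * r_cf / sqrt(k + k*(k-1) * r_ff).

    r_cf is the mean |feature-class| correlation and r_ff is the mean
    |feature-feature| correlation. Pearson is an assumption of this sketch.
    """
    k = len(subset)
    r_cf = sum(abs(pearson(columns[i], target)) for i in subset) / k
    pairs = [(i, j) for i in subset for j in subset if i < j]
    if pairs:
        r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    else:
        r_ff = 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, target, min_gain=1e-9):
    """Greedy forward search: add the feature that raises the merit most,
    and stop when no candidate improves it by more than min_gain."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        scored = [(cfs_merit(columns, target, selected + [i]), i) for i in remaining]
        merit, idx = max(scored, key=lambda t: t[0])
        if merit <= best + min_gain:
            break
        selected.append(idx)
        remaining.remove(idx)
        best = merit
    return selected, best


# Toy data (hypothetical): three feature columns and a binary class label.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [1, 2, 1, 2, 5, 6, 5, 6],      # feature 0: strongly related to the class
    [2, 4, 2, 4, 10, 12, 10, 12],  # feature 1: scaled copy of feature 0 (redundant)
    [1, 2, 2, 1, 1, 2, 2, 1],      # feature 2: uncorrelated with the class
]
selected, merit = cfs_forward_select(columns, target)
print(selected, round(merit, 4))
```

On this toy data the search keeps only feature 0: the scaled copy adds no merit because it is fully redundant, and the uncorrelated feature lowers the merit. The remaining stages described above (K-means noise removal, then Bagging and Boosting under 10-fold cross-validation) would operate on the selected columns only.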

Keywords

Bagging, Boosting, Classification, Correlation Based Feature Selection (CFS), Hybrid, K-Means

Authors

B. V. Sumana
Department of Computer Science, Vijaya College, Jayanagar, Bangalore - 560011, Karnataka, India
T. Santhanam
Department of Computer Applications, DG Vaishnav College, Chennai - 600106, Tamil Nadu, India




DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i35%2F124565