Performance Analysis of Regression and Classification Models in the Prediction of Breast Cancer
Objective: To propose an automated diagnostic system for the early detection of breast cancer. Methods: Machine learning algorithms are used to classify a tumor accurately as either malignant or benign while identifying the minimum number of image features required. A comparative study of several classification approaches, namely Decision Tree, Support Vector Machine, K-Nearest Neighbor and Random Forest, has also been conducted, with a focus on cross validation to identify the best-performing model. Findings: The study shows that the Random Forest classifier gives the highest accuracy. It also highlights that cross validation and fine tuning are necessary to prevent overfitting. Improvements: The selection of parameters plays a very important role in correct classification, as multicollinearity among attributes can render classifier models ineffective.
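The comparison described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn and its bundled Wisconsin Diagnostic Breast Cancer dataset, which may differ from the data used in the study, and the function name `compare_classifiers`, the fold count, the feature scaling step and all hyperparameters are illustrative choices rather than values taken from the paper.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(n_splits=5, seed=0):
    """Return {model name: (mean accuracy, std)} from stratified k-fold CV.

    Hypothetical helper: dataset, fold count and hyperparameters are
    assumptions made for illustration, not the study's settings.
    """
    X, y = load_breast_cancer(return_X_y=True)

    # The four classifiers named in the abstract. SVM and KNN are wrapped in
    # a pipeline so that scaling is fitted on each training fold only.
    models = {
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
        "K-Nearest Neighbor": make_pipeline(
            StandardScaler(), KNeighborsClassifier(n_neighbors=5)
        ),
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
    }

    # Stratified folds keep the malignant/benign ratio similar in every split.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)

    results = {}
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        results[name] = (scores.mean(), scores.std())
    return results


if __name__ == "__main__":
    for name, (mean, std) in compare_classifiers().items():
        print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Reporting the spread across folds alongside the mean makes it easier to judge whether a difference between two models is larger than the fold-to-fold variation, which is the role cross validation plays in the comparison above.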
Keywords
Breast Cancer, Classification, Cross Validation, Decision Tree, K-Nearest Neighbor, Logistic Regression, Random Forest, Support Vector Machine