
Effective Feature Selection Method for Cervical Cancer Dataset Using Data Mining Classification Analytical Model


Affiliations
1 Department of Computer Science, Nandha Arts and Science College, Erode, Tamilnadu, India
     



Data mining is a set of techniques for deriving hidden patterns from data. Its purpose is to find information that is not directly visible or retrievable by reading the data or running simple queries against it. One key use of data mining is predicting the future from past and present data. Predictions are widely needed to improve future outcomes, and an accurate and timely prediction can help avoid future problems to some extent. In healthcare, critical diseases such as cancers need to be diagnosed before they become life-threatening. This paper explains how data mining techniques can be useful in healthcare, specifically for predicting the possibility of a patient suffering from cervical cancer. The main goal is to design a database that can be used for data mining in the future. The paper implements feature model construction and a comparative analysis for improving the prediction accuracy for cervical cancer patients in four phases. In the first phase, min-max normalization is applied to the original cervical cancer patient datasets collected from the UCI repository. In the second phase, feature selection is used to obtain, from the whole normalized dataset, a subset that comprises only the significant attributes. In the third phase, classification algorithms are applied to the dataset. In the fourth phase, accuracy is calculated using the root mean square value and the root mean error value. KNN and SVM are found to be the better-performing algorithms after feature selection. Finally, the evaluation is done based on accuracy values.
Thus, the outputs of the proposed GA-based feature extraction with classification model implementations indicate that, with the help of feature selection, the KNN and SVM algorithms outperform all other classification algorithms, with an accuracy of 97.60%.


Keywords

Cervical Cancer dataset, Data Mining Algorithm, KNN, SVM
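
The four phases named in the abstract (min-max normalization, feature selection, classification, evaluation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy rows, the column meanings, the fixed `keep` index list standing in for the GA-based selection, the choice of `k=3`, and every function name are assumptions made for illustration only. The rows are normalized together before splitting purely to keep the sketch short, and the "root mean square value" is assumed here to mean the root mean square error between true and predicted labels.

```python
import math
from collections import Counter


def min_max_normalize(rows):
    """Phase 1: rescale every column to [0, 1]; constant columns become 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def select_features(rows, keep):
    """Phase 2 stand-in: keep only the listed column indices.
    The paper uses GA-based selection; a fixed list is assumed here."""
    return [[row[i] for i in keep] for row in rows]


def knn_predict(train_x, train_y, query, k=3):
    """Phase 3: majority vote among the k nearest training rows (Euclidean)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Phase 4: fraction of correctly predicted labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Phase 4: root mean square error between true and predicted labels."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy rows (NOT taken from the UCI dataset):
# columns are [age, number of pregnancies, unrelated noise value].
data = [
    [20, 1, 5], [22, 0, 9], [25, 1, 1],   # training rows, label 0
    [48, 5, 7], [52, 6, 2], [55, 4, 8],   # training rows, label 1
    [23, 1, 3], [50, 5, 6],               # held-out test rows
]
labels = [0, 0, 0, 1, 1, 1, 0, 1]

normalized = min_max_normalize(data)
selected = select_features(normalized, keep=[0, 1])  # drop the noise column

train_x, train_y = selected[:6], labels[:6]
test_x, test_y = selected[6:], labels[6:]

predictions = [knn_predict(train_x, train_y, q, k=3) for q in test_x]
print("accuracy:", accuracy(test_y, predictions))
print("rmse:", rmse(test_y, predictions))
```

On real data the normalization bounds would normally be computed on the training split only and then applied to the test split, so that no test information leaks into training.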

Authors

Dr. D. Rajakumari
Department of Computer Science, Nandha Arts and Science College, Erode, Tamilnadu, India
S. Karthika
Department of Computer Science, Nandha Arts and Science College, Erode, Tamilnadu, India
