Optimizing Classification of High Dimensional Data by Hybrid Approach of Feature Selection with Wrapper Evaluators
High-dimensional data contain a large number of features (predictor attributes) relative to the number of samples. Because many of these features are irrelevant to the class label, a classification algorithm applied directly to such a dataset produces a less accurate model that takes longer to build, test, and apply to unseen data. Feature selection methods retain only those features that are relevant to the class label. During feature selection, candidate sets of features are generated and evaluated for their relevance to the class. Several methods for generating and evaluating features have been proposed in the literature, each with its own characteristics. In this paper, experiments are carried out on three types of cancer gene expression datasets using different feature selection methods. Features are generated by ranker, heuristic, and random search methods and evaluated by information gain, attreval, and wrapper methods. A hybrid approach that combines ranker and subset-based feature generation is also proposed. The results show that the hybrid approach with a wrapper evaluator gives the best classification accuracy.
Keywords
Data Mining, Classification, Feature Selection, Wrapper Evaluators.
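The hybrid idea summarized in the abstract (rank features first, then search subsets of the top-ranked ones with a wrapper evaluator) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the toy discretized data, the `top_k` cutoff, greedy forward selection as the subset search, and leave-one-out accuracy of a 1-nearest-neighbour classifier (Hamming distance) as the wrapper evaluator. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one discrete feature column with respect to the labels."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def loo_accuracy(X, y, features):
    """Wrapper evaluator (assumed): leave-one-out accuracy of 1-NN on `features`."""
    if not features:
        return 0.0
    correct = 0
    for i, row in enumerate(X):
        best_j, best_d = None, None
        for j, other in enumerate(X):
            if j == i:
                continue
            d = sum(row[f] != other[f] for f in features)  # Hamming distance
            if best_d is None or d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def hybrid_select(X, y, top_k):
    """Rank by information gain, then greedy forward search over the top_k features."""
    n_features = len(X[0])
    gains = [information_gain([row[f] for row in X], y) for f in range(n_features)]
    ranked = sorted(range(n_features), key=lambda f: gains[f], reverse=True)[:top_k]

    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for f in ranked:
            if f in selected:
                continue
            score = loo_accuracy(X, y, selected + [f])
            if score > best:  # keep the best single addition this round
                best, choice, improved = score, f, True
        if improved:
            selected.append(choice)
    return selected, best


# Toy data (hypothetical): 6 samples, 4 discretized "gene" features.
# Feature 0 matches the class label; the others are constant or noisy.
X = [
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
]
y = [0, 0, 0, 1, 1, 1]

selected, accuracy = hybrid_select(X, y, top_k=2)
print(selected, accuracy)
```

On this toy data the search keeps only feature 0, since adding a second feature cannot improve on an already perfect leave-one-out score. The ranking step is what keeps the wrapper affordable: the classifier is retrained only on subsets of the `top_k` candidates rather than on subsets of all features, which matters when features far outnumber samples.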