
On Feature Selection Algorithms and Feature Selection Stability Measures: A Comparative Analysis


Affiliations
1 Dept. of Computer Science, Hindustan College of Arts and Science, Chennai - 603 103, India
2 Dept. of Computer Applications, Madurai Kamaraj University, Madurai – 625 021, India
 

Data mining is indispensable for business organizations: it extracts useful information from the large volume of stored data, and that information supports the managerial decisions an organization needs to stay competitive. Owing to steady advances in information and communication technology, the data collected from e-commerce and e-governance are mostly high dimensional. Data mining works better on lower-dimensional datasets than on high-dimensional ones. Feature selection is an important dimensionality reduction technique. The feature subsets it selects in successive runs should be the same or similar even under small perturbations of the dataset; this property is called selection stability. Selection stability has recently become an important topic in the research community, and various measures have been proposed to quantify it. Because the choice of the best approach is highly problem dependent, this paper analyses the selection of a suitable search method and stability measure for feature selection algorithms, as well as the influence of the characteristics of the dataset.

Keywords

Data Mining, Feature Selection, Feature Selection Algorithms, Selection Stability, Stability Measures.
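
The notion of selection stability described in the abstract can be illustrated with a minimal sketch. The abstract does not say which stability measures the paper compares, so the average pairwise Jaccard similarity used here is an assumption, chosen only because it is a simple set-overlap measure. The function names and the feature subsets are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature subsets (1.0 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two empty subsets are treated as identical by convention.
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(subsets):
    """Average pairwise Jaccard similarity over all selected subsets.

    `subsets` holds one feature subset per run of a feature selection
    algorithm, each run made on a slightly perturbed copy of the dataset.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical subsets selected in three runs on perturbed data.
runs = [
    {"f1", "f2", "f3"},
    {"f1", "f2", "f4"},
    {"f1", "f2", "f3"},
]
score = selection_stability(runs)
print(f"stability = {score:.3f}")
```

A score near 1 means the runs selected nearly the same features; a score near 0 means the selected subsets barely overlap.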


Authors

P. Mohana Chelvan
Dept. of Computer Science, Hindustan College of Arts and Science, Chennai - 603 103, India
K. Perumal
Dept. of Computer Applications, Madurai Kamaraj University, Madurai – 625 021, India

