
A Hybrid Approach for Simultaneous Gene Clustering and Gene Selection for Pattern Classification


Affiliations
1 Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan University, Bhubaneswar - 751030, Odisha, India
 


Authors

Pradeep Kumar Mallick
Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan University, Bhubaneswar - 751030, Odisha, India
Debahuti Mishra
Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan University, Bhubaneswar - 751030, Odisha, India
Srikanta Patnaik
Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan University, Bhubaneswar - 751030, Odisha, India
Kailash Shaw
Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan University, Bhubaneswar - 751030, Odisha, India

Abstract


Objectives: This study proposes a hybrid model that performs gene clustering and gene selection simultaneously on gene expression datasets, using hierarchical clustering and rough set theory, for the classification of data patterns. Methods/Analysis: The proposed model works in three phases. In the first phase, initial clusters are formed using hierarchical clustering, and the resulting clusters are further divided into sub-clusters based on the lower and upper approximation properties of rough set theory. In the second phase, the reduct property of rough sets is applied to the clusters obtained from the first phase. In the third phase, gene ranking and cluster ranking are employed to rank the genes within clusters and identify the significant, informative genes. The method aims to find these genes of interest and to maximize the accuracy of the model along with the reduction percentage. The advantage of this approach is analyzed through experiments on two benchmark datasets, Leukemia and Colon Cancer. Finally, the classification performance on the original datasets was recorded using a Support Vector Machine (SVM) classifier and compared with that of a few existing feature/gene selection and clustering techniques. Findings: The experimental results and performance measures demonstrate the efficiency of the proposed hybridized technique over existing feature/gene selection techniques as well as the established traditional k-means clustering technique.
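The lower and upper approximations mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy table, the gene names `g1` and `g2`, the sample labels, and the helper functions `equivalence_classes` and `approximations` are all hypothetical, and expression values are assumed to be already discretized into "high" and "low". The sketch only shows the standard rough set definitions: the lower approximation collects samples that certainly belong to a target class, and the upper approximation collects samples that possibly belong to it.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group sample ids that are indiscernible on the given attributes."""
    classes = defaultdict(set)
    for sample_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(sample_id)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) approximations of `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical toy data: discretized expression levels of two genes.
table = {
    "s1": {"g1": "high", "g2": "low"},
    "s2": {"g1": "high", "g2": "low"},
    "s3": {"g1": "low", "g2": "low"},
    "s4": {"g1": "low", "g2": "high"},
    "s5": {"g1": "high", "g2": "high"},
}
tumor = {"s1", "s3", "s5"}  # hypothetical class labels

lower, upper = approximations(table, ["g1", "g2"], tumor)
print(sorted(lower))          # ['s3', 's5']
print(sorted(upper))          # ['s1', 's2', 's3', 's5']
print(sorted(upper - lower))  # boundary region: ['s1', 's2']
```

Samples `s1` and `s2` have identical values on both genes but different labels, so they fall in the boundary region. How the paper uses these regions to split clusters is not specified in the abstract, so that step is left out here.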

Keywords


Gene Selection, Hierarchical Clustering, Lower Approximation, Reduct, Rough Set Theory, Upper Approximation.



DOI: https://doi.org/10.17485/ijst/2016/v9i21/133975