Parallel Gene Selection Process Using Mapreduce for Microarray Data Classification
Microarray technology is a vital tool for monitoring the expression levels of thousands of genes in a given organism, and it is useful in the classification of cancer. An important issue in classifying cancer microarray data is selecting, with high confidence, the informative genes that contribute to cancer from among the thousands of genes in the data. A dimensionality reduction method should eliminate genes that are irrelevant, redundant, or noisy for classification while retaining all the highly discriminative genes. In this paper, a novel gene selection method based on MapReduce is proposed to improve the running time of the algorithm. The proposed approach analyzes cancer gene expression datasets, extracts the most informative genes, and classifies cancerous samples from normal samples using a Support Vector Machine. The work of the gene selection algorithm is distributed across a set of mappers and reducers, which speeds up the classification process and reduces the memory requirements. A classifier model built with a Support Vector Machine is used to evaluate the performance of the proposed gene selection approach. Simulation results suggest that the proposed approach is useful for clinical diagnosis and drug discovery for cancer, handling large collections of data with good accuracy and fast response times.
Keywords
Microarray, Gene Selection, Classification, Support Vector Machine, MapReduce
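The map and reduce split described in the abstract can be illustrated with a minimal single-machine sketch. It is not the paper's algorithm: the abstract does not name the gene-scoring criterion, so a signal-to-noise score is assumed here, and the function names (`score_gene`, `mapper`, `reducer`, `select_genes`) and the toy expression values are hypothetical. The Support Vector Machine classification step is omitted.

```python
from functools import reduce
from heapq import nlargest
from statistics import mean, stdev


def score_gene(values, labels):
    """Assumed criterion: |difference of class means| / (sum of class std devs)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = stdev(pos) + stdev(neg)
    return abs(mean(pos) - mean(neg)) / spread if spread > 0 else 0.0


def mapper(chunk, labels):
    """Map step: score each gene in one chunk, independently of other chunks."""
    return [(score_gene(values, labels), gene) for gene, values in chunk]


def reducer(left, right, k):
    """Reduce step: merge two partial score lists and keep only the k best."""
    return nlargest(k, left + right)


def select_genes(expression, labels, k, n_chunks=2):
    """Split genes into chunks, map each chunk, then reduce to the top-k genes."""
    items = sorted(expression.items())
    chunks = [items[i::n_chunks] for i in range(n_chunks)]
    # In a real MapReduce job each chunk would go to a separate mapper process.
    partials = [mapper(chunk, labels) for chunk in chunks]
    top = reduce(lambda a, b: reducer(a, b, k), partials, [])
    return [gene for _, gene in top]


# Toy data (hypothetical): 3 cancerous samples (label 1), 3 normal samples (label 0).
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_a": [9.0, 9.5, 10.0, 1.0, 1.5, 2.0],  # strongly differential
    "gene_b": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8],   # nearly flat
    "gene_c": [3.0, 7.0, 5.0, 4.0, 6.0, 5.0],   # noisy, no class difference
    "gene_d": [2.0, 2.5, 3.0, 6.0, 6.5, 7.0],   # differential, lower in cancer
}
selected = select_genes(expression, labels, k=2)
print(selected)
```

Because each mapper sees only its own chunk of genes and each reducer keeps at most `k` candidates, no single worker needs the full expression matrix in memory, which is the kind of memory saving the abstract attributes to the distributed design.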