
A Fuzzy Approach for Clustering Gene Expression Time Series Data


Affiliations
1 Examination Branch, Dibrugarh University, India
2 Centre for Computer Studies, Dibrugarh University, India
 

Identifying groups of genes that show similar expression patterns is crucial in the analysis of gene expression time series data, and choosing a measure of similarity or distance between profiles is an important part of that task. Time series expression experiments are used to study a wide range of biological systems. More than 80% of all time series expression datasets are short (8 time points or fewer), and these datasets present unique challenges. Because the number of genes profiled is large (often tens of thousands) and the number of time points is small, many patterns are expected to arise at random, and most clustering algorithms are unable to distinguish real patterns from random ones. Moreover, the shortness of gene expression time series limits the use of conventional statistical models and techniques for time-series analysis. To address this problem, this paper proposes a fuzzy clustering algorithm based on short time series, which clusters profiles according to the similarity of their relative change in expression level and the corresponding temporal information. One of the major advantages of fuzzy clustering is that genes can belong to more than one group, revealing distinctive features of each gene's function and regulation.

Keywords

Fuzzy Clustering, Short Time Series, Gene Expression.
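
The abstract does not state the distance function or the update rules, so the following is only a minimal sketch under stated assumptions. It assumes that "relative change of expression level and the corresponding temporal information" can be represented by the slope between consecutive time points, and it runs ordinary fuzzy c-means on those slope vectors. The function names `slopes` and `fuzzy_c_means`, the fuzzifier `m = 2`, and the toy profiles are illustrative choices, not taken from the paper.

```python
import numpy as np

def slopes(profiles, times):
    """Slope between consecutive time points for each profile.

    Assumption: relative change plus timing is captured by
    (x[k+1] - x[k]) / (t[k+1] - t[k]); the paper may define it differently.
    """
    profiles = np.asarray(profiles, dtype=float)
    times = np.asarray(times, dtype=float)
    return np.diff(profiles, axis=1) / np.diff(times)

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means with Euclidean distance.

    Returns (memberships, centers); memberships has one row per point
    and one column per cluster, and each row sums to 1.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    u = rng.random((points.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        # Squared distance from every point to every center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return u, centers

# Toy data (hypothetical): four genes sampled at uneven time points.
times = [0, 1, 2, 4]
profiles = [
    [0, 1, 2, 4],  # rising
    [1, 2, 3, 5],  # rising, same shape at a different baseline
    [4, 2, 1, 0],  # falling
    [5, 3, 2, 1],  # falling, same shape at a different baseline
]

memberships, centers = fuzzy_c_means(slopes(profiles, times), n_clusters=2)
print(np.round(memberships, 3))
```

Because the clustering operates on slopes, two profiles with the same shape but different baselines are treated as identical, and dividing by the time differences accounts for uneven sampling. Each row of `memberships` gives a gene's degree of belonging to every cluster, which is how a gene can belong to more than one group.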

Authors

Sadiq Hussain
Examination Branch, Dibrugarh University, India
G. C. Hazarika
Centre for Computer Studies, Dibrugarh University, India
