
Scalable Frequent Pattern Mining of Biological Sequences


Affiliations
1 Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
     



In bioinformatics, frequent pattern mining is the challenging task of extracting interesting or hidden information from DNA or protein sequences. It is well suited to expressing the function and structure shared by two or more protein or DNA sequences. A significant number of algorithms have been proposed for finding frequently occurring ordered arrangements of patterns that represent a similar expression (motif) across a group of genes. Unfortunately, the computation and memory costs of these algorithms become expensive when the database is large. The objective of this paper is to reduce the running time of the mining process using dynamic parallelism. The paper proposes an algorithm, Scalable Frequent Pattern Mining with Constraints, which parallelizes Frequent Pattern Tree growth using OpenMP. Experiments conducted on various real-time biological datasets show that the proposed algorithm outperforms the existing algorithms.

Keywords

Frequent Pattern Tree, Pattern Mining, Scalability, Constraints.
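
To make the notion of a frequent pattern under a minimum-support threshold concrete, the following is a minimal sketch and not the paper's algorithm. It assumes patterns are fixed-length substrings (k-mers) and that support means the number of sequences containing a pattern. It does not build a Frequent Pattern Tree and does not use OpenMP; the thread pool only illustrates how per-sequence work could be divided among workers. The names `kmers_in`, `frequent_kmers`, and `toy_sequences` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def kmers_in(sequence, k):
    """Return the set of distinct length-k substrings of one sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def frequent_kmers(sequences, k, min_support, workers=4):
    """Return {pattern: support} for k-mers found in >= min_support sequences.

    Assumption: support counts sequences, not occurrences, so a pattern
    repeated inside one sequence still contributes 1 for that sequence.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each sequence is scanned independently; this is the divisible part.
        per_sequence = pool.map(lambda s: kmers_in(s, k), sequences)
        support = Counter()
        for patterns in per_sequence:
            support.update(patterns)  # merge step runs in the calling thread
    return {p: n for p, n in support.items() if n >= min_support}


# Toy DNA sequences invented for illustration; they are not from the paper.
toy_sequences = ["ACGTAC", "TACGTT", "GGACGA"]
result = frequent_kmers(toy_sequences, k=3, min_support=3)
print(result)
```

Lowering `min_support` admits more patterns, which is where the computation and memory costs mentioned in the abstract would start to grow on a large database.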

Authors

E. Ramanujam
Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
