Analysis of Biological Sequence by Data Mining

N. Senthil Vel Murugan ¹, V. Vallinayagam ², K. Senthamarai Kannan ³

Affiliations
1 Department of Mathematics, Rohini College of Engineering and Technology, Kanyakumari Dist, 629401, India
2 Department of Mathematics, St. Joseph's College of Engineering, Chennai-600119, India
3 Department of Statistics, Manonmaniam Sundaranar University, Tirunelveli-627012, India

Data mining allows users to discover novel patterns in large volumes of data. Recent studies have examined individual structures, whereas this study focuses on sequential pattern mining, specifically on sequential patterns extracted from gene data. The data were collected from GenBank and consist of DNA sequences from samples affected by liver cancer. The analysis suggests that an increase or decrease in protein and hormone levels contributes to liver cancer. The aim of this paper is to analyze these liver cancer DNA sequence data, to reduce the number of variables using Principal Component Analysis and Singular Value Decomposition, and to identify as quickly as possible which proteins are affected using similarity techniques. The results obtained are reasonable and support the validity of our method.
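To make the pipeline named above concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each DNA sequence is encoded as a vector of overlapping 2-mer counts, that PCA is computed through the SVD of the centered data matrix, and that cosine similarity serves as the similarity measure. The abstract specifies none of these choices. The function names and the toy sequences are hypothetical and are not GenBank records.

```python
from itertools import product

import numpy as np


def kmer_counts(seq, k=2):
    """Count overlapping k-mers over the alphabet ACGT (assumed encoding)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing ambiguous bases such as N
            counts[j] += 1
    return counts


def pca_via_svd(X, n_components=2):
    """Project the rows of X onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each variable (column)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # reduced coordinates
    explained = S**2 / np.sum(S**2)  # fraction of variance per component
    return scores, explained[:n_components]


def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors (assumed measure)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# Made-up toy sequences for illustration only.
sequences = ["ACGTACGTAC", "ACGTACGTTT", "GGGCCCGGGC", "GGCCCGGGCC"]
X = np.array([kmer_counts(s) for s in sequences])  # 4 samples x 16 variables

scores, explained = pca_via_svd(X, n_components=2)  # 16 variables -> 2
sim_close = cosine_similarity(X[0], X[1])  # sequences with similar content
sim_far = cosine_similarity(X[0], X[2])    # sequences with different content
```

In this sketch the 16 count variables are reduced to two component scores per sample, and the similarity step compares sequences in the original count space. Comparing rows of `scores` instead would be an equally plausible reading of the abstract.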