Open Access Open Access  Restricted Access Subscription Access
Open Access Open Access Open Access  Restricted Access Restricted Access Subscription Access

An Archetype Match Algorithm Using Parser Based Search & Production Rules


Affiliations
1 Birla Institute of Technology, Mesra, Ranchi-835215 (JH), India
     

   Subscribe/Renew Journal


Pattern matching Algorithm is presented here in two phases. The first phase is preprocessing in which an n-ary tree like Structure constructed for the given text data and Griebach normal form is created using context free grammar. The second phase called as search phase takes as input an n-ary tree structure and Griebach normal form of given context free grammar constructed in phase 1 and outputs those strings that match both the text data and context free grammar. There are two common ways to describe how a given string can be derived from the start symbol of a given grammar. The simplest way is to list the consecutive strings of symbols, beginning with the start symbol and ending with the string, and the rules that have been applied. In computer science, to say that a context-free grammar is in Greibach normal form (GNF) means that all production rules are of the form:

A->cX
or
S->Z

Where A is a non terminal symbol, c is a terminal symbol, X is a (possibly empty) sequence of non terminal symbols not including the start symbol, S is the start symbol, and Z is the null string. The algorithm proposed has the advantage of multiple string searches at random. It finds wide applications in Bioinformatics where one need to contrast and compare long sequences of human genome sequences for the earlier detection of genetic diseases and drug discovery.

Keywords

Context Free Grammar, Genomics, Griebach Normal Form, N-Ary Tree Structure, Pattern Matching, Preprocessing.
User
Subscription Login to verify subscription
Notifications
Font Size

Abstract Views: 197

PDF Views: 2




  • An Archetype Match Algorithm Using Parser Based Search & Production Rules

Abstract Views: 197  |  PDF Views: 2

Authors

K. K. Senapati
Birla Institute of Technology, Mesra, Ranchi-835215 (JH), India
Rahul Verma
Birla Institute of Technology, Mesra, Ranchi-835215 (JH), India

Abstract


Pattern matching Algorithm is presented here in two phases. The first phase is preprocessing in which an n-ary tree like Structure constructed for the given text data and Griebach normal form is created using context free grammar. The second phase called as search phase takes as input an n-ary tree structure and Griebach normal form of given context free grammar constructed in phase 1 and outputs those strings that match both the text data and context free grammar. There are two common ways to describe how a given string can be derived from the start symbol of a given grammar. The simplest way is to list the consecutive strings of symbols, beginning with the start symbol and ending with the string, and the rules that have been applied. In computer science, to say that a context-free grammar is in Greibach normal form (GNF) means that all production rules are of the form:

A->cX
or
S->Z

Where A is a non terminal symbol, c is a terminal symbol, X is a (possibly empty) sequence of non terminal symbols not including the start symbol, S is the start symbol, and Z is the null string. The algorithm proposed has the advantage of multiple string searches at random. It finds wide applications in Bioinformatics where one need to contrast and compare long sequences of human genome sequences for the earlier detection of genetic diseases and drug discovery.

Keywords


Context Free Grammar, Genomics, Griebach Normal Form, N-Ary Tree Structure, Pattern Matching, Preprocessing.