![Open Access](https://i-scholar.in/lib/pkp/templates/images/icons/fulltextgreen.png)
![Restricted Access](https://i-scholar.in/lib/pkp/templates/images/icons/fulltextred.png)
![Open Access](https://i-scholar.in/lib/pkp/templates/images/icons/fulltextgreen.png)
![Open Access](https://i-scholar.in/lib/pkp/templates/images/icons/fulltext_open_medium.gif)
![Restricted Access](https://i-scholar.in/lib/pkp/templates/images/icons/fulltextred.png)
![Restricted Access](https://i-scholar.in/lib/pkp/templates/images/icons/fulltext_restricted_medium.gif)
An Archetype Match Algorithm Using Parser Based Search & Production Rules
Subscribe/Renew Journal
Pattern matching Algorithm is presented here in two phases. The first phase is preprocessing in which an n-ary tree like Structure constructed for the given text data and Griebach normal form is created using context free grammar. The second phase called as search phase takes as input an n-ary tree structure and Griebach normal form of given context free grammar constructed in phase 1 and outputs those strings that match both the text data and context free grammar. There are two common ways to describe how a given string can be derived from the start symbol of a given grammar. The simplest way is to list the consecutive strings of symbols, beginning with the start symbol and ending with the string, and the rules that have been applied. In computer science, to say that a context-free grammar is in Greibach normal form (GNF) means that all production rules are of the form:
A->cX
or
S->Z
Where A is a non terminal symbol, c is a terminal symbol, X is a (possibly empty) sequence of non terminal symbols not including the start symbol, S is the start symbol, and Z is the null string. The algorithm proposed has the advantage of multiple string searches at random. It finds wide applications in Bioinformatics where one need to contrast and compare long sequences of human genome sequences for the earlier detection of genetic diseases and drug discovery.
A->cX
or
S->Z
Where A is a non terminal symbol, c is a terminal symbol, X is a (possibly empty) sequence of non terminal symbols not including the start symbol, S is the start symbol, and Z is the null string. The algorithm proposed has the advantage of multiple string searches at random. It finds wide applications in Bioinformatics where one need to contrast and compare long sequences of human genome sequences for the earlier detection of genetic diseases and drug discovery.
Keywords
Context Free Grammar, Genomics, Griebach Normal Form, N-Ary Tree Structure, Pattern Matching, Preprocessing.
User
Subscription
Login to verify subscription
Font Size
Information
![](https://i-scholar.in/public/site/images/abstractview.png)
Abstract Views: 287
![](https://i-scholar.in/public/site/images/pdfview.png)
PDF Views: 2