A Probabilistic Smoothing Approach for Language Models Applied to Protein Sequence Data
Modern techniques for statistical language modeling are widely applied across domains such as speech recognition, machine translation, and information retrieval. The basic idea behind a language model is probabilistic: it estimates a probability distribution over strings, typically sentences. One of the core problems a language model must address is smoothing, whose primary goal is to improve model accuracy by adjusting the maximum likelihood estimates of probabilities. To meet this challenge, this paper applies a well-known smoothing technique, Good-Turing, to the bioinformatics task of modeling protein sequences. The computational procedure is implemented as an R program that estimates bigram and trigram probabilities of language models over the protein sequence. Experimental results show a good fit of exponential and linear smoothing curves over the bigram and trigram sequences, respectively, with very high model accuracy.
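The abstract itself contains no code, so the following is a minimal sketch, in R (the paper's stated implementation language), of the Good-Turing count adjustment it describes: the adjusted count is c* = (c + 1) · N(c+1) / N(c), where N(c) is the number of distinct n-grams observed exactly c times. The toy amino-acid string and all variable names are illustrative assumptions, not the authors' program.

```r
# Illustrative sketch of Good-Turing smoothing for protein bigrams.
# The sequence below is a hypothetical amino-acid string, not data from the paper.
prot_seq <- "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

# Tokenize into single amino acids and form overlapping bigrams.
tokens  <- strsplit(prot_seq, "")[[1]]
bigrams <- paste0(head(tokens, -1), tail(tokens, -1))

# c_table: count c of each observed bigram.
# N_c: frequency of frequencies, i.e. how many bigrams occur exactly c times.
c_table <- table(bigrams)
N_c     <- table(as.integer(c_table))

# Good-Turing adjusted count: c* = (c + 1) * N_{c+1} / N_c.
gt_count <- function(c) {
  Nc  <- N_c[as.character(c)]
  Nc1 <- N_c[as.character(c + 1)]
  if (is.na(Nc) || is.na(Nc1)) return(NA)  # N_{c+1} = 0: a fitted curve is needed
  (c + 1) * as.numeric(Nc1) / as.numeric(Nc)
}

# Total probability mass reserved for unseen bigrams: N_1 / N.
N        <- length(bigrams)
p_unseen <- as.numeric(N_c["1"]) / N

# Good-Turing probability of an observed bigram, e.g. "MK": c* / N.
p_MK <- gt_count(as.integer(c_table["MK"])) / N
```

Note that raw N(c) values are sparse (N(c+1) is often zero), which is why in practice a smoothing curve is fitted to the frequency-of-frequencies before applying the formula; the paper reports that an exponential curve fits the bigram counts and a linear curve fits the trigram counts. Trigram estimation follows the same procedure with three-letter tokens.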
Keywords
Bigram Model, Language Model, Smoothing N-Gram Model, Trigram Model.