TNT Tagger for Malayalam with Fuzzy Rule Based Learning

Alen Jacob¹, Amal Babu¹, R. R. Rajeev², P. C. Reghu Raj³

Affiliations
1 Computational Linguistics, Government Engineering College, Sreekrishnapuram, Palakkad, Kerala, India
2 VRCLC, IIITM-K, Thiruvananthapuram, Kerala, India
3 Dept. of Computer Science and Engineering, Government Engineering College, Sreekrishnapuram, Palakkad, Kerala, India

Abstract

TnT is an efficient statistical part-of-speech (POS) tagger based on a Hidden Markov Model. It performs well on sequences of known words, but its performance degrades as the number of unknown words increases. In this paper, we propose a method that uses fuzzy rules to overcome this degradation. The fuzzy rule based model is designed to give TnT sufficient information about the tags of unknown words without degrading the performance of TnT. When it processes an unknown word from the input, the standard TnT tagger relies on the tag probability distribution of words in the training corpus that share the same suffix. In Indian languages such as Malayalam, the POS tag of an unknown word does not depend on the suffix alone. Because these languages are highly inflectional and have free word order, the dependency is more complex than the one captured by suffix tag distribution probabilities. When TnT with fuzzy rule based learning encounters an unknown word, it generates a set of possible tags for that word based on the fuzzy rules the word matches. If the word matches no fuzzy rule, the model falls back on the probability distribution of the suffix. Because of this fallback, the approach is intended to guarantee that the performance of TnT can only improve relative to its normal performance.
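The unknown-word handling described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the fuzzy rules are represented, so everything below is an assumption made for illustration: each rule is modelled as a tag paired with a membership function returning a degree in [0, 1], and the names `candidate_tags`, `rules`, and `suffix_dist`, the tag labels, and the transliterated endings are all hypothetical rather than taken from the paper.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: a fuzzy rule is a (tag, membership function) pair, where the
# function returns a degree in [0, 1] for a given word. The paper's actual
# rule format is not given in the abstract.
FuzzyRule = Tuple[str, Callable[[str], float]]


def candidate_tags(
    word: str,
    rules: List[FuzzyRule],
    suffix_dist: Dict[str, Dict[str, float]],
    max_suffix_len: int = 4,
) -> Dict[str, float]:
    """Return a tag -> score mapping for an unknown word."""
    scores: Dict[str, float] = {}
    for tag, membership in rules:
        degree = membership(word)
        if degree > 0.0:
            # Combine several rules for the same tag with max (fuzzy OR).
            scores[tag] = max(scores.get(tag, 0.0), degree)
    if scores:
        return scores
    # No rule matched: fall back to the tag distribution of the longest
    # suffix found in the (hypothetical) training statistics.
    for n in range(min(max_suffix_len, len(word)), 0, -1):
        dist = suffix_dist.get(word[-n:])
        if dist:
            return dict(dist)
    return {}


# Toy data, purely illustrative and not from the paper.
rules: List[FuzzyRule] = [
    ("NOUN", lambda w: 0.8 if w.endswith("kal") else 0.0),
    ("VERB", lambda w: 0.6 if w.endswith("unnu") else 0.0),
]
suffix_dist = {"am": {"NOUN": 0.7, "ADJ": 0.3}}

print(candidate_tags("kuttikal", rules, suffix_dist))  # rule-based candidates
print(candidate_tags("maram", rules, suffix_dist))     # suffix fallback
```

In this sketch the suffix statistics are consulted only when no rule fires, which mirrors the fallback order stated in the abstract; how the resulting candidates are weighted inside the HMM decoding step is left open here.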

Research Cell: An International Journal of Engineering Sciences
