

Part-of-speech (POS) tagging is the process of labeling each word in a sentence with a tag that describes the word's usage in that sentence. These tags usually indicate a syntactic category such as noun or verb, and sometimes carry additional information such as case markers (number, gender, etc.) and tense markers. Many current language processing systems use a part-of-speech tagger as a pre-processing step.

Two main approaches are followed in part-of-speech tagging: the rule-based approach and the stochastic approach. The rule-based approach uses predefined handwritten rules. It is the oldest approach and uses a lexicon or dictionary for reference. The stochastic approach uses probabilistic and statistical information to assign tags to words. It relies on a large corpus, so its time and space complexity are high, whereas the rule-based approach has lower complexity in both time and space. The stochastic approach is the most widely used today because of its accuracy.
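The contrast between the two approaches can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy English lexicon, the suffix rule, the tag names, and the function names `rule_based_tag`, `train_unigram`, and `stochastic_tag` do not come from the paper. The stochastic tagger here is the simplest possible one (most frequent tag per word); practical stochastic taggers use richer models.

```python
from collections import Counter, defaultdict

# Hypothetical hand-written lexicon used by the rule-based tagger.
LEXICON = {"the": "DET", "dog": "NOUN", "barks": "VERB"}


def rule_based_tag(words):
    """Tag words using a lexicon lookup plus one hand-written suffix rule."""
    tags = []
    for word in words:
        lower = word.lower()
        if lower in LEXICON:
            tags.append(LEXICON[lower])
        elif lower.endswith("ly"):
            tags.append("ADV")   # illustrative rule: "-ly" words are adverbs
        else:
            tags.append("NOUN")  # fallback for unknown words
    return tags


def train_unigram(tagged_sentences):
    """Learn the most frequent tag of each word from a tagged corpus."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def stochastic_tag(words, model, default="NOUN"):
    """Assign each word the tag it most often carried in the training corpus."""
    return [model.get(word.lower(), default) for word in words]
```

The rule-based tagger needs only the lexicon in memory, while the stochastic tagger must first pass over a tagged corpus and store the resulting counts, which is where the additional time and space cost arises.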

Malayalam belongs to the Dravidian family of languages and is inflectional, with suffixes attached to root word forms. The algorithms currently in use are efficient machine learning algorithms, but they were not built for Malayalam, which affects the accuracy of Malayalam POS tagging.

The proposed approach uses dictionary entries along with adjacent tag information. The algorithm uses multithreading. Tagging is done using the probability of occurrence of the sentence structure together with the dictionary entry.
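The abstract does not give the details of the algorithm, so the following is only a rough sketch of how dictionary entries, adjacent tag information, and multithreading might fit together. The dictionary contents, the transition probabilities, the tag set, and the names `tag_sentence` and `tag_corpus` are all hypothetical, and toy English words stand in for Malayalam data. The sketch picks tags greedily from left to right, which is a simplification of scoring the whole sentence structure.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical dictionary: each word maps to its candidate tags.
DICTIONARY = {
    "the": ["DET"],
    "we": ["PRON"],
    "can": ["NOUN", "VERB", "AUX"],
    "fish": ["NOUN", "VERB"],
}

# Hypothetical probabilities P(tag | previous tag); "<s>" marks sentence start.
TRANSITIONS = {
    ("<s>", "DET"): 0.5, ("<s>", "PRON"): 0.4,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.05,
    ("PRON", "AUX"): 0.5, ("PRON", "VERB"): 0.4, ("PRON", "NOUN"): 0.05,
    ("AUX", "VERB"): 0.8, ("AUX", "NOUN"): 0.1,
}


def tag_sentence(words, default="NOUN"):
    """Choose, for each word, the dictionary tag most likely after the previous tag."""
    previous = "<s>"
    tags = []
    for word in words:
        candidates = DICTIONARY.get(word.lower(), [default])
        best = max(candidates, key=lambda t: TRANSITIONS.get((previous, t), 0.0))
        tags.append(best)
        previous = best
    return tags


def tag_corpus(sentences, workers=4):
    """Tag sentences concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(tag_sentence, sentences))
```

For example, `tag_corpus([["we", "can", "fish"], ["the", "can"]])` tags "can" as an auxiliary after a pronoun and as a noun after a determiner, showing how the adjacent tag resolves an ambiguous dictionary entry. Sentences are independent of each other, which makes them a natural unit of work for separate threads.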


Keywords

NLP, POS Tagger, Rule Based Approach, Stochastic Approach, Multithreading, Dictionary Entry, Malayalam.