The main goal of this work is to implement a new part-of-speech tagging tool for Amazigh, based on Markov models and decision trees.
After studying the main approaches to part-of-speech tagging and the problems they raise, we implemented a tagging system based on TreeTagger, a generic stochastic tagging tool that is very popular for its efficiency. We gathered a working corpus large enough to ensure general linguistic coverage. This corpus was used to run the tokenization process and to train TreeTagger. We then performed a straightforward evaluation of the output on a small test corpus. Although limited in scope, this evaluation showed encouraging results.
Keywords
Amazigh, SVM, CRF, HMM, Machine Learning, POS Tagging.