Amazigh Part-of-Speech Tagging Using Markov Models and Decision Trees

Samir Amri ¹, Lahbib Zenkouar ¹, Mohamed Outahajala ²

Affiliations
1 EMI Engineering School, Mohammed V University in Rabat, Morocco
2 Royal Institute of Amazigh Culture (IRCAM), Rabat, Morocco

Abstract

The main goal of this work is to implement a new tool for part-of-speech tagging of Amazigh, using Markov models and decision trees.

After studying the main approaches to part-of-speech tagging and the problems they raise, we implemented a tagging system based on TreeTagger, a generic stochastic tagger widely used for its efficiency. We gathered a working corpus large enough to ensure general linguistic coverage, and used it both to run the tokenization process and to train TreeTagger. We then performed a straightforward evaluation of the tagger's output on a small test corpus. Although limited in scope, this evaluation showed encouraging results.
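
As an illustration of the evaluation step, the following minimal sketch computes token-level accuracy by comparing a tagger's output with a hand-annotated reference. It is not the evaluation script used in this work. The function name `tagging_accuracy`, the placeholder tokens (`w1` to `w4`) and the tag labels (`N`, `V`, `PREP`) are all assumptions made for the example and are not taken from the Amazigh corpus or tagset.

```python
from collections import Counter


def tagging_accuracy(gold, predicted):
    """Return (accuracy, errors) for two aligned lists of (token, tag) pairs.

    `errors` counts each (gold_tag, predicted_tag) confusion.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty corpus")

    correct = 0
    errors = Counter()
    for (gold_token, gold_tag), (pred_token, pred_tag) in zip(gold, predicted):
        # Both files must come from the same tokenization of the test corpus.
        if gold_token != pred_token:
            raise ValueError(f"token mismatch: {gold_token!r} vs {pred_token!r}")
        if gold_tag == pred_tag:
            correct += 1
        else:
            errors[(gold_tag, pred_tag)] += 1
    return correct / len(gold), errors


# Hypothetical toy data: placeholder tokens and tags, not real corpus entries.
gold = [("w1", "N"), ("w2", "V"), ("w3", "PREP"), ("w4", "N")]
predicted = [("w1", "N"), ("w2", "N"), ("w3", "PREP"), ("w4", "N")]

accuracy, errors = tagging_accuracy(gold, predicted)
print(f"accuracy = {accuracy:.2f}")
print(errors.most_common())
```

Counting confusions per tag pair, rather than reporting accuracy alone, makes it easier to see which categories a small test corpus covers poorly.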