
Amazigh Part-of-Speech Tagging Using Markov Models and Decision Trees


Affiliations
1 EMI Engineering School, Mohammed V University in Rabat, Morocco
2 Royal Institute of Amazigh Culture (IRCAM), Rabat, Morocco
 

The main goal of this work is to implement a new part-of-speech tagging tool for Amazigh, based on Markov models and decision trees.

After studying the different approaches to part-of-speech tagging and the problems they raise, we implemented a tagging system based on TreeTagger, a generic stochastic tagging tool that is widely used for its efficiency. We gathered a working corpus intended to be large enough to ensure general linguistic coverage. This corpus was used to run the tokenization process and to train TreeTagger. We then carried out a straightforward evaluation of the tagger's output on a small test corpus. Although limited in scope, this evaluation showed encouraging results.
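
To make the evaluation step concrete, the following is a minimal sketch of a token-level accuracy check of the kind such an output evaluation might use. It is not the authors' evaluation script: the function name `tagging_accuracy`, the placeholder tokens, and the tag labels `N`, `V`, and `PREP` are all assumptions introduced for illustration, and the abstract does not state which metric or tagset was used.

```python
from typing import List, Tuple

# A tagged token is a (word form, POS tag) pair.
TaggedToken = Tuple[str, str]


def tagging_accuracy(gold: List[TaggedToken],
                     predicted: List[TaggedToken]) -> float:
    """Return the fraction of tokens whose predicted tag matches the gold tag.

    Hypothetical helper: assumes both sequences come from the same
    tokenization, so they must align token by token.
    """
    if not gold:
        raise ValueError("gold sequence is empty")
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted sequences differ in length")

    correct = 0
    for (gold_token, gold_tag), (pred_token, pred_tag) in zip(gold, predicted):
        if gold_token != pred_token:
            raise ValueError(
                f"token mismatch: {gold_token!r} vs {pred_token!r}")
        if gold_tag == pred_tag:
            correct += 1
    return correct / len(gold)


# Placeholder data, not drawn from the paper's corpus or tagset.
gold_tags = [("w1", "N"), ("w2", "V"), ("w3", "PREP"), ("w4", "N")]
predicted_tags = [("w1", "N"), ("w2", "V"), ("w3", "N"), ("w4", "N")]

accuracy = tagging_accuracy(gold_tags, predicted_tags)
print(f"accuracy = {accuracy:.2f}")
```

Checking that the tokens align before comparing tags guards against a common pitfall: if the tagger's tokenization differs from the reference, a plain pairwise comparison would silently report a misleading score.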


Keywords

Amazigh, SVM, CRF, HMM, Machine Learning, POS Tagging.

Authors

Samir Amri
EMI Engineering School, Mohammed V University in Rabat, Morocco
Lahbib Zenkouar
EMI Engineering School, Mohammed V University in Rabat, Morocco
Mohamed Outahajala
Royal Institute of Amazigh Culture (IRCAM), Rabat, Morocco
