Baoulé Related Parallel Corpora for Machine Translation Tasks: mtBCI-1.0


Affiliations
1 Faculty of Computer Science via distance learning, Bircham International University, Madrid, Spain

According to Ethnologue, there are 7,164 known living languages in the world, and not all of them have data available online to support Artificial Intelligence (AI) tasks such as Machine Translation (MT). Consequently, most of these languages still require thorough data engineering. In particular, the Baoulé language (ISO 639-3 code: bci) is not yet supported on popular free translation platforms such as Microsoft Translator, nor is it among the official Wikipedia language editions. In this paper, we propose "Baoulé Related Parallel Corpora for Machine Translation Tasks: mtBCI-1.0" to make parallel Baoulé-related datasets available to the scientific community for AI tasks involving Machine Translation. After a brief presentation of the Baoulé language, we focus on the data engineering process itself and then provide a baseline showing that the collected data is of scientific interest.

Keywords

Artificial Intelligence, Machine Learning, Machine Translation, Data Engineering, Dataset, Parallel Corpora, Baoulé language (bci)
Authors

Kouassi Konan Jean-Claude
Faculty of Computer Science via distance learning, Bircham International University, Madrid, Spain
