
Arabic Text Categorization Algorithm Using Vector Evaluation Method


Affiliations
1 Computer Information Systems Department, Isra University, Amman, Jordan
2 Computer Science Department, University of Jordan, Amman, Jordan
3 Software Engineering Department, Al-Ahliyya Amman University, Amman, Jordan
4 Business Information Systems Department, Isra University, Amman, Jordan
 

Text categorization is the process of grouping documents into categories based on their contents. This process makes information retrieval easier, and it has become more important because of the huge amount of textual information available online. The main problem in text categorization is how to improve classification accuracy. Although Arabic text categorization is a new and promising field, little research has been done in it. This paper proposes a new method for Arabic text categorization using vector evaluation. The proposed method uses a corpus of categorized Arabic documents: the weights of the words in the tested document are calculated to determine the document's keywords, which are then compared with the keywords of the corpus categories to determine the best category for the tested document.
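
The keyword-matching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting formula or the comparison rule, so relative term frequency is assumed here as the word weight and the keyword overlap count as the category score. The function names `top_keywords` and `best_category`, the parameter `k`, and the sample category keywords are all hypothetical.

```python
from collections import Counter


def top_keywords(tokens, k=3):
    """Return the k highest-weighted tokens as a set.

    Assumption: the weight of a word is its relative term frequency.
    The paper's actual weighting scheme is not stated in the abstract.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    weights = {word: n / total for word, n in counts.items()}
    # Highest weight first; ties are broken alphabetically so the result is deterministic.
    ranked = sorted(weights, key=lambda w: (-weights[w], w))
    return set(ranked[:k])


def best_category(doc_tokens, category_keywords, k=3):
    """Pick the category whose keyword set shares the most keywords with the document.

    Assumption: the score is a plain overlap count, a stand-in for the
    paper's vector evaluation step.
    """
    doc_keywords = top_keywords(doc_tokens, k)
    scores = {
        category: len(doc_keywords & keywords)
        for category, keywords in category_keywords.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical category keywords, as if extracted from a categorized corpus.
category_keywords = {
    "sports": {"فريق", "كرة", "مباراة"},
    "economy": {"سوق", "بنك", "أسعار"},
}

# Tokens of a tested document (already tokenized; no stemming or stop-word removal).
doc_tokens = ["فريق", "فريق", "كرة", "كرة", "مباراة", "مباراة", "سوق"]

category, scores = best_category(doc_tokens, category_keywords)
print(category, scores)
```

A real system would also need Arabic-specific preprocessing (normalization, stop-word removal, stemming) before weighting, which this sketch omits.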

Keywords

Text Categorization, Arabic Text Classification, Information Retrieval, Data Mining, Machine Learning.


Authors

Ashraf Odeh
Computer Information Systems Department, Isra University, Amman, Jordan
Aymen Abu-Errub
Computer Science Department, University of Jordan, Amman, Jordan
Qusai Shambour
Software Engineering Department, Al-Ahliyya Amman University, Amman, Jordan
Nidal Turab
Business Information Systems Department, Isra University, Amman, Jordan
