Comparison of Performance in Text Mining using Categorization of Unstructured Data

Lee Junyeon; Shin Seungsoo; Kim Jungju

doi:10.17485/ijst/2016/v9i24/134566

Lee Junyeon ¹, Shin Seungsoo ², Kim Jungju ³

Affiliations
1 Department of Media Engineering, Tongmyung University, Korea, Republic of
2 Department of Information Security, Tongmyung University, Korea, Republic of
3 Choonhae College of Health, Korea, Republic of

Abstract

Background/Objectives: Text mining helps users find information in very large document collections. It has been actively developed and applied in various fields, mainly for English-language documents, but research on Korean text mining has been relatively limited. As the volume of big data containing Korean text grows, the importance of Korean text mining has emerged, and so has the need for intensive study and application of big data. Methods/Statistical Analysis: In this study, we compared the classification performance of Bayesian methods, k-NN, decision trees, SVM, and a neural network in classifying unstructured newspaper articles into given categories. Findings: In the experiments, the SVM model achieved a higher F-measure than the other models and showed stable classification results and recall. It also showed a high F-measure when classifying into a more granular list of categories. Application/Improvements: Although k-NN and decision trees performed slightly worse than SVM, they turned out to be appropriate models for classification problems because they are easy to interpret and have short training times.
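The comparison above relies on the F-measure, which combines precision and recall. As a minimal sketch of how such a score can be computed per category and averaged, the following uses only the Python standard library. The function names `per_class_prf` and `macro_f1`, the category labels, and the toy predictions are all hypothetical and invented for illustration; they are not the study's data, and the abstract does not state which averaging scheme the authors used (macro averaging is assumed here).

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this category
        else:
            fp[pred] += 1       # predicted category received a wrong article
            fn[truth] += 1      # true category missed one of its articles

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        # F-measure (F1): harmonic mean of precision and recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-category F1 scores (assumed averaging)."""
    scores = per_class_prf(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy labels, invented for illustration only (not from the study).
y_true = ["politics", "politics", "sports", "sports", "economy", "economy"]
y_pred = ["politics", "sports", "sports", "sports", "economy", "politics"]

for label, (p, r, f) in per_class_prf(y_true, y_pred).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Running the same scoring function over the predictions of each classifier (for example SVM, k-NN, and a decision tree) on a shared test set would give directly comparable F-measure values.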

Keywords

Categorization, Decision Tree, k-NN, Naive Bayes, Text Mining.

Indian Journal of Science and Technology