
Improving Question Classification by Feature Extraction and Selection


Affiliations
1 VNU University of Engineering and Technology, Ha Noi City, Vietnam
2 Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
 

Question classification is the task of predicting the entity type of the answer to a given natural language question. It plays an important role in finding or constructing accurate answers and therefore helps improve the quality of automated question answering systems. Previous studies automatically extracted various lexical, syntactic, and semantic features from a question to support classification. However, combining all of those features does not always give the best results for every question type. Unlike previous studies, this paper focuses on how to extract and select effective features adapted to each question type. First, we propose using a feature selection algorithm to determine the appropriate features for each question type. Second, we design a new type of feature based on question patterns. We tested the proposed approach on the TREC benchmark dataset, using Support Vector Machines (SVM) as the classification algorithm. The experiments yield accuracies of 95.2% and 91.6% on the coarse-grained and fine-grained datasets, respectively, which are considerably higher than those reported in previous studies.

Keywords

Feature Extraction, Feature Selection, Question Answering Systems, Question Classification, Question Patterns.
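
The two ideas in the abstract, pattern-based features and per-type feature selection, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not list the question patterns or name the selection algorithm, so the regular expressions in `PATTERNS`, the chi-square ranking, the `top_k` cutoff, and the toy questions are hypothetical stand-ins and not the authors' method. The selected features would then be passed to a classifier such as the SVM mentioned in the abstract, which is omitted here.

```python
import re

# Hypothetical question patterns (not taken from the paper): each regex
# turns one surface form of a question into a single binary feature.
PATTERNS = {
    "PAT_how_many": re.compile(r"^how (many|much)\b"),
    "PAT_who": re.compile(r"^who\b"),
    "PAT_where": re.compile(r"^where\b"),
    "PAT_what_is": re.compile(r"^what (is|are|was|were)\b"),
}


def extract_features(question):
    """Return the set of unigram (W_*) and pattern (PAT_*) features."""
    text = question.lower().strip()
    features = {"W_" + tok for tok in re.findall(r"[a-z0-9]+", text)}
    features.update(name for name, rx in PATTERNS.items() if rx.search(text))
    return features


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic of a 2x2 feature/class contingency table.

    n11: feature present, in class     n10: feature present, not in class
    n01: feature absent, in class      n00: feature absent, not in class
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return 0.0 if denom == 0 else n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features_per_class(questions, labels, top_k):
    """Rank features separately for each class and keep the top_k of each.

    Chi-square is an assumed criterion; ties are broken alphabetically.
    """
    feature_sets = [extract_features(q) for q in questions]
    vocab = set().union(*feature_sets)
    selected = {}
    for cls in set(labels):
        scores = {}
        for feat in vocab:
            n11 = n10 = n01 = n00 = 0
            for feats, label in zip(feature_sets, labels):
                present, in_class = feat in feats, label == cls
                if present and in_class:
                    n11 += 1
                elif present:
                    n10 += 1
                elif in_class:
                    n01 += 1
                else:
                    n00 += 1
            scores[feat] = chi_square(n11, n10, n01, n00)
        ranked = sorted(scores, key=lambda f: (-scores[f], f))
        selected[cls] = ranked[:top_k]
    return selected


# Toy data in the spirit of the TREC coarse classes (invented examples).
QUESTIONS = [
    "Who wrote Hamlet ?",
    "Who discovered penicillin ?",
    "Where is the Eiffel Tower ?",
    "Where was Mozart born ?",
    "How many moons does Mars have ?",
    "How much does a liter of water weigh ?",
]
LABELS = ["HUM", "HUM", "LOC", "LOC", "NUM", "NUM"]

for cls, feats in sorted(select_features_per_class(QUESTIONS, LABELS, 2).items()):
    print(cls, feats)
```

Ranking features once per class, rather than once for the whole dataset, is what lets each question type keep its own feature subset. Note that plain chi-square also rewards features that are strongly associated with the absence of a class, so a fuller version might keep only positively associated features.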

Authors

Nguyen Van-Tu
VNU University of Engineering and Technology, Ha Noi City, Vietnam
Le Anh-Cuong
Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam




DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i17%2F132889