The PDF file you selected should load here if your Web browser has a PDF reader plug-in installed (for example, a recent version of Adobe Acrobat Reader).

If you would like more information about how to print, save, and work with PDFs, Highwire Press provides a helpful Frequently Asked Questions about PDFs.

Alternatively, you can download the PDF file directly to your computer, from where it can be opened using a PDF reader. To download the PDF, click the Download link above.

Fullscreen Fullscreen Off


Background/Objective:A web forum is a problem-solving online community.Web forum research activitieshave been focused on answer mining with the assumption that the starting post is a question post. This paper proposes methods for mining standard web forum questions. Methods/Statistical Analysis:Popular methods for web forum question post detection are question mark, question words, higher n-grams and sequential pattern mining. These methods have problem of low detection rate and implementation complexity. Implemented in this paper is hybridization of simple bag-of-words model with web forum metadata, simple rule of question mark and question words. Dimensional reduction was performed using chi-square and wrapper techniques. Findings:The quality of web forum question posts varies from excellent to mediocre or even spam. Detecting good question posts is non-trivial. It requires utilization of salient features. Combination of simple rule of question mark and question words with forum metadata performed better than each of the two.Integration of bag-of-words model with simple rule of question marks, question words and forum metadata enhances question post detection. Dimensionality reduction using chi-square were found to perform better than other popular filters like info gain, gain ratio and symmetric uncertain. Applications/Improvements: Three publicly available datasets of varying technical degrees were used for the experiments.The experimental results revealed that an enhanced bag-of-words model can perform better than complex techniques that implement higher N-gram with part-of-speech tagging.

Keywords

Bag-of-words, Forum Metadata, Web Forum,Question Detection, Dimensionality Reduction, Web Forum Question
User