

Because data is abundant and growing exponentially, manual summarization is time-consuming, prone to bias, and dependent on professional linguistic experts. Automatic text summarization addresses these issues by generating succinct summaries. Three approaches were applied to generate concise summaries for the benchmark British Broadcasting Corporation (BBC) news article summarization dataset: a statistical approach such as Term Frequency-Inverse Document Frequency (TF-IDF), a topic modeling approach such as Latent Semantic Analysis (LSA), and a graph-based approach such as TextRank. The paper explores domain-specific implementations of each approach in the five domains of the dataset, as well as domain-agnostic prospects, and draws various insights from them. The generated summaries were evaluated using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) framework, with precision, recall, and F-measure as metrics. The approaches not only achieved commendable ROUGE scores but also outperformed previous work on the dataset.
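As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch of ROUGE-1 (unigram overlap) precision, recall, and F-measure. It is not the paper's evaluation code: the function name `rouge_1`, the lowercase whitespace tokenization, and the example sentences are assumptions made for illustration only, and standard ROUGE toolkits may additionally apply stemming or stopword handling.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap ROUGE-1 scores for one candidate/reference pair.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption and not the paper's preprocessing.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Clipped overlap: a unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Illustrative sentences (not taken from the BBC dataset).
scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

Precision measures how much of the generated summary appears in the reference, recall measures how much of the reference is covered by the generated summary, and the F-measure is their harmonic mean.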

Keywords

LSA, NLP, ROUGE, TextRank, TF-IDF