
A Survey on Title and Keyword based Extraction of News Contents using Machine Learning


Affiliations
1 Department of Industrial and Systems Engineering, Virginia Polytechnic Institute and State University, Virgin Islands, U.S.
     



Newspapers are a valuable source of information. Many models for content extraction have been proposed recently. These models are highly scalable and fast, but most struggle to extract content accurately and completely and are prone to noise. Over the past decade, most major newspapers and magazines have built websites that publish news and other material, and online-only newspapers have also appeared. The quality and quantity of content on these websites have improved greatly, making them valuable information resources. In this article, we survey data mining methods used to process extracted content and summarize their results. The survey examines popular and effective machine learning techniques along with their advantages and disadvantages.


Keywords

Web News, Data Mining, Information Extraction, Title-Based Extraction, Machine Learning.

Authors

William Cook
Department of Industrial and Systems Engineering, Virginia Polytechnic Institute and State University, Virgin Islands, U.S.
J Waycott
Department of Industrial and Systems Engineering, Virginia Polytechnic Institute and State University, Virgin Islands, U.S.
