A Survey on Data Extraction from Web Pages
The Internet provides a huge amount of information, and the amount of information on the Web is growing at an astonishing rate. The Web can be considered the largest knowledge base, yet extracting data from Web pages is very difficult, mainly because pages are structurally complex and follow no uniform layout. Because Web information sources lack a uniform structure, access to this huge collection of information has been limited to browsing and searching. Many applications, however, need the data itself to be extracted from Web pages, and isolating only the relevant data is a tedious task. Robust, flexible extraction methods that transform Web pages into program-friendly structures, such as a relational database, have therefore become a necessity. Although many approaches to data extraction from Web pages have been developed, there has been limited effort to compare such tools. This survey describes some of the techniques for Web data extraction.
Keywords
Semi-Structured Data, Data Extraction, Web Database, Web Mining, Wrapper Generation.