Clustering of Web Page for Different Domains Using Data Extraction and Self Organizing Map

Chhaya Varade ¹, Bhupesh Gour ², Asif Ullah Khan ³, Shailendra Jain ³

Affiliations
1 C.S.E., Technocrats Institute of Technology, Bhopal, M.P., India
2 Dept. of C.S.E., Technocrats Institute of Technology, Bhopal, M.P., India
3 Dept. of C.S.E., Technocrats Institute of Technology, Bhopal, M.P., India

Given the rapid growth and success of public information sources on the World Wide Web, it is increasingly attractive to extract data from these sources and make it available for further processing by end users and application programs. Data extracted from Web sites can serve as the springboard for a variety of tasks, including information retrieval (e.g., business intelligence), event monitoring (news and stock markets), and electronic commerce (comparison shopping). Extracting structured data from Web sites is not a trivial task. Most of the information on the Web today is in the form of Hypertext Markup Language (HTML) documents, which are viewed by humans with a browser. A sophisticated method to organize the layout of the information and assist user navigation is therefore particularly important. Data extraction is the process of retrieving data from data sources for further data processing. Online data exists in the form of web records: depending on the end user's query, web databases generate query results, which are presented as query result pages. The main objective of this paper is to extract and align important data from different domains with the help of HTML tags and their values. After the data is extracted, a Self Organizing Map (SOM) groups the extracted data from different domains into clusters. Clustering is the process of grouping physical or abstract objects into classes of similar objects.

Keywords

Data Extraction, Data Record Alignment, Clustering, QRR, SOM.
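
The pipeline outlined in the abstract (extract tag/value pairs from HTML, then cluster with a SOM) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the word-frequency feature vectors, the grid size, and the decay schedules below are all assumptions made for illustration only.

```python
import math
import random
import re
from collections import Counter
from html.parser import HTMLParser


class PairExtractor(HTMLParser):
    """Collects (tag, text value) pairs from an HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.pairs = []   # extracted (tag, value) pairs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray end tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        value = data.strip()
        if value:
            self.pairs.append((self.stack[-1] if self.stack else "root", value))


def extract_pairs(html):
    parser = PairExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


def tokenize(value):
    return re.findall(r"[a-z]+", value.lower())


def build_vocabulary(pages_pairs):
    """Sorted list of all words seen in the extracted values."""
    return sorted({w for pairs in pages_pairs for _, v in pairs for w in tokenize(v)})


def to_vector(pairs, vocabulary):
    """Assumed feature choice: relative word frequency over the vocabulary."""
    counts = Counter(w for _, v in pairs for w in tokenize(v))
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in vocabulary]


def best_matching_unit(weights, x):
    """Grid node whose weight vector is closest to x (squared Euclidean distance)."""
    return min(weights, key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))


def train_som(data, rows=2, cols=2, steps=500, lr0=0.5, seed=0):
    """Online SOM training with linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2
    for t in range(steps):
        frac = t / steps
        lr = lr0 * (1 - frac)
        sigma = sigma0 * (1 - frac) + 0.1   # keep the radius strictly positive
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            h = math.exp(-d2 / (2 * sigma ** 2))   # Gaussian neighborhood
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])
    return weights


if __name__ == "__main__":
    pages = [
        "<h1>Book</h1><p>Author and price of the book</p>",
        "<h1>Flight</h1><p>Departure and arrival of the flight</p>",
    ]
    pairs = [extract_pairs(p) for p in pages]
    vocab = build_vocabulary(pairs)
    vectors = [to_vector(p, vocab) for p in pairs]
    som = train_som(vectors)
    for page, vec in zip(pages, vectors):
        print(best_matching_unit(som, vec), page[:30])
```

Pages that land on the same grid node are treated as one cluster. A real system would likely need richer features and the record alignment step, which this sketch omits.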

Data Mining and Knowledge Engineering