A Novel Approach for Mining Web Documents Based on Bayesian Learning Classifier Systems

M. Deepa; P. Tamijeselvy

M. Deepa ¹, P. Tamijeselvy ²

Affiliations
1 Kalaignar Karunaidhi Institute of Tech, India
2 VLBJCET, India

Abstract

Web mining is a new area of data mining. Because the web is one of the largest repositories of data, using data mining to analyze and explore regularities in web user behavior can improve system performance and enhance the quality and delivery of Internet information services to the end user. Clustering and classification are active areas of machine learning research that promise to help us cope with the problem of information overload on the Internet. BIRCH is a clustering algorithm designed to operate under the assumption that "the amount of memory available is limited, whereas the dataset can be arbitrarily large". The algorithm generates "a compact dataset summary" while minimizing the I/O cost involved. The effects of noise and uncertainty are also major issues in web mining. Traditionally, probability is used to measure the uncertainty in a system. The Bayesian approach uses Bayes' theorem to combine existing beliefs with new evidence in order to form new beliefs, and Bayesian inference is regarded in the literature as a robust method for dealing with noise and uncertainty. We therefore propose a modification of UCS that uses a Bayesian update. This method achieves higher accuracy than UCS and requires only half of the learning time to converge. The algorithm thus minimizes the outliers involved, and the summaries it produces contain enough information to apply the well-known SMOKA (smoothened k-means) clustering algorithm to the set of summaries and generate the partitions of the original dataset. We expect the proposed method to work more quickly because it reduces the time required to explore a search space and find a correct action for a condition.
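
As an illustration of the belief update described in the abstract, the following minimal Python sketch applies Bayes' theorem to a single classifier rule. It is not the authors' update rule, which the abstract does not specify. The class name RuleBelief, the Beta prior, and the uniform starting counts are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RuleBelief:
    """Beta(alpha, beta) belief about how often a rule predicts the correct class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    alpha: float = 1.0  # assumed prior pseudo-count of correct predictions
    beta: float = 1.0   # assumed prior pseudo-count of incorrect predictions

    def update(self, correct: bool) -> None:
        # With a Bernoulli likelihood and a Beta prior, Bayes' theorem
        # reduces to incrementing the matching pseudo-count.
        if correct:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def expected_accuracy(self) -> float:
        # Posterior mean of the rule's accuracy.
        return self.alpha / (self.alpha + self.beta)


# New evidence: the rule was right three times and wrong once.
belief = RuleBelief()
for outcome in [True, True, False, True]:
    belief.update(outcome)

print(belief.expected_accuracy)
```

Starting from a uniform prior, each observation shifts the belief only slightly, so a single noisy example cannot overturn an estimate built from many earlier ones. This is one way to read the abstract's claim that Bayesian inference is robust to noise.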

Keywords

Algorithms: BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), Bayes' Theorem, K-Means Algorithm, BCS.

Data Mining and Knowledge Engineering
