Data Mining Approaches for Web Spam Detection
Web spam is a serious problem for search engines because the presence of spam pages can severely degrade the quality of their results. In this paper, we present an efficient spam detection system based on a classifier that combines new link-based features with language-model (LM)-based ones. Specifically, we apply the Kullback-Leibler divergence to different combinations of these sources of information in order to characterize the relationship between two linked pages. The system first applies a hybrid clustering that combines K-means and SVM, and then classifies pages with C4.5 using qualified link-based features and LM-based ones. The result is an accurate system for detecting Web spam using fewer features.
Keywords
Content Analysis, Information Retrieval, Language Models (LMs), Link Integrity, Web Spam Detection.
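
The abstract's idea of using the Kullback-Leibler divergence to characterize the relationship between two linked pages can be illustrated with a minimal sketch. This is not the paper's implementation: the unigram language model, the additive smoothing constant `mu`, whitespace tokenization, and the function names `unigram_lm`, `kl_divergence`, and `link_divergence` are all assumptions made for illustration. The sketch only shows the general technique of comparing the text at the source of a link (for example, its anchor text) with the text of the target page.

```python
import math
from collections import Counter


def unigram_lm(text, vocab, mu=1.0):
    """Additively smoothed unigram language model over a shared vocabulary.

    Smoothing (mu > 0) is an assumption of this sketch; it keeps every
    probability positive so the divergence below stays finite.
    """
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + mu * len(vocab)
    return {w: (counts[w] + mu) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) in bits; p and q must be defined on the same words."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)


def link_divergence(source_text, target_text):
    """Divergence between the language models of a link's two ends.

    A higher value suggests the source text describes the target page
    poorly, which could serve as one feature for a spam classifier.
    """
    vocab = set(source_text.lower().split()) | set(target_text.lower().split())
    p = unigram_lm(source_text, vocab)
    q = unigram_lm(target_text, vocab)
    return kl_divergence(p, q)


if __name__ == "__main__":
    # Hypothetical texts, invented for this example only.
    anchor = "cheap flights to paris"
    on_topic = "book cheap flights and hotels in paris"
    off_topic = "win casino bonus poker jackpot now"
    print(link_divergence(anchor, on_topic))
    print(link_divergence(anchor, off_topic))
```

In this toy example the anchor text diverges less from the on-topic page than from the off-topic one. In a system like the one described, such a value would presumably be one of several features passed to the classifier, alongside the link-based ones.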