
Survey on Clustering Algorithms for Text Mining



Authors

Dawlat A. Sayed
Department of Information Science & Engineering, University of Paris, France
Sohair R. Fahmy
Department of Information Science & Engineering, University of Paris, France

Abstract


Clustering is the process of grouping similar data objects together according to similarity criteria (i.e., shared properties). Document clustering is typically treated as a centralized process and can be applied in two ways: online or offline. Of the two, online applications are generally more limited than offline ones because of availability issues. Document clustering supports a variety of tasks, such as grouping documents by domain, analyzing customer feedback, and finding meaningful hidden topics across a collection. The data used for clustering is first normalized. In terms of efficiency and accuracy, K-means produces better results than the other algorithms considered.

Keywords


Clustering, K-Means, Hierarchical, Expectation-Maximization, Density-Based Algorithm, Normalization.
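
As a rough illustration of the pipeline the abstract describes (normalize the data, then cluster it with K-means), the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization, L2-normalized term-frequency vectors, and plain Lloyd's iterations; the helper names (`tf_vectors`, `sq_dist`, `kmeans`) and the four toy documents are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def tf_vectors(docs):
    """Turn each document into an L2-normalized term-frequency vector."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vec = [float(counts[w]) for w in vocab]
        # Normalization step: scale every vector to unit length.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical customer-feedback snippets on two topics.
docs = [
    "refund late delivery refund",
    "late delivery package refund",
    "great camera battery life",
    "battery life great screen",
]
labels = kmeans(tf_vectors(docs), k=2)
print(labels)
```

On this toy input the two delivery complaints should share one label and the two product reviews the other; which group receives label 0 depends on the random initialization. A real text-mining setup would more likely use TF-IDF weighting and a tuned choice of `k`, which this sketch leaves out.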