
WordNet Based Concept Weight Using Semantic Relation for Clustering Documents


Affiliations
1 Department of Computer Engineering, Marwadi Education Foundation Group of Institute, Rajkot, India
2 Department of Computer Engineering, Marwadi Education Foundation Group of Institutions, Rajkot, India
     



This paper presents a novel technique that combines conventional clustering techniques with information extracted from WordNet. Traditional clustering algorithms take one of two approaches to document clustering. The first treats each document as a bag of words and considers every term independent, ignoring the semantic relationships between words. The second determines semantics using WordNet. The proposed technique follows the second approach, using several semantic relations (identity, synonym, direct hypernym, and meronym) that are weighted in the order identity > synonym > direct hypernym > meronym. Concepts are weighted by generating chains of related concepts. Using WordNet in this way yields a low-dimensional vector space, which allows an efficient clustering technique to be built. The proposed technique can improve cluster quality and achieves a lower-dimensional vector space than other techniques.

Keywords

Document Clustering, K-Means Algorithm, WordNet, Concept Weighting, Synonym, Hypernym, Meronym.
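
The concept-weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the numeric weights, the toy relation table standing in for WordNet, and the names `WEIGHTS`, `TOY_RELATIONS`, `relation`, and `concept_vector` are all assumptions made for illustration. Only the ordering identity > synonym > direct hypernym > meronym comes from the abstract.

```python
# Assumed numeric weights; the abstract fixes only their ordering
# (identity > synonym > direct hypernym > meronym), not their values.
WEIGHTS = {"identity": 1.0, "synonym": 0.8, "hypernym": 0.5, "meronym": 0.3}

# Hypothetical toy lexicon standing in for WordNet lookups.
# Keys are (term, concept) pairs; values name the relation between them.
TOY_RELATIONS = {
    ("automobile", "car"): "synonym",
    ("vehicle", "car"): "hypernym",  # "vehicle" treated as a direct hypernym of "car"
    ("wheel", "car"): "meronym",     # "wheel" treated as a part of "car"
}


def relation(term, concept):
    """Return the relation name linking term and concept, or None."""
    if term == concept:
        return "identity"
    return TOY_RELATIONS.get((term, concept)) or TOY_RELATIONS.get((concept, term))


def concept_vector(tokens):
    """Fold related terms into one weighted concept per chain.

    Simplification: the first term seen in a chain becomes its
    representative concept, so the result depends on token order.
    """
    weights = {}
    for term in tokens:
        best = None  # (concept, relation) with the strongest relation so far
        for concept in weights:
            rel = relation(term, concept)
            if rel and (best is None or WEIGHTS[rel] > WEIGHTS[best[1]]):
                best = (concept, rel)
        if best:
            weights[best[0]] += WEIGHTS[best[1]]
        else:
            weights[term] = WEIGHTS["identity"]  # start a new concept chain
    return weights


doc = ["car", "automobile", "wheel", "vehicle", "road", "car"]
vector = concept_vector(doc)
print(vector)
```

In this toy document, five distinct terms collapse into two concept dimensions (`car` and `road`), which is the kind of dimensionality reduction the abstract refers to. The resulting concept vectors could then be passed to a standard clustering algorithm such as K-means.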

Authors

Apeksha Charola
Department of Computer Engineering, Marwadi Education Foundation Group of Institute, Rajkot, India
Sahista Machchhar
Department of Computer Engineering, Marwadi Education Foundation Group of Institutions, Rajkot, India
