
Concept-Based Indexing in Text Information Retrieval


Affiliations
1 Department of Computer Science, Mouloud Mammeri University of Tizi-Ouzou, Algeria
2 Limose Laboratory, Department of Computer Science, M'Hamed Bouguera, University of Boumerdes, Algeria
 

Traditional information retrieval systems rely on keywords to index documents and queries. In such systems, documents are retrieved based on the number of keywords they share with the query. This lexical-focused retrieval leads to inaccurate and incomplete results when different keywords are used to describe the documents and the queries. Semantic-focused retrieval approaches attempt to overcome this problem by relying on concepts rather than keywords for indexing and retrieval. The goal is to retrieve documents that are semantically relevant to a given user query. This paper addresses this issue by proposing a solution at the indexing level. More precisely, we propose a novel approach for semantic indexing based on concepts identified from a linguistic resource. In particular, our approach relies on the joint use of the WordNet and WordNetDomains lexical databases for concept identification. Furthermore, we propose a semantic-based concept weighting scheme that relies on a novel definition of concept centrality. The resulting system is evaluated on the TIME test collection. Experimental results show the effectiveness of our proposal over traditional IR approaches.

Keywords

Information Retrieval, Concept-Based Indexing, Concept Weighting, Word Sense Disambiguation, WordNet, WordNetDomains.
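
The keyword mismatch problem described in the abstract can be illustrated with a minimal sketch. It is not the indexing method proposed in the paper: the `TOY_LEXICON` mapping, its concept identifiers, and the helper functions are hypothetical stand-ins for a real lexical resource such as WordNet, and the sketch performs no word sense disambiguation and no centrality-based weighting.

```python
from collections import Counter

# Hypothetical toy lexicon mapping surface words to concept identifiers.
# It stands in for a real lexical resource; the entries are made up.
TOY_LEXICON = {
    "car": "C_vehicle",
    "automobile": "C_vehicle",
    "doctor": "C_physician",
    "physician": "C_physician",
}


def keyword_index(text):
    """Index a text as a bag of lowercase keywords."""
    return Counter(text.lower().split())


def concept_index(text):
    """Index a text as a bag of concepts.

    Words found in the toy lexicon are replaced by their concept
    identifier; all other words are kept unchanged.
    """
    return Counter(TOY_LEXICON.get(word, word) for word in text.lower().split())


def overlap(index_a, index_b):
    """Count the distinct index terms shared by two indexes."""
    return len(set(index_a) & set(index_b))


doc = "the physician bought an automobile"
query = "doctor car"

# Keyword matching finds nothing in common, although the texts are related.
keyword_score = overlap(keyword_index(doc), keyword_index(query))
# Concept matching maps synonyms to the same identifier before comparing.
concept_score = overlap(concept_index(doc), concept_index(query))

print(keyword_score, concept_score)  # prints "0 2"
```

In this toy setting the document and the query share no keywords but share two concepts, which is the kind of gap that semantic indexing is meant to close.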

Abstract Views: 246

PDF Views: 131





Authors

Fatiha Boubekeur
Department of Computer Science, Mouloud Mammeri University of Tizi-Ouzou, Algeria
Wassila Azzoug
Limose Laboratory, Department of Computer Science, M'Hamed Bouguera, University of Boumerdes, Algeria
