Language Independent Document Retrieval Using Unicode Standard

M. Vidhya; S. Aji

Vol 6, No 4 (2014)
Pages: 195-204
Published: 2014-08-01

Affiliations
1 Department of Computer Science, University of Kerala, Thiruvananthapuram, Kerala, India

Abstract

In this paper, we present a method for retrieving documents that contain unstructured text written in different languages. Unlike ordinary document retrieval systems, the proposed system can also process queries whose terms are in more than one language. Unicode, the universally accepted encoding standard, is used to represent the data on a common platform when the text is converted into the Vector Space Model. The experiments yielded notable F-measure values irrespective of the languages used in the documents and queries.
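
The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain term-frequency weights, cosine similarity for ranking, and tokenization driven by Unicode character categories, none of which the abstract specifies. The function names (`tokenize`, `tf_vector`, `cosine`, `rank`, `f_measure`) and the sample documents are hypothetical.

```python
import math
import unicodedata
from collections import Counter


def tokenize(text):
    """Split text into terms using Unicode general categories (assumed scheme)."""
    text = unicodedata.normalize("NFC", text.casefold())
    terms, current = [], []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cf":
            # Skip invisible format characters such as zero-width joiners.
            continue
        if category[0] in ("L", "M", "N"):
            # Letters, combining marks and digits belong to a term in any script.
            current.append(ch)
        elif current:
            terms.append("".join(current))
            current = []
    if current:
        terms.append("".join(current))
    return terms


def tf_vector(text):
    """Represent a text as a sparse term-frequency vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return indices of documents with non-zero similarity, best match first."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(doc)), i) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: -pair[0])
    return [i for score, i in scored if score > 0]


def f_measure(retrieved, relevant):
    """Balanced F-measure: harmonic mean of precision and recall."""
    hits = len(set(retrieved) & set(relevant))
    if hits == 0:
        return 0.0
    precision = hits / len(set(retrieved))
    recall = hits / len(set(relevant))
    return 2 * precision * recall / (precision + recall)


# Hypothetical mini-collection: English, Malayalam, English.
documents = [
    "Kerala is a state in India",
    "കേരളം ഇന്ത്യയിലെ ഒരു സംസ്ഥാനമാണ്",
    "Paris is the capital of France",
]
# A query mixing terms from two languages.
results = rank("Kerala കേരളം", documents)
```

Because every term is handled as a sequence of Unicode code points, the same vector space holds terms from any script, and a mixed-language query needs no language detection or per-language pipeline. A real system would likely add weighting such as TF-IDF and language-aware normalization, which this sketch omits.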