
Wavelet Tree based Hybrid Geo-Textual Indexing Technique for Geographical Search


Affiliations
1 Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad – 201009, Uttar Pradesh, India
2 Department of Computer Science, Jaypee Institute of Information Technology, Noida – 201309, Uttar Pradesh, India
 

Background/Objectives: There is significant commercial and research interest in location-based search for search engines. Searching for keywords associated with one or more locations (geographic references) requires geographical web search and ranking on the basis of both spatial and textual relevance. This type of search therefore requires both spatial and textual indexing.

Methods/Statistical Analysis: This paper uses a new spatial-textual hybrid indexing technique, based on the Wavelet Tree (WT), to handle point and region queries for Geographical Information Retrieval. Here, the WT data structure is used for both textual and spatial indexing. Minimum Bounding Rectangles (MBRs) of different geographical points (latitude, longitude) are created to design the hybrid index. Searching for textual keywords requires an inverted index, which is also built using a wavelet tree. In addition, a spatial-textual relevance scheme is used to retrieve relevant documents for end users.

Findings: The algorithm has been implemented in order to measure its performance in terms of search time. Approximately 40,000 Wikipedia pages were crawled and stored in a database, along with the geographical coordinates (latitude, longitude) of locations in India, to construct the MBRs of these locations. The results show that the performance of the wavelet tree based hybrid index improves as query length increases. For short queries, the B/R* tree performs better, but for longer queries the wavelet tree based hybrid index outperforms the other techniques. Precision and recall of web documents have also been calculated using the hybrid index. Precision and recall vary with query length, which shows that they are preserved while search time is reduced.

Applications/Improvement: Our algorithm outperforms the existing algorithms in terms of both simplicity of implementation and search time. In future work, we will propose a compression technique for the hybrid index to minimize the space it occupies, which should further improve search time for documents with single as well as multiple geographical references.
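
The abstract does not spell out how the wavelet tree is built or queried, so the following is only a minimal sketch of the general data structure, not the authors' implementation. It assumes a sequence of integer term IDs (a hypothetical encoding of keywords) and shows the `rank` operation, which counts occurrences of a symbol in a prefix of the sequence and is the basic primitive an inverted index built on a wavelet tree would rely on. The class name `WaveletTree` and the helper `count_in_range` are illustrative names introduced here.

```python
class WaveletTree:
    """Pointer-based wavelet tree over a sequence of integers (illustrative sketch)."""

    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            # Root call: the alphabet is the value range of the sequence.
            lo, hi = min(seq), max(seq)
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        # prefix[i] = number of values <= mid among the first i values.
        self.prefix = [0]
        if lo == hi or not seq:
            return  # leaf node, or nothing to split
        mid = (lo + hi) // 2
        for v in seq:
            self.prefix.append(self.prefix[-1] + (1 if v <= mid else 0))
        self.left = WaveletTree([v for v in seq if v <= mid], lo, mid)
        self.right = WaveletTree([v for v in seq if v > mid], mid + 1, hi)

    def rank(self, symbol, i):
        """Return the number of occurrences of `symbol` in seq[0:i]."""
        if i == 0 or symbol < self.lo or symbol > self.hi:
            return 0
        if self.lo == self.hi:
            return i  # every value reaching this leaf equals `symbol`
        mid = (self.lo + self.hi) // 2
        went_left = self.prefix[i]
        if symbol <= mid:
            return self.left.rank(symbol, went_left)
        return self.right.rank(symbol, i - went_left)


def count_in_range(tree, symbol, start, end):
    """Occurrences of `symbol` in seq[start:end], via two rank queries."""
    return tree.rank(symbol, end) - tree.rank(symbol, start)


# Hypothetical term-ID sequence; in a real index the IDs would come from
# the document collection's vocabulary.
term_ids = [3, 1, 4, 1, 5, 1, 2, 6]
wt = WaveletTree(term_ids)
occurrences = count_in_range(wt, 1, 2, 6)  # term 1 within positions 2..5
```

Each `rank` call descends one root-to-leaf path, so its cost depends on the alphabet size (logarithmically) rather than on the sequence length. This sketch stores plain prefix-count lists instead of compressed bitmaps, so it illustrates the query logic only and not the space savings of a succinct implementation.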

Keywords

Hybrid-indexing, Indexing, Information Retrieval, Wavelet Tree

Authors

Arun Yadav
Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad – 201009, Uttar Pradesh, India
Divakar Yadav
Department of Computer Science, Jaypee Institute of Information Technology, Noida – 201309, Uttar Pradesh, India




DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i33%2F123324