
A Site Rank-Based Swarming Ordering Approach


Affiliations
1 Department of Computer Application, Government Geetanjali Girls PG College, Bhopal, India
 

Search engines play a major role in discovering information today. Because of limits on network bandwidth and hardware, a search engine cannot obtain all of the information on the web and has to download the most essential pages first.

In this paper, we propose a swarming (crawl) ordering strategy based on SiteRank and compare it with several other swarming ordering strategies. All four strategies improve on naive swarming to some degree. At the beginning of the swarming process, every strategy crawls pages with high PageRank: once 48% of the pages have been downloaded, the sum of PageRank is over 58% even for the worst strategy. In the later phase of swarming, the sum of PageRank grows slowly and finally reaches unity. The objective of these strategies is to download the most essential pages early in the crawl. Experimental results indicate that the SiteRank-based strategy works efficiently in discovering essential pages when page quality is evaluated by PageRank.
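The ordering idea and the evaluation measure described above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: it assumes that a score per site is already available (how SiteRank is computed is not shown here), that a site is identified by its host name, and that the frontier simply prefers URLs from higher-scored sites. The names `site_of`, `crawl_order` and `cumulative_pagerank`, and the toy link graph, are hypothetical.

```python
import heapq
from urllib.parse import urlsplit


def site_of(url):
    """Treat the host name as the site (an assumption of this sketch)."""
    return urlsplit(url).netloc


def crawl_order(seeds, links, site_rank):
    """Return the order in which pages are fetched when the frontier is
    prioritised by the score of each page's site (higher score first).

    links maps a URL to the URLs it points to; site_rank maps a site to an
    assumed, precomputed score.
    """
    frontier = []   # min-heap of (-site score, discovery index, url)
    seen = set()
    counter = 0     # tie-breaker: keeps discovery order among equal scores

    def push(url):
        nonlocal counter
        if url not in seen:
            seen.add(url)
            score = site_rank.get(site_of(url), 0.0)
            heapq.heappush(frontier, (-score, counter, url))
            counter += 1

    for url in seeds:
        push(url)

    order = []
    while frontier:
        _, _, url = heapq.heappop(frontier)
        order.append(url)                  # stands in for downloading the page
        for target in links.get(url, []):  # newly discovered out-links
            push(target)
    return order


def cumulative_pagerank(order, pagerank):
    """Fraction of the total PageRank collected after each download."""
    total = sum(pagerank.values())
    running = 0.0
    fractions = []
    for url in order:
        running += pagerank.get(url, 0.0)
        fractions.append(running / total)
    return fractions


# Toy data, invented purely for illustration.
links = {
    "http://a.example/1": ["http://b.example/1", "http://a.example/2"],
    "http://a.example/2": ["http://c.example/1"],
    "http://b.example/1": ["http://b.example/2"],
}
site_rank = {"a.example": 0.5, "b.example": 0.3, "c.example": 0.2}
pagerank = {
    "http://a.example/1": 0.4,
    "http://a.example/2": 0.2,
    "http://b.example/1": 0.2,
    "http://b.example/2": 0.1,
    "http://c.example/1": 0.1,
}

order = crawl_order(["http://a.example/1"], links, site_rank)
fractions = cumulative_pagerank(order, pagerank)
```

Plotting `fractions` against the share of pages downloaded gives the kind of curve the abstract refers to: a good ordering rises steeply at first and then flattens as it approaches 1.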


Keywords

Web Crawler, Swarming Ordering Strategy, Web Page Importance, SiteRank.

Authors

Maya Ram Atal
Department of Computer Application, Government Geetanjali Girls PG College, Bhopal, India
Roohi Ali
Department of Computer Application, Government Geetanjali Girls PG College, Bhopal, India
Ram Kumar
Department of Computer Application, Government Geetanjali Girls PG College, Bhopal, India
Rajendra Kumar Malviya
Department of Computer Application, Government Geetanjali Girls PG College, Bhopal, India
