PageRank Using MapReduce-An Open-Source Framework for Processing Large Data Sets

N. Rehna ¹, N. Minni ², F. Jasmine Natchial ³

Affiliations
1 Department of MCA, Bharathiyar College of Engineering and Technology, Karaikal, Puducherry, India
2 Department of Computer Science, Avvaiyar Government College for Women, Karaikal, Puducherry, India
3 Department of MCA, Bharathiyar College of Engineering and Technology, Karaikal, Puducherry, India

MapReduce is a simple data-parallel programming model designed for scalability and fault tolerance when processing and generating large data sets. It was initially created by Google to simplify the development of large-scale web search applications in data centers, and it has been proposed as the basis of a ‘data center computer’. Many real-world tasks are expressible in this model. In this paper, a PageRank algorithm for a hyperlink graph is introduced using the MapReduce technique and illustrated with the random web surfer model. The algorithm computes the PageRank of web pages that are distributed in the cloud. In this work, the Hyperlink Graph Page Rank (HGPR) algorithm is developed; with it, the PageRanks can be computed and the most visited web pages can then be identified.
Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. This allows programmers without any experience in parallel and distributed systems to easily use the resources of a large distributed system.
The implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable. A typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use.
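
To make the idea of PageRank as a map step followed by a reduce step concrete, the following is a minimal single-process sketch in Python. It is not the HGPR algorithm from the paper, whose details are not given in this abstract. It shows only the standard random-surfer PageRank update over an adjacency list, written in a map/reduce style. The function names, the damping factor of 0.85, and the three-page toy graph are assumptions made for illustration. The sketch also assumes that every page appears as a key in the adjacency list and that no page is without out-links.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed damping factor; the abstract does not state a value


def map_page(page, rank, out_links):
    """Map step: pass the link list along and emit an equal rank share per out-link."""
    yield page, ("links", out_links)
    for target in out_links:
        yield target, ("rank", rank / len(out_links))


def reduce_page(page, values, num_pages):
    """Reduce step: sum the incoming shares and apply the random-surfer update."""
    incoming = sum(payload for kind, payload in values if kind == "rank")
    return (1.0 - DAMPING) / num_pages + DAMPING * incoming


def pagerank_iteration(graph, ranks):
    """One map -> shuffle -> reduce pass, simulated in a single process."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for page, out_links in graph.items():
        for key, value in map_page(page, ranks[page], out_links):
            grouped[key].append(value)
    return {page: reduce_page(page, values, len(graph))
            for page, values in grouped.items()}


def pagerank(graph, iterations=20):
    """Start from a uniform distribution and repeat the pass a fixed number of times."""
    ranks = {page: 1.0 / len(graph) for page in graph}
    for _ in range(iterations):
        ranks = pagerank_iteration(graph, ranks)
    return ranks


if __name__ == "__main__":
    # Hypothetical toy hyperlink graph: page -> list of pages it links to.
    toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, rank in sorted(pagerank(toy_graph).items()):
        print(page, round(rank, 4))
```

In a real MapReduce deployment, the grouping done here with a dictionary would be handled by the framework's shuffle phase, and each iteration would typically run as a separate job whose output feeds the next one.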

Keywords

Adjacency List, Cloud Computing, Damping Factor, HGPR Algorithm, MapReduce, PageRank (PR).

Data Mining and Knowledge Engineering