
Locality Aware Scheduling Using Prefetching Technique in Hadoop


Affiliations
1 Computer Engineering at GTU PG School, Ahmedabad, Gujarat, India
2 IT Department at Pune Institute of Computer Technology, Pune, Maharashtra, India
     



Hadoop is a rapidly growing ecosystem of components for implementing Google's MapReduce algorithms in a scalable fashion on commodity hardware. Hadoop enables users to store and process large volumes of data and to analyze it in ways not previously possible with less scalable solutions or standard SQL-based approaches. MapReduce offers a convenient programming model for big data processing. Data locality is a primary concern in MapReduce, both to improve performance and to decrease network traffic. Many algorithms exist for improving performance based on data locality, but they still have shortcomings, and more work remains to be done in this area. "Moving computation is cheaper than moving data." Following this Hadoop principle, data locality is an effective performance metric for efficient computation. The proposed system takes a new approach to achieving data locality in the map phase. A task is assigned to the requesting node if that node holds the task's data locally. If the requesting node does not hold the data locally, the data is prefetched to it from the nearest node. The progress of the node is considered when deciding when to start prefetching. This approach is intended to improve performance through faster computation and reduced network traffic.
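The scheduling decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use any Hadoop API. All names (`Node`, `MapTask`, `schedule`, `PREFETCH_THRESHOLD`) are hypothetical, the 0/2/4 distance values are an assumption modelled on rack-aware topologies, and the rule that prefetching starts once the node's current task passes a progress threshold is only one possible reading of "we consider progress of node to start prefetching".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    rack: str
    blocks: set = field(default_factory=set)  # ids of data blocks stored on this node
    progress: float = 1.0  # fraction of the current task completed (1.0 = idle)


@dataclass
class MapTask:
    task_id: str
    block: str  # id of the input block this map task reads


# Assumed value: start prefetching once the running task is 80% complete.
PREFETCH_THRESHOLD = 0.8


def distance(a: Node, b: Node) -> int:
    """Assumed topology distance: same node 0, same rack 2, other rack 4."""
    if a.name == b.name:
        return 0
    return 2 if a.rack == b.rack else 4


def nearest_holder(block: str, requester: Node, nodes: list):
    """Return the node closest to the requester that stores the block, or None."""
    holders = [n for n in nodes if block in n.blocks]
    if not holders:
        return None
    return min(holders, key=lambda n: distance(requester, n))


def schedule(requester: Node, pending: list, nodes: list):
    """Decide what to do when `requester` asks for a map task."""
    # Case 1: a pending task whose input block is already local is assigned directly.
    for task in pending:
        if task.block in requester.blocks:
            return ("assign_local", task.task_id, requester.name)

    # Case 2: no local data. Prefetch the block from the nearest holder,
    # but only once the node has progressed far enough on its current work.
    if pending and requester.progress >= PREFETCH_THRESHOLD:
        task = pending[0]
        source = nearest_holder(task.block, requester, nodes)
        if source is not None:
            requester.blocks.add(task.block)  # simulate the block copy
            return ("prefetch", task.task_id, source.name)

    # Otherwise the node waits for a later scheduling round.
    return ("wait", None, None)


if __name__ == "__main__":
    n1 = Node("n1", "rack1", {"b1"}, progress=0.9)
    n2 = Node("n2", "rack1", {"b2"})
    n3 = Node("n3", "rack2", {"b2"})
    print(schedule(n1, [MapTask("t2", "b2")], [n1, n2, n3]))
```

In this sketch the prefetched block becomes local to the requesting node, so a later scheduling round would treat the task as a local assignment. A real scheduler would also need to handle replication, concurrent requests, and failed transfers, which are omitted here.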

Keywords

Hadoop, MapReduce, Data Locality, Prefetching.

Authors

Utsav Prajapati
Computer Engineering at GTU PG School, Ahmedabad, Gujarat, India
Shyam Deshmukh
IT Department at Pune Institute of Computer Technology, Pune, Maharashtra, India
