
Design and Implementation of a Two Level Scheduler for HADOOP Data Grids


Affiliations
1 Department of Computer Science & Engineering, PSG College of Technology, Peelamedu-641004, Coimbatore, Tamil Nadu, India
 

Hadoop is a large-scale distributed processing infrastructure designed to handle data-intensive applications. In a commercial large-scale cluster, a scheduler distributes user jobs evenly among the cluster resources. Hadoop's existing fair scheduler queues jobs for execution in a fine-grained manner using task-level scheduling. The proposed work enhances it by allowing backfilling of jobs submitted to the scheduler, so that both job-level and task-level scheduling are enabled. Jobs are scheduled fairly across users, pools, and priorities. As a result, short, narrow jobs are executed in a free slot when sufficient resources are not available for larger jobs. Shorter jobs therefore complete sooner than under the existing fair scheduling policy, which schedules tasks based on the fairness of their remaining execution time. This approach prevents the starvation of smaller jobs if sufficient resources are available.

Keywords

Hadoop, Scheduling, Fair Share Scheduler, Backfilling.
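
The backfilling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Job` class, the `pick_jobs` function, and the slot counts are hypothetical names and values chosen for illustration, the queue is assumed to be already ordered by fairness, and runtime estimates and any reservation for the blocked larger job are omitted.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    slots_needed: int  # hypothetical "width" of the job, in task slots


def pick_jobs(queue, free_slots):
    """Walk a fairness-ordered queue and launch every job that fits.

    A job that needs more slots than are currently free is skipped,
    and narrower jobs behind it are allowed to use (backfill) the
    free slots instead.
    """
    launched = []
    for job in queue:
        if job.slots_needed <= free_slots:
            launched.append(job)
            free_slots -= job.slots_needed
    return launched, free_slots


# Illustrative queue: one wide job ahead of two narrow ones.
queue = [Job("large", 8), Job("small-a", 2), Job("small-b", 1)]
launched, remaining = pick_jobs(queue, free_slots=4)
print([job.name for job in launched], remaining)
```

With four free slots, the wide job cannot start, so the two narrow jobs are launched and one slot stays free. A fuller scheduler would also need some way to guarantee that the wide job eventually runs, which this sketch does not attempt.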

Authors

G. Sudha Sadhasivam
Department of Computer Science & Engineering, PSG College of Technology, Peelamedu-641004, Coimbatore, Tamil Nadu, India
M. Anjali
Department of Computer Science & Engineering, PSG College of Technology, Peelamedu-641004, Coimbatore, Tamil Nadu, India
