Improved Fair Scheduling Algorithm for Hadoop Clustering
Traditional ways of storing very large amounts of data are inconvenient, because processing that data at later stages is tedious. Hadoop is therefore now widely used to store and process large data sets. The volume of data generated has grown sharply in recent years, and most steeply in the last two. Hadoop is a framework for storing and processing such data efficiently: it processes data in parallel, and its fault tolerance protects against failures and data loss. Job scheduling is an important process in Hadoop MapReduce. Hadoop comes with three schedulers, namely FIFO (first in, first out), Fair and Capacity, and the scheduler is now a pluggable component of the Hadoop MapReduce framework. This paper discusses the native job scheduling algorithms in Hadoop. The fair scheduling algorithm is analysed in terms of its response time, throughput and performance, and its advantages and drawbacks are discussed. An improved fair scheduling algorithm based on a new strategy is then proposed. Response time, throughput and performance are measured and compared for the native fair scheduling algorithm and the improved one. The improved fair scheduling algorithm is intended for workloads that mix jobs with long and short processing times.
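To make the response-time comparison concrete, the following is a minimal Python sketch of why equal sharing helps short jobs in a mixed workload. It is an illustration only and is not the improved algorithm proposed in the paper. It assumes a single pool of capacity, all jobs arriving at time 0, and idealised equal sharing among unfinished jobs; the function names and the job durations are hypothetical.

```python
def fifo_response_times(durations):
    """Finish time of each job when jobs run one at a time in arrival order."""
    clock = 0.0
    finish = []
    for d in durations:
        clock += d
        finish.append(clock)
    return finish


def fair_response_times(durations):
    """Finish time of each job when every unfinished job gets an equal share.

    Idealised model (an assumption): with n unfinished jobs, each one
    progresses at rate 1/n of the total capacity.
    """
    remaining = {i: float(d) for i, d in enumerate(durations)}
    finish = [0.0] * len(durations)
    clock = 0.0
    while remaining:
        n = len(remaining)
        shortest = min(remaining.values())
        # At rate 1/n, the shortest remaining job needs shortest * n time units.
        clock += shortest * n
        done = {i for i, r in remaining.items() if r == shortest}
        for i in done:
            finish[i] = clock
        # Every surviving job made the same progress as the shortest one.
        remaining = {i: r - shortest
                     for i, r in remaining.items() if i not in done}
    return finish


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    jobs = [10, 1, 1]  # one long job submitted ahead of two short ones
    print("FIFO mean response:", mean(fifo_response_times(jobs)))
    print("Fair mean response:", mean(fair_response_times(jobs)))
```

In this toy workload both policies finish all work at the same time, so throughput is unchanged, but the short jobs no longer wait behind the long one and the mean response time drops. The cost is that the long job finishes later than it would under FIFO.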