
CLOAK-Reduce Load Balancing Strategy for MapReduce


Affiliations
1 Department of Mathematics and Computer Science, Nazi Boni University, Bobo-Dioulasso, Burkina Faso
 

The advent of Big Data has brought new processing and storage challenges, which are often addressed through distributed processing.

Distributed systems are inherently dynamic and unstable, so it is realistic to expect that some resources will fail during use. Load balancing and task scheduling are important factors in determining the performance of parallel applications, hence the need for load balancing algorithms adapted to grid computing.

In this paper, we propose a dynamic, hierarchical load balancing strategy that operates at two levels: intra-scheduler load balancing, which avoids using the large-scale communication network, and inter-scheduler load balancing, which regulates the load across the whole system. The strategy improves the average response time of CLOAK-Reduce application tasks with minimal communication.
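The two-level idea can be illustrated with a minimal Python sketch. It is not the CLOAK-Reduce algorithm itself, which the abstract does not detail. The function names `balance` and `two_level_balance`, the `tolerance` parameter, and the greedy rule of moving one task at a time are all assumptions made for illustration. Loads are assumed to be integer task counts, and every scheduler is assumed to manage at least one node.

```python
from statistics import mean


def balance(loads, tolerance=1):
    """Even out integer task counts in `loads` (name -> count), in place.

    Repeatedly moves one task from the most loaded entry to the least
    loaded one until their gap is at most `tolerance` (assumed >= 1).
    Returns the number of tasks moved.
    """
    moves = 0
    while True:
        busiest = max(loads, key=loads.get)
        idlest = min(loads, key=loads.get)
        if loads[busiest] - loads[idlest] <= tolerance:
            return moves
        loads[busiest] -= 1
        loads[idlest] += 1
        moves += 1


def two_level_balance(groups, tolerance=1):
    """Balance `groups` (scheduler -> {node -> task count}) in place.

    Returns (local_moves, remote_moves): tasks moved inside a scheduler
    versus tasks moved between schedulers.
    """
    # Level 1 (intra-scheduler): each scheduler balances its own nodes,
    # so these moves never cross the large-scale network.
    local_moves = sum(balance(nodes, tolerance) for nodes in groups.values())

    # Level 2 (inter-scheduler): only if average loads still differ by
    # more than `tolerance` does a task travel between schedulers.
    remote_moves = 0
    while True:
        avg = {s: mean(nodes.values()) for s, nodes in groups.items()}
        src = max(avg, key=avg.get)
        dst = min(avg, key=avg.get)
        if avg[src] - avg[dst] <= tolerance:
            return local_moves, remote_moves
        src_nodes, dst_nodes = groups[src], groups[dst]
        src_nodes[max(src_nodes, key=src_nodes.get)] -= 1
        dst_nodes[min(dst_nodes, key=dst_nodes.get)] += 1
        remote_moves += 1


# Hypothetical example: two schedulers with two nodes each.
groups = {"S1": {"a": 8, "b": 0}, "S2": {"c": 2, "d": 2}}
print(two_level_balance(groups))  # (4, 1)
```

In this hypothetical example, four tasks are moved locally inside scheduler S1 and only one task crosses between schedulers, which is the kind of outcome the intra-scheduler level is meant to favor.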

We focus first on three performance indicators: the response time, process latency, and running time of MapReduce tasks.


Keywords

Big Data, Distributed Processing, Load Balancing, CLOAK-Reduce, Task Allocation.



Authors

Mamadou Diarra
Department of Mathematics and Computer Science, Nazi Boni University, Bobo-Dioulasso, Burkina Faso
Telesphore Tiendrebeogo
Department of Mathematics and Computer Science, Nazi Boni University, Bobo-Dioulasso, Burkina Faso

