QoS-Aware Data Replication in Hadoop Distributed File System
Cloud computing provides services using virtualized resources over the Internet on a pay-per-use basis. These services are delivered from a large number of interconnected data centers. A cloud system consists of commodity machines on which client data is stored. The probability of hardware failure and data corruption on these low-performance machines is high. For fault tolerance and to improve the reliability of the cloud system, the data is replicated across multiple machines.
The Hadoop Distributed File System (HDFS) is used for distributed storage in cloud systems. Data is stored in the form of fixed-size blocks of 64 MB. Data stored in HDFS is replicated on multiple machines to improve the reliability of the cloud system. HDFS uses a block replica placement algorithm to replicate each data block. In this algorithm, however, no QoS parameter for replicating a data block is specified between the client and the service provider in the form of a service level agreement.
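As a point of reference, the block size and replication factor of an HDFS file can be inspected and adjusted through the standard Hadoop FileSystem API; the sketch below uses that public API, with a placeholder file path.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Reads the block size and replication factor of an HDFS file and
// asks the NameNode for one extra replica. The path is a placeholder.
public class ReplicationInfo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/user/example/data.txt");
        FileStatus status = fs.getFileStatus(file);

        System.out.println("Block size (bytes): " + status.getBlockSize());
        System.out.println("Replication factor: " + status.getReplication());

        // Request one additional replica of this file's blocks.
        fs.setReplication(file, (short) (status.getReplication() + 1));
    }
}
```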
In this paper, a QoS-Aware Data Replication algorithm for HDFS is proposed that takes a QoS parameter into account when replicating data blocks. The QoS parameter considered is the application's expected replication time. A data block is replicated to remote-rack DataNodes that satisfy the application's replication time requirement. Compared with the existing algorithm, the proposed algorithm reduces the replication cost, thus improving the reliability and performance of the system.
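The following is a minimal, hypothetical sketch of the placement idea described above, not the paper's implementation: remote-rack DataNodes are chosen only if their estimated replication time (latency plus block transfer time) stays within the application's expected replication time. All class, field, and method names here are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public class QosReplicaSelector {

    /** Hypothetical view of a candidate DataNode. */
    static class DataNode {
        final String id;
        final String rack;
        final double bandwidthMBps;   // usable transfer rate to this node
        final double latencySeconds;  // network latency from the writer

        DataNode(String id, String rack, double bandwidthMBps, double latencySeconds) {
            this.id = id;
            this.rack = rack;
            this.bandwidthMBps = bandwidthMBps;
            this.latencySeconds = latencySeconds;
        }

        /** Simple estimate: latency plus block transfer time. */
        double estimatedReplicationTime(double blockSizeMB) {
            return latencySeconds + blockSizeMB / bandwidthMBps;
        }
    }

    /** Picks remote-rack DataNodes whose estimated replication time meets the QoS bound. */
    static List<DataNode> selectRemoteReplicaTargets(List<DataNode> candidates,
                                                     String localRack,
                                                     double blockSizeMB,
                                                     double expectedTimeSeconds,
                                                     int replicasNeeded) {
        List<DataNode> targets = new ArrayList<>();
        for (DataNode dn : candidates) {
            if (dn.rack.equals(localRack)) {
                continue; // only remote racks, for fault tolerance
            }
            if (dn.estimatedReplicationTime(blockSizeMB) <= expectedTimeSeconds) {
                targets.add(dn);
                if (targets.size() == replicasNeeded) {
                    break;
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<DataNode> cluster = List.of(
                new DataNode("dn1", "rack-A", 100.0, 0.01),
                new DataNode("dn2", "rack-B", 40.0, 0.05),
                new DataNode("dn3", "rack-C", 10.0, 0.20));

        // 64 MB block, application expects replication within 2 seconds.
        List<DataNode> chosen = selectRemoteReplicaTargets(cluster, "rack-A", 64.0, 2.0, 2);
        chosen.forEach(dn -> System.out.println("Selected: " + dn.id + " on " + dn.rack));
    }
}
```

In this toy run only dn2 qualifies: dn1 is on the writer's own rack and dn3's estimated transfer time exceeds the 2-second bound.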