Pre-Requisites of Big Data and Hadoop
Data is growing at a rate that is difficult even to capture in today's world, driven by the continuous uploading of data to the internet. Social networking websites are the main sources of this huge volume of data, and its collection happens at a great pace. "Big Data" refers to collections of data sets characterized by volume (their extreme size), variety (the different kinds of data they contain), and velocity (the speed at which the data are collected). Such extreme amounts of data are difficult to handle with traditional software packages, so a new and efficient method became necessary, and Hadoop came into existence to meet this need. Hadoop is an open source project that enables distributed processing of large data sets using a simple programming model, along with a high degree of fault tolerance. Hadoop consists of two main parts: HDFS, for storing large amounts of data, and MapReduce, for analyzing it. With the help of this Java-based software, i.e. Hadoop, it becomes easy to handle even petabytes or zettabytes of data.
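To make the MapReduce programming model concrete, below is a minimal sketch of the classic word-count job written against Hadoop's standard org.apache.hadoop.mapreduce API. It is an illustrative example, not the method of this paper; the input and output HDFS paths are assumed to arrive as command-line arguments.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every word in the input split.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each distinct word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation before the shuffle
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // assumed: HDFS input path
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // assumed: HDFS output path
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Because the mapper runs independently on each input split stored in HDFS, the same job scales from a single file to petabytes of data without changes to the code, which is the fault-tolerant, distributed model the abstract describes.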