An Improved Method for Handling and Extracting Useful Information from Big Data

N. Karthick; X. Agnes Kalarani


Abstract

Objectives: The objective of this method is to extract meaningful information from a large volume of data and present it to users in aggregated form. The work mines data gathered from multiple sites in order to give those sites useful information for improving their performance.

Method: Data aggregation plays an important role in big data environments, where extracting useful information from a large volume of data is complex. Existing work uses computation-based partitioning and aggregation (CP-A) to divide big data into multiple partitions, within which aggregation is performed. However, it does not consider the content similarity between data items, which can degrade the accuracy of the aggregation result. To overcome this problem, the proposed methodology introduces hybridized content and computation aware partitioning and aggregation (HCCP-A). First, the big data is divided into multiple partitions according to both content and computation properties. After partitioning, a data de-duplication technique is applied to eliminate repeated data in every partition. Partitioning and de-duplication are performed in the mapper stage. The output of each mapper node is passed to an aggregator node, which performs the aggregation. Finally, the aggregation results from the aggregator nodes are fused in the reducer node.

Results: HCCP-A extracts useful information from large volumes of data in a summarized format. The aggregation results were compared with those of the existing CP-A methodology in terms of three performance metrics: error rate, execution time, and CPU utilization. Experiments were conducted across different data sizes. They showed that the proposed methodology performs better than the existing methodology on all of these measures.

Conclusion: The findings demonstrate that data aggregation using HCCP-A achieves higher accuracy and a lower error rate than the previous methodologies.
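
The mapper, aggregator, and reducer stages described in the Method paragraph can be illustrated with a small single-process sketch. The abstract does not say how content similarity or computation cost is measured, so everything specific here is an assumption made for illustration: records are hypothetical (text, value) pairs, "same content" is approximated by an identical set of lowercase tokens, "computation aware" is approximated by balancing partition sizes, and aggregation is a simple sum and count. It is not the authors' implementation.

```python
from collections import defaultdict


def content_key(text):
    """Hypothetical content signature: the sorted set of lowercase tokens."""
    return " ".join(sorted(set(text.lower().split())))


def partition(records, num_partitions):
    """Content- and computation-aware partitioning (illustrative only).

    Records with the same content signature are kept together, and each
    group is placed on the currently smallest partition to balance load.
    """
    groups = defaultdict(list)
    for text, value in records:
        groups[content_key(text)].append((text, value))
    parts = [[] for _ in range(num_partitions)]
    for group in sorted(groups.values(), key=len, reverse=True):
        min(parts, key=len).extend(group)
    return parts


def deduplicate(part):
    """Keep the first record seen for each content signature."""
    seen = set()
    unique = []
    for text, value in part:
        key = content_key(text)
        if key not in seen:
            seen.add(key)
            unique.append((text, value))
    return unique


def aggregate(part):
    """Aggregator stage: partial sum and count for one partition."""
    return sum(value for _, value in part), len(part)


def fuse(partials):
    """Reducer stage: combine the partial results into one summary."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    mean = total / count if count else 0.0
    return {"total": total, "count": count, "mean": mean}


def run_pipeline(records, num_partitions):
    """Mapper (partition + de-duplicate), then aggregate, then fuse."""
    mapped = [deduplicate(p) for p in partition(records, num_partitions)]
    return fuse([aggregate(p) for p in mapped])
```

Because records with the same signature always land in the same partition, de-duplicating inside each partition is enough to remove them globally, which is one plausible reason to make the partitioning content aware before aggregating.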

Keywords

Aggregation, Big Data, Data Duplication, Data Fusion, Partitioning.


Indian Journal of Science and Technology
