Hadoop Mapreduce Performance Enhancement Using In-Node Combiners

Woo-Hyun Lee; Hee-Gook Jun; Hyoung-Joo Kim

Abstract

While advanced analysis of large datasets is in high demand, data sizes have surpassed the capabilities of conventional software and hardware. The Hadoop framework distributes large datasets over multiple commodity servers and performs parallel computations. We discuss the I/O bottlenecks of the Hadoop framework and propose methods for enhancing I/O performance. A proven approach is to cache data to maximize the memory locality of all map tasks. We introduce an approach to optimizing I/O, the in-node combining design, which extends the traditional combiner to the node level. The in-node combiner reduces the total number of intermediate results and curtails network traffic between mappers and reducers.
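The effect of combining at the node level can be illustrated with a small simulation. The sketch below is not the authors' implementation and does not use the Hadoop API; it assumes a word-count job, and the function names and the `NODE_SPLITS` layout are hypothetical. It counts how many intermediate records would leave the mappers with no combiner, with a traditional per-task combiner, and with combining across all map tasks on a node.

```python
from collections import Counter


def map_task(split):
    """Emit one (word, 1) pair per token, as a plain word-count mapper would."""
    return [(word, 1) for word in split.split()]


def combine(pairs):
    """Sum the counts per key, which is what a word-count combiner does."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def count_intermediate(node_splits, mode):
    """Return how many records would be sent on to the reducers.

    node_splits: hypothetical layout, one list of input splits per node.
    mode: "none", "per_task" (traditional combiner) or "per_node" (in-node).
    """
    sent = 0
    for splits in node_splits:
        outputs = [map_task(s) for s in splits]
        if mode == "none":
            # Every raw (word, 1) pair is shipped.
            sent += sum(len(o) for o in outputs)
        elif mode == "per_task":
            # Each map task combines only its own output.
            sent += sum(len(combine(o)) for o in outputs)
        elif mode == "per_node":
            # All map tasks on the node share one combining step.
            merged = [pair for o in outputs for pair in o]
            sent += len(combine(merged))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return sent


# Assumed toy input: two nodes, each running two map tasks.
NODE_SPLITS = [
    ["a b a", "a b c"],
    ["c c a", "b b b"],
]

if __name__ == "__main__":
    for mode in ("none", "per_task", "per_node"):
        print(mode, count_intermediate(NODE_SPLITS, mode))
```

On this toy input the record count drops from 12 with no combiner to 8 with per-task combining and 6 with per-node combining. The numbers only illustrate the direction of the effect; they say nothing about the gains measured in the paper.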

Keywords

Big Data, Hadoop, MapReduce, NoSQL, Data Management.

AIRCC's International Journal of Computer Science and Information Technology