Enhancing HiveQL Engine Using Map-Join-Reduce

Amruta Kulkarni ¹, Shweta C. Dharmadhikari ², M. Emmanuel ³

Affiliations
1 Pune Institute of Computer Technology College, Pune, Maharashtra, India
2 Department of Information Technology, PICT, Pune, India
3 PICT College of Engineering, Pune, India

Hive is a data warehouse system for Hadoop that facilitates easy data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL. At the same time, this language allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.
This extensibility of HiveQL allows MapReduce to be enhanced to Map-Join-Reduce, which motivates a detailed study of the resulting performance improvement.
In MapReduce, the programmer is only required to write specialized map and reduce functions as part of the job; the framework takes care of the rest. However, MapReduce suffers from a performance issue, mainly due to its sequential data processing strategy, which frequently checkpoints and shuffles intermediate results. MapReduce can therefore be improved to increase scalability and efficiency.
The proposed solution is Map-Join-Reduce, which removes the burden of presenting complex join algorithms to the system. We first propose a filter-join-aggregate mathematical model, which is an extension of the MapReduce model. To support this model, we present a Map-Join-Reduce architecture design for the HiveQL engine. This design sheds light on the query processing strategy of the Hive and Hadoop systems.
The benefit of this approach is that checkpointing and shuffling of intermediate results are minimized, which in turn improves system performance.
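
The filter-join-aggregate idea can be illustrated with a minimal in-memory sketch. This is not the authors' implementation and does not use Hive or Hadoop APIs; the function name `filter_join_aggregate`, the three toy relations, and their field names are all hypothetical. It only shows the shape of the model: rows are filtered, joined against several relations in one pass, and folded directly into an aggregate, so no joined intermediate table is written out between steps.

```python
from collections import defaultdict

# Hypothetical toy relations; all names and values are illustrative only.
ORDERS = [
    {"cust_id": 1, "amount": 50.0},
    {"cust_id": 2, "amount": 5.0},
    {"cust_id": 1, "amount": 20.0},
    {"cust_id": 3, "amount": 30.0},
]
CUSTOMERS = [
    {"cust_id": 1, "region_id": 10},
    {"cust_id": 2, "region_id": 20},
    {"cust_id": 3, "region_id": 20},
]
REGIONS = [
    {"region_id": 10, "name": "West"},
    {"region_id": 20, "name": "East"},
]


def filter_join_aggregate(orders, customers, regions, min_amount):
    """Filter orders, join them with two other relations, and sum per region."""
    # Filter: drop rows that cannot contribute to the result.
    kept = [o for o in orders if o["amount"] >= min_amount]

    # Build hash indexes on the join keys of the other two relations.
    cust_by_id = {c["cust_id"]: c for c in customers}
    region_by_id = {r["region_id"]: r for r in regions}

    # Join + aggregate: each surviving order is matched against both
    # indexes and added straight to the running total for its region.
    # No intermediate joined table is materialized between the two joins.
    totals = defaultdict(float)
    for o in kept:
        cust = cust_by_id.get(o["cust_id"])
        if cust is None:
            continue  # inner-join semantics: skip unmatched orders
        region = region_by_id.get(cust["region_id"])
        if region is None:
            continue
        totals[region["name"]] += o["amount"]
    return dict(totals)


if __name__ == "__main__":
    print(filter_join_aggregate(ORDERS, CUSTOMERS, REGIONS, min_amount=10.0))
```

With the sample data and a threshold of 10.0, the order of 5.0 is filtered out and the totals come to 70.0 for West and 30.0 for East. A chained two-way-join formulation would instead produce the orders-customers join as a separate result before joining with regions, which is the kind of intermediate checkpointing and shuffling the abstract aims to reduce.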

Keywords

CPU and Memory Analysis, Hadoop, HiveQL.

Data Mining and Knowledge Engineering
