
Query Optimization for Big Data Analytics


Affiliations
1 Seidenberg School of CSIS, Pace University, White Plains, New York, United States
 

Organizations adopt different databases for big data, which is huge in volume and spans different data models. Querying big data is challenging yet crucial for any business. Data warehouses traditionally built with Online Transaction Processing (OLTP)-centric technologies must be modernized to scale with the ever-growing demand for data. Because requirements change rapidly, near-real-time responses from the gathered data are important so that the business decisions needed to address new challenges can be made in a timely manner. The main focus of our research is to improve the performance of query execution for big data.

Keywords

Databases, Big data, Optimization, Analytical Query, Data Analysts and Data Scientists.


Abstract Views: 392

PDF Views: 162





Authors

Manoj Muniswamaiah
Seidenberg School of CSIS, Pace University, White Plains, New York, United States
Tilak Agerwala
Seidenberg School of CSIS, Pace University, White Plains, New York, United States
Charles Tappert
Seidenberg School of CSIS, Pace University, White Plains, New York, United States

