A Survey on the Evolution of Models of Data Integration
Over time, a succession of data integration models has been proposed to manage and analyze data. With the emergence of big data, the database community has proposed newer solutions for managing such large and disparate data. Changes in data storage models and the growth of massive data repositories on the web have further driven the need for novel data integration models. In this article, we survey the trends in integrating data through different models. We present a brief overview of Federated Database Systems, Data Warehouses, Mediators, and the more recently proposed Polystore Systems, tracing the evolution of architecture, query processing, distribution, automation, and supported data models across these integration models. We also present the similarities and differences among these models and discuss the novelty of Polystore Systems through various examples. Finally, we highlight the importance of such systems for integrating large-scale heterogeneous data.
Keywords
Data Integration, Multi-database Systems, Polystore Systems