A data warehouse (DW) integrates data from heterogeneous data sources and creates a consolidated view of the data that is optimized for reporting and analysis. Business and technology evolve constantly, which directly affects the data sources: new sources emerge while existing ones become unavailable. A DW or data mart built on these sources needs to reflect such changes. Various solutions for adapting a data warehouse to changes in the data sources and in the business requirements have been proposed in the literature [1]. However, research on DW evolution has focused mainly on managing changes in the dimensional model, while other aspects, related to the ETL and to maintaining the history of changes, have not been addressed. This paper presents a Meta Data Vault model that includes a data-vault-based data warehouse and a master data management (MDM) component. A major focus of this research is to keep both the history of changes and a "single version of the truth" through an MDM integrated with the DW. The paper also outlines the load patterns used to load data into the data warehouse and the materialized views used to deliver data to end users. To test the proposed model, we used large data sets from the biomedical field; for each modification of the data source schema, we outline the changes that need to be made to the enterprise data warehouse (EDW), the data marts, and the ETL.
Keywords
Data Warehouse (DW), Enterprise Data Warehouse (EDW), Business Intelligence, Data Vault (DV), Business Data Vault, Master Data Vault, Master Data Management (MDM), Data Mart, Materialized View, Schema Evolution, Data Warehouse Evolution, ETL, Metadata Repository, Relational Database Management System (RDBMS), NoSQL.