- Data Warehouse
Data warehouse is also known as Enterprise Data Warehouse (EDW). Data warehouse is used as source for Business Intelligent’s reporting and analysis. Data Warehouse system collects data from multiple sources and contains historical data for trend analysis reporting. ETL tool is used mostly to build Data Warehouse and interfaces around it. Data Warehouse acts as Single Version of truth.
2. Operational Data Store (ODS)
Operational Data Store is frequently confused and definition is overlapped with Data Warehouse. Some of my clients had used word ODS instead of Data Warehouse, which got me confused on number of occasion. As per my understanding & research, ODS is used to integrate data from multiple systems and feed it to Data Warehouse. Data Warehouse consist of complete history of data, whereas ODS contains latest or recent data (short window of data). Data load frequency in ODS is mostly hourly whereas data load frequency in Data Warehouse mostly is nightly because of data volume. Most important reason to have ODS in your company is ability to run report realtime, where source system does not have required reporting capabilities.
3. Data Mart
Data warehouse can contain many Data Marts. Mostly Data mart is created per business line or system that needs data from Data Warehouse. Indirectly we can say, Data Mart is access layer used to get data out of Data Warehouse by other systems.
4. Data Lake
Term Data Lake was coined by James Dixon, CTO of Pentaho to compare with Data Mart. As per James, Data Mart have several problems mostly related to data silos. Data Lake is method of storing data from sources in its actual or raw format that could be Relational Data, XML, flat files or even binary files. Other tools like ETL, access Data Lake as per need for reporting or analysis purposes.