We believe in the power of a Data Warehouse!
“The report of my death has been grossly exaggerated.” – Mark Twain.
Have you heard “The Data Warehouse is dead”? This claim appears from time to time, usually accompanied by the latest next big thing in data processing (Self Service BI, Big Data, Data Virtualization etc.). However, what these claims fail to understand is that a Data Warehouse transcends technology. At its core, it is a way of organising data to simplify the presentation of information for consumption by end-users. The star schema of a data warehouse simplifies the relationships users need to navigate, and it is still the structure that virtually every major BI tool eventually expects to interact with.
Over time, the technologies, data platforms and data processing engines change, and each brings its own improvements, but the fundamental value of the data warehouse remains.
The benefits of a Data Warehouse include:
- The ability to integrate data from multiple sources to provide analysis across business domains, for example across finance, HR, operations and sales.
- A single version of the truth. When users report directly from operational systems, each with their own queries, formulas and definitions, their reports become inconsistent with one another. Meetings become about measure definitions, instead of about strategy. A data warehouse provides consistent measures, periods, rollups, ranges, KPIs etc. across the business.
- History as it was. A data warehouse keeps historical data and lets you analyse the past as it stood at the time. For example, imagine a salesperson who works in region A. All of that salesperson’s sales roll up into the sales figures for region A. Now imagine the salesperson moves from region A to region B. If you report directly from the operational system, the salesperson’s past sales suddenly roll up into the sales figures for region B, which is undesirable. A Data Warehouse has methods of preventing this issue: past sales stay associated with region A, and new sales are attributed to region B.
- End-User Productivity. In any organization, there is a subset of people who spend part or all of their day producing information in one form or another. Typically they spend much of their time wrangling dirty data. With a data warehouse, that data manipulation is already done, and users can concentrate on analysing and responding to information, rather than producing it.
- Eliminate Personnel Risk. Business logic is quite often locked up in spreadsheets or visualizations (Power BI, Tableau, Qlik etc.) managed by individuals. If one of those people leaves, there may be no one left who can manage their spreadsheets. Centralizing your business logic (measures, KPIs etc.) means it can be managed and handed over in an orderly way.
- Remove load from operational systems. A single analytical query can cause major performance issues for an operational system. Offloading reporting and analysis to a data warehouse removes these adverse impacts on operational systems.
- Complex measures and data augmentation. A Data Warehouse provides consistent data augmentation for reporting purposes that isn’t available in a source system. Amongst other things, a data warehouse provides analytic functions like relative periods (e.g. MTD, YTD etc.), the definition of acceptable ranges, targets and KPIs, aggregation or disaggregation of data, and periodic balances (e.g. end of month balances).
- A Data Warehouse can keep historical data beyond the normal retention period of operational systems.
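The salesperson example in the list above can be made concrete with a small sketch. One common way a data warehouse preserves history is to keep a dated version of each salesperson record, a technique usually called a Type 2 slowly changing dimension. The class, function names and sample figures below are hypothetical and chosen only for illustration; they do not describe any particular product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SalespersonVersion:
    """One dated version of a salesperson record (hypothetical structure)."""
    salesperson_id: int
    region: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def change_region(history, salesperson_id, new_region, effective):
    """Close the current version and append a new one instead of overwriting."""
    for row in history:
        if row.salesperson_id == salesperson_id and row.valid_to is None:
            row.valid_to = effective
    history.append(SalespersonVersion(salesperson_id, new_region, effective))


def region_on(history, salesperson_id, sale_date):
    """Return the region the salesperson belonged to on the date of the sale."""
    for row in history:
        if (row.salesperson_id == salesperson_id
                and row.valid_from <= sale_date
                and (row.valid_to is None or sale_date < row.valid_to)):
            return row.region
    return None


def sales_by_region(history, sales):
    """Total sales per region, using the region that applied at sale time."""
    totals = {}
    for salesperson_id, sale_date, amount in sales:
        region = region_on(history, salesperson_id, sale_date)
        totals[region] = totals.get(region, 0) + amount
    return totals


# Illustrative data: the salesperson starts in region A, then moves to region B.
history = [SalespersonVersion(1, "A", date(2023, 1, 1))]
change_region(history, 1, "B", date(2024, 1, 1))

sales = [
    (1, date(2023, 6, 15), 100),  # made while in region A
    (1, date(2024, 3, 10), 250),  # made after the move to region B
]
totals = sales_by_region(history, sales)
print(totals)  # {'A': 100, 'B': 250}
```

Because the old version is closed rather than overwritten, the 2023 sale still rolls up to region A after the move.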
A Data Warehouse at the speed of BI
“A universal truth is data is messy and someone, somewhere, somehow needs to deal with it.”
It’s tempting to adopt other approaches like self-service BI. However, a universal truth is that data is messy and someone, somewhere, somehow needs to deal with it. Dealing with the mess in hundreds of slightly different one-off solutions over the same source data (self-service BI, spreadsheets etc.) is a recipe for the very chaos that data warehouses were first conceived to solve!
“The problem is not with the concepts of a Data Warehouse, rather the difficulty in building and maintaining them”
The difficulty of building and maintaining a data warehouse is the problem that Dimodelo Data Architect addresses. By focussing on Architecture and Automation, it allows the data engineering team to build a sustainable data asset in the time it would normally take to build self-service BI reports and analysis, without compromising on cohesion, history, risk and load.
We’re here for Data Warehouse developers!
“Unfortunately, many development teams don’t have the opportunity to build sustainable information assets for their businesses.”
Manually building and maintaining a data warehouse is complex. Developers manually create and manage hundreds of individual data entities and load processes to deliver data solutions like data warehouses. As the data warehouse’s size increases, so do its complexity and the size of the team needed to manage it. Over time, these solutions can become unsustainable. The build-up of technical debt impairs a team’s ability to deliver insights to the business in a timely, cohesive and cost-effective way. In response to this inertia, the team adopts quick-fix, one-off solutions to deliver BI, further exacerbating the issue.
“Architecture is an Asset, Code is Debt.”
In the data management industry, much of the focus is on the latest ETL tool or data platform to solve the difficult problems of Enterprise Data Management. Unfortunately, these tools don’t address the key issue of Data Warehousing: its maintainability.
Dimodelo is different. Dimodelo focuses on architecture and design. It is ETL and data platform agnostic, and it decouples design from implementation. With Dimodelo, instead of creating technical debt via code, data engineers build data assets through architecture and automation. Data Architects can properly direct technical teams and massively increase their productivity. These teams become truly agile, reducing the ‘time to market’ of information and insights to business users. Data warehouses become sustainable living assets that are future-proofed with portability across clouds and platforms.
“Our focus is on helping Data Engineers rise above the chaos and deliver powerful insights to the business.”
Using Dimodelo, a Data Architect defines a data management architecture (or uses a predefined one). Data engineers capture the data warehouse design in a tool that conforms to that architecture. A code generation engine then uses the design metadata and platform-specific generation templates to generate code for the targeted ETL and data platform. The generated code is deployed automatically, updating both the data platform and the ETL code. ETL orchestration is managed by Dimodelo.
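To illustrate the general idea of generating code from design metadata and platform-specific templates, here is a minimal sketch using only the Python standard library. It is not Dimodelo’s engine, metadata format or templates. The `design` dictionary, the `generic_sql` platform name and the `generate_ddl` function are all assumptions made up for this example.

```python
from string import Template

# Hypothetical design metadata: the design is captured once, as data.
design = {
    "table": "DimSalesperson",
    "columns": [
        ("SalespersonKey", "INT NOT NULL"),
        ("SalespersonName", "VARCHAR(100)"),
        ("Region", "VARCHAR(50)"),
    ],
}

# Hypothetical platform-specific templates, keyed by target platform.
# Supporting another platform would mean adding a template, not changing the design.
TEMPLATES = {
    "generic_sql": Template("CREATE TABLE $table (\n$columns\n);"),
}


def generate_ddl(design, platform):
    """Render the table design with the template for the chosen platform."""
    column_lines = ",\n".join(
        f"    {name} {sql_type}" for name, sql_type in design["columns"]
    )
    return TEMPLATES[platform].substitute(
        table=design["table"], columns=column_lines
    )


ddl = generate_ddl(design, "generic_sql")
print(ddl)
```

The point of the sketch is the separation: the design lives in metadata, while everything platform-specific lives in the templates, so the same design can be regenerated for a different target.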