The Kimball method of Data Warehouse design describes a Data Warehouse Bus Matrix Architecture, and the concept of conformed dimensions. Put simply, the concept of conformed dimensions states that “Data Marts” within an Organisation should share common dimensions to enable cross Data Mart/Enterprise/Business process analysis.
In the following image are two data warehouses, Data Warehouse A (DWA) and Data Warehouse B (DWB). DWA could be another Dimodelo Data Warehouse Studio data warehouse or a data warehouse built outside of Dimodelo Data Warehouse Studio. DWA contains a Client Dimension. DWB wants to associate its facts with the Client dimension from DWA. To achieve that, DWB needs to stage the Client Dimension from DWA and replicate it in DWB, so it’s available for look up when creating DWB facts. However, it’s not necessary to replicate all the attributes of the Client dimension in DWA. Only those attributes required for the fact lookup are required.
The aim is to have a relationship between the DWB facts and the DWA client dimension via the Client surrogate key in the DWA Client dimension. Therefore it is important that the DWB replica of the DWA Client dimension retains the same surrogate keys.
The Dimension in DWA and the Fact in DWB are “brought back together” in the semantic layer. Because the foreign key to the Dimension in the DWB Fact uses the surrogate keys from the DWA Dimension, the join works seamlessly.
When conforming to a Dimension in another Data Warehouse, and that Dimension has type 2 attributes, then there will be a row for each version of each dimension member, meaning each business key may be duplicated across Dimension rows. In the first instance, its important that the replica Dimension contains exactly the same rows as the source Dimension, with the same surrogate keys and effective dates. To achieve this its important to follow the first approach described here.
In other cases, where the source Dimension is from another Data Warehouse that is not controlled by Dimodelo Data Warehouse Studio, and that other Dimension doesn’t contain surrogate keys, or effective dates, or doesn’t support type 2 attributes, then a it may not be possible to exactly replica the source Dimension. In this case the second approach described here applicable.
This approach is applicable if the source Dimension supports Surrogate keys, Effective Dates and soft Deletes (i.e. a deleted indicator). Dimodelo Data Warehouses fall into this category.
The first step is to bring the DWA Client Dimension into the staging layer of DWB.
To achieve this:
Use source dimension from DWA as the source of the Extract. The Row_Effective_Date from the source Dimension as the change identifier in the Incremental Extract pattern. This means only new rows added to the source Dimension in DWA are extracted and added to the Persistent Staging table in DWB.
Map the business key and surrogate key in mappings as normal.
In the Management Columns tab, map the Row_Effective_Date to the Row_Effective_Date of the Source dimension in DWA. This ensures that the effective dates of the source and conformed Dimension remain in sync.
Map the Row_Is_Deleted management column to the expression “CASE WHEN Row_Is_Current = 1 THEN 0 ELSE 1 END”. This ensures the Row_Is_Current flag on the Source Dimension in DWA and Conformed Dimension in DWB is kept is sync.
The second step is to define the DWB replica Dimension. It is important that the surrogate key name, data type and content matches the DWA dimension.
Create a replica Dimension. When defining the DWB Dimension attributes, only add the DWA Business Key as an attribute, and set it as the business key in DWB. Don’t add the surrogate key as an attribute.
Create a transform for the DWB dimension with the staged DWA dimension as its single primary staging table.
Match on business key.
Transform:
Map the surrogate key of the DWA dimension to the smart surrogate key (on the advanced tab) of the DWB dimension. Again – see the ‘Defining a Dimension with Smart Keys’ topic for more information about how to do that.
The Fact transform is exactly the same as any other fact transform. The Dimension will exist in DWB with the required surrogate keys and business keys. When you are defining the staging sources of the fact, you will need to have data that you can use to look up the dimension based on business key (in this example Client_ID). That may not always be straightforward, because the Client dimension has come from a different data warehouse and therefore likely a different data domain. In this case it may be necessary to stage additional attributes of the dimension to facilitate lookups by facts.
The two data warehouses ‘join back up’ in the semantic layer. When defining an SSAS Cube or tabular model add the Fact from the DWB and the Dimension from the DWA to the model. Associate them via the fact foreign key to dimension surrogate key. The fact will have derived its foreign key from the DWB Client dimension, which will have the same surrogate keys as the DWA Client dimension.
There is now a processing dependency between DWA and DWB. The DWA Client Dimension must be processed prior to the staging process in DWB executing. You can do this via workflows and jobs that are discussed further in the batch management lesson.
This approach applies if the source Dimension is not from a Dimodelo controlled Data Warehouse, and doesn’t support either:
In this scenario its not possible to exactly replicate the source dimension, but still, it will be possible to get a very close approximation. Much of the approach is the same except:
If the source Dimension is not from a Dimodelo controlled Data Warehouse, then the surrogate keys may use a different naming standard. The replica Dimension must use the same name for the surrogate key as in the source Dimension. The name of the surrogate key on the replica Dimension can be set on the advanced tab of the Dimension editor. E.g.