Create a Dimension Transform
A Dimension Transform specifies the mapping for the ETL which will load the Dimension.
- Select the Add button on the ETL tab. The ‘Transform & Load’ dialog appears.
Overview Tab
- In the Transformation Name field enter “Task“. The Name is used to uniquely identify the Transform and is used by the standard generation templates to name the Transform stored procedure. The name must be unique across all Dimensions.
- In the ETL Pattern field select “Type_1_and_2_Dimension“. There is only one standard pattern. The “Type 1 and 2 Dimension” pattern caters for both Type 1 and Type 2 Slowly changing dimension attributes in the same dimension.
- In the Transform Overview enter “This is the Task Dimension Transform“. The Transform Overview is a description of the Transform for documentation purposes.
Staging Sources
Defines the Staging tables and columns that are the source of the transform.
The staging source can be a single primary staging table or the combination of a primary staging table and multiple secondary staging tables. The primary Staging table should be the staging table that has the same grain as your destination Dimension table.
Primary Staging Table
- In the Primary Staging Table dropdown select the “Task” staging table. The Dimension will have the same grain as the primary source. The Staging Table Column List is automatically populated with all the fields from the Primary Staging table. Delete any columns that are not required.
Further information
- Edit Button. Click this button to specify further details for the Primary Staging Table. The Source Staging Table dialog appears.
- Select the Primary Staging Table columns that are required for the transform.
- Filter. In some cases, you may want to filter the data returned for the primary Staging table in the Transform process. Define the Filter, using an Expression. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name. See the Expression Language section for details of how to refer to Staging and Warehouse tables and columns. The filter is applied in the Where clause of the generated Source Query.
- Tag Table. Define custom metadata for the Primary Staging table.
Secondary Staging Tables
If data is required from another Staging table in the transform, define the table in the Secondary Staging Tables table, and how it joins to the primary (or other secondary).
- Add Join Button. Click this button to specify details for the Secondary Staging Table. The Source Staging Table dialog appears.
- Staging Table Name. Select the Secondary Staging table from the drop-down.
- Join On. Write an expression describing how the Secondary table joins to either the Primary Staging table, or other Secondary Staging tables. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name.See the Expression Language section for details of how to refer to Staging and Warehouse tables and columns. The Join on Expression is used in the LEFT JOIN ON clause of the generated Source Query. The Generated Source query is generated as a LEFT JOIN to the Secondary Staging table. If you require a JOIN, then use a ‘Secondary Staging table key IS NOT NULL’ filter, to eliminate rows where there is no matching Secondary Staging table record.
- Filter. In some cases, you may want to filter the data returned for the Secondary Staging table in the Transform process. Define the Filter, using an Expression. The filter is added to the Where clause of the generated Source query. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name. See the Expression Language section for details of how to refer to Staging and Warehouse tables and columns.
- Tag Table. Define custom metadata for the Primary Staging table.
Match On
The Match On tab defines the columns of the Staging source tables that match the business key(s) of the target Fact/Dimension for the purposes of identifying new and deleted source records. In the standard generated code, where a match is found, a further comparison is made based on checksums, to determine if a change has been made to the target row.
There are 2 ways to define the match:
- Use the Match button below the bottom right of the table to automatically match Staging to Target Business key based on Name.
- Select the Match manually.
- Staging Table. Select the Staging table that contains the Staging column.
- Staging Column. Select the matching Staging Column.
- … Button. Click the … button if it is necessary to define an expression on columns of the Staging source to match to the target business key. The expression language is T-SQL. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name. See the Expression Language section for details of how to refer to Staging and Warehouse tables and columns. The expression should be written as if it were in the SELECT list of a T-SQL Query against the Staging database. E.g. Select expression FROM Staging table.
- Filter By. In rare cases, perhaps when you have defined multiple transforms on a Dimension, you may need to filter the Dimension table, to match the incoming Staging rows for that transformation. You can define the filter, by specifying a Where clause on the Dimension table. The Filter is a T-SQL expression written as if it were part of a Where clause e.g. SELECT * FROM Dimension WHERE Expression. Reference Dimension table columns by Name.
Transformation Tab
The transformation tab allows you to map Staging table columns to Attributes during the transformation.
There are 2 ways to define the mapping:
- Use the Auto Match link below the bottom right of the table to automatically match Staging to Target Attributes based on Name.
- Select the Mapping manually.
- Double click a row or click Edit Mapping button to change the mapping for the column, define an expression mapping, and add a description to the mapping for documentation purposes. Click the button and the Transform Mapping dialog appears.
- Staging Column Mapping. Select the Staging table and Staging Column mapping for the Target Attribute from the dropdowns.
- Write an expression for the mapping. Note the expression can only reference Staging columns. The expression language is T-SQL. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name. See the Expression Language section for details of how to refer to Staging and Warehouse tables and columns. The expression should be written as if it were in the SELECT list of a T-SQL Query against the Staging database. E.g. SELECT expression FROM Staging_Table.
- Transform Mapping Description. A description of the mapping for documentation purposes. Non Mandatory.
- Transform Mapping Tags. Define custom metadata for the mapping.
Advanced Tab
- Transform Tags. Define custom Design/Metadata for the Transform.
Auto Generate Key. If the surrogate key for the Dimension is a ‘smart’ key (like 20100616 for a Calendar dimension), select false, and then specify the source Staging column that contains the ‘smart’ key. See the
- Defining a Dimension with Smart Keys section for more information.
Multiple Transforms
A Dimension typically only has one Transform but can have multiple. You may want to define multiple transforms if you want to load rows from multiple source tables. Effectively a UNION of data from multiple source staging tables rather than a JOIN.
When you define multiple Transforms you must define the filter by condition for the Dimension, so each Transform only operates on the subset of data in the Dimension it is responsible for updating. Typically you would add a “SourceSystemId” column on Dimension, and use that to differentiate data from one staging source or the other.
You need to filter the Dimension table, to only match the incoming Staging rows of that transformation. You can define the filter, by specifying a value in the “Optionally filter the Dimension” field on the ‘Match On’ tab of the Transform and Load dialog. The Filter is a T-SQL expression written as if it were part of a Where clause e.g. SELECT * FROM Dimension WHERE Expression. Fully qualify references to the dimension table columns. E.g. dim.Dim_Table.ColumnName.
E.g. If you had defined a SourceSystemId attribute on your Dimension use a filter like this:
- dim.dim_Table.SourceSystemId = ‘CC’
Where ‘CC’ is the value you are mapping to the SourceSystemId in this Transform. In another Transform, it might be “PC’ or ‘AR’ or 1,2,3 or whatever value you use by convention to identify that source system.