The following video demonstrates how to create a Fact table in Dimodelo Data Warehouse Studio.
Creating a Fact Table
To create a Fact table, right click the ‘Fact Tables’ folder (or one of its sub folders) and select Create New.
Fact Table Tab
- Fact Table Name. The Fact Table Name is the name of your table in your data warehouse relational database. Spaces in the name will be converted to _ automatically when you save. The prefix for the table name can be set in the configuration utility.
- Fact Friendly Name. This is the Measure Group Name the user would see for the Fact Table through a Cube browser. Spaces are not replaced in this name.
- Fact Table Description. A description of the contents of the Fact Table. Optional. This is used when documentation is generated.
- Reference. This is a free form text you can use to reference the dimension design back to another design document, perhaps a business requirements document.
Fact Table Measures Tab
Import Schema
To Import the Meta Data.
- On the Fact Table Measures tab, click the Import Schema.
- Select the Source where the source table resides.
- Select the Table and press OK.
The Fact Table Measures will be populated with the column names and data types from the source table. Some connection types don’t return column data types.
By default the Staging database is available as a source. However, the Staging database needs to be generated and deployed, before the staging tables are available for import. Also be careful to delete staging management columns that are imported. E.g. surrogate key, reject indicator, batch execution id etc. They may conflict with the standard management columns of the fact.
Define Measures
On this tab you need to define both the Business Keys (Natural Keys) and Measures of the Fact. All other columns that may have been imported from a Source table should be deleted. Columns used to lookup dimension don’t need to be listed in this table.
- Measure Name. Enter the measure name. Spaces are replaced with _ when the Fact Table is saved. The Name must be unique within the set of Measures for the Fact.
- Data Type. Select a data type from the dropdown list, or just type the data type. The data type will be validated to ensure it is a valid SQL Server data type when the data type field loses focus. If the data type is invalid, a dialog box will display, and the field cleared.
- Description. Add a description of the measure. Generally the description should make sense to an end user. Descriptions are used in Data Dictionary generation.
- This is a free form text you can use to reference the measure back to another design document, or perhaps a report the measure appears on.
- Business Key. Tick the row(s) that represent the Fact Table’s business/natural key. This is used in the definition of the Transform/Load process for the table.
To Delete a Measure
Select the measure row and tap the delete key, or click the Delete Attribute(s) button. You can select multiple rows.
To Change a Measure
You can change the name, data type and description of a measure.
Associating Dimensions
In order to complete the star schema of the fact table you must associate it with one or more of the dimensions in the project. To associate a Fact Table to a dimension, use the Dimensionality tab of the Fact Table.
Select the dimension you want to associate to the fact table in the drop down list in the Dimension column. Click on another cell to ensure the association is recognized. If you intend to associate the dimension as a role play dimension, then give it a role play name. A dimension can be associated to a fact table multiple times, but only once without a role play name. That is, you can associate a dimension to a fact without a role play name once, and every other association must have a role play name and each role play name must be different.
Custom Management Columns
What are Management Columns? Management columns are columns that are added to your Fact and Dimension tables to help manage the data, during ETL. Management Columns include things like:
- Batch Identifier.
- Row Status (Current – Not Current).
- Row Effective Dates.
- Checksum (if you use checksums in your ETL)
Dimodelo Data Warehouse Studio allows you to specify Custom management columns on an individual Fact/Dimension basis. A set of Standard Management Columns is specified for all Fact/Dimension tables. See the Standard Management Columns section for an explanation of Standard Management Columns.
Custom Management columns can be used for Fact and Dimensions that have custom ETL that requires additional management columns for its operation.
Custom Management Columns can be added, changed and deleted through the Management Columns tab > Custom Management Columns table in the Fact and Dimensions editor.
Defining Fact Transforms
To define a Transform & Load process open the Fact document of the Fact table, and select the Add button on the ETL tab. The Transform & Load dialog appears.
Overview Tab
- Transformation Name. The transformation name will default to the name of the Fact table. The Name is used to uniquely identify the Transform and is used by the standard generation templates to name the Transform stored procedure. The name must be unique across all Dimensions and Facts.
- ETL Pattern. Select the code pattern that will be used as the basis for the generated code for this Transform. Patterns are discussed further in a later topic.
- Transform Overview. A description of the Transform for documentation purposes.
Staging Sources
Defines the Staging tables and columns that are the source of the transform.
The staging source can be a single primary staging table or the combination of a primary staging table and multiple secondary staging tables. The primary Staging table should be the staging table that has the same grain as your destination Dimension table.
Primary Staging Table
- Primary Staging Table. Select the Staging table that will be the primary source for your transform. The primary source set the grain of the destination Fact. There will be one row in the fact for each row in the primary staging table.
- Staging Table Column List. This list is automatically populated with all the fields from the Primary Staging table. Delete any columns that are not required.
- Edit Button. Click this button to specify further details for the Primary Staging Table. The Source Staging Table dialog appears.
- Select the Primary Staging Table columns that are required for the transform.
- Filter. In some cases, you may want to filter the data returned for the primary Staging table in the Transform process. Define the Filter, using an Expression. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name. The filter is applied in the Where clause of the generated Source Query.
- Tag Table. Define custom meta data for the Primary Staging table.
Secondary Staging Tables
If data is required from another Staging table in the transform, define the table in the Secondary Staging Tables table, and how it joins to the primary (or other secondary).
- Add Join Button. Click this button to specify details for the Secondary Staging Table. The Source Staging Table dialog appears.
- Staging Table Name. Select the Secondary Staging table from the drop down.
- Join On. Write an expression describing how the Secondary table joins to either the Primary Staging table, or other Secondary Staging tables. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name. The Join on Expression is used in the LEFT JOIN ON clause of the generated Source Query. The Generated Source query is generated as a LEFT JOIN to the Secondary Staging table. If you require a JOIN, then use a ‘Secondary Staging table key IS NOT NULL’ filter, to eliminate rows where there is no matching Secondary Staging table record.
- Filter. In some cases, you may want to filter the data returned for the Secondary Staging table in the Transform process. Define the Filter, using an Expression. The filter is added to the Where clause of the generated Source query. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name.
- Tag Table. Define custom meta data for the Secondary Staging table.
Match On
The Match On tab defines the columns of the Staging source tables that match to the business key(s) of the target Fact/Dimension for the purposes of identifying new and deleted source records. In the standard generated code, where a match is found, further comparison is made based on checksums, to determine if a change has been made to the target row.
There are 2 ways to define the match:
- Use the Match button below the bottom right of the table to automatically match Staging to Target Business key based on Name.
- Select the Match manually.
- Staging Table. Select the Staging table that contains the Staging column.
- Staging Column. Select the matching Staging Column.
- … Button. Click the … button if it is necessary to define an expression on columns of the Staging source to match to the target business key. The expression language is SSIS. An SSIS expression editor dialog appears. You can drag and drop column names and functions etc.
- Filter By. In Rare cases, perhaps when you have defined multiple transforms on a Dimension, you may need to filter the Fact table, to match the incoming Staging rows for that transformation. You can define the filter, by specifying a Where clause on the Fact table. The Filter is a T-SQL expression written as if it were part of a Where clause e.g. SELECT * FROM Fact WHERE Expression.
Dimension Lookups
Use this tab to define how the Transform should look-up the corresponding member in the associated Dimension based on Staging columns.
- Double click a row or click the Edit Mapping Button to open the Dimension Lookup.
There are 2 ways to define the lookup using a where statement.
- Use the match button at the bottom right of the table to automatically match the Staging to target Dimension business key columns.
- Define the match manually.
- Business Key. The Business Key column is populated from the list of business keys for the Dimension.
- Staging Table. Select the Staging Table that contains the column that maps to the Dimension Business Key.
- Staging Column. Select the Staging column that maps to the Dimension Business Key.
- Expression and … Button. Define an expression using Staging columns that maps to the Dimension Business Key. The expression language is the SSIS Expression Language.
- Join On. In some cases it is necessary to define a Join On expression to join the Staging columns to the Dimension table columns. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name.
- Advanced Tab
- Filter. In some cases you might want to filter the Dimension prior to applying the lookup. Define the where clause for filtering here. When referencing Staging columns in a T-SQL Expression, use the following convention Source_System_Abbrev_ Table_Name.Column_Name.
- Dimension Lookup Tags. Define custom Meta Data for the Dimension Lookup.
Transformation Tab
The transformation tab allows you to map Staging table columns to Attributes during the transformation.
There are 2 ways to define the mapping:
- Use the Auto Match link below the bottom right of the table to automatically match Staging to Target Attributes based on Name.
- Select the Mapping manually.
- Double click a row or click Edit Mapping button to change the mapping for the column, define an expression mapping, and add a description to the mapping for documentation purposes. Click the button and the Transform Mapping dialog appears.
- Staging Column Mapping. Select the Staging table and Staging Column mapping for the Target Attribute from the drop downs.
- Write an expression for the mapping. Note the expression can only reference Staging columns. The expression language is SSIS Expression Language.
- Transform Mapping Description. A description of the mapping for documentation purposes. Non Mandatory.
- Transform Mapping Tags. Define custom meta Data for the mapping.
Advanced Tab
- Execution Order. Define an Execution order for this Transform in its phase. If this Transform should be run after other Transforms then elevate its Execution order. This property is not currently used by the standard ETL generation templates. Execution order is determined by the Batch workflow document.
- Transform Tags. Define custom Design/Meta Data for the Transform.
- Auto Generate Key. This should remain true for Facts. If there is a case for a non-auto generated key for a fact, this can be set to false.