What is a derived staging table? A derived staging table is a table created in your Staging database that derives its data from other already existing Staging tables. It’s useful for creating aggregations, or allocations of data prior to the transform and load process into your Data Warehouse. It can also be useful for tables that don’t have any other source other than an SQL Query which creates the data.
To create a Derived Staging table in Dimodelo Data Warehouse Studio, do the following:
- Create a Connection manager to your Staging Database. Name the Connection Manager anything other than ‘Staging’ which is reserved. We usually call it ‘Stage’.
- Create a Source System for your Staging Database. The Source System Abbreviation must be Derived. Again the Source system name can be anything but ‘Staging’.
- Create your Staging table, and choose the ‘Derived’ source system. This ensures the Staging table is generated with the Derived prefix in the Staging database. Also the SSIS package is prefixed with Extract_Derived instead of Extract.
- In the ETL Extract, set the source connection to the staging connection you just created, and define a query that selects data from one or more existing staging tables. The query contains the logic of aggregation, allocation etc. Tip. Give the Extract an Extract Name prefixed with Derived. The default ETL Batch Workflow contains a phase that executes after all the regular Staging table Extracts and processes all extracts that start with the name Derived.
The following video demonstrates using derived staging table to implement complex ETL logic in your data warehouse.