The Dimodelo Data Warehouse Studio code generation process
Generation Components
Several components are involved in the Code Generation process.
- A Generation Service which drives the generation process.
- A Dimodelo_Meta_Data.xml file. A de-normalized view of the Data Warehouse design Meta Data stored in a Dimodelo Data Warehouse Studio project. Dimodelo_Meta_Data.xml makes creating Generation Templates far simpler.
- Generation Providers. Generation Providers are used by the Generation Service to do the actual transformation of Dimodelo Data Warehouse Studio Meta Data into Code. Dimodelo Data Warehouse Studio provides an API for Generation Providers, so Data Warehouse professionals can create their own Generation Providers.
- Generation Templates. Generation Templates are passed to the Generation Provider along with Dimodelo_Meta_Data.xml. The Generation Provider knows how to execute the Template to produce the intended Code. For example, an XSLT Generation Provider is passed an XSLT Generation Template and executes it with Dimodelo_Meta_Data.xml as the input to produce the output code file. Templates exist for each type of code object you wish to produce: SSIS Extract packages, SSIS Dimension Transform packages, Table DDL, etc.
The Generation Process
1. The User initiates the Generation process
A user can execute the generation process by selecting the Generation option on the Dimodelo Menu of Dimodelo Data Warehouse Studio.
2. Generate Dimodelo_Meta_Data.xml
Prior to executing the generation process, the Generation Service refreshes the Dimodelo_Meta_Data.xml file so that it contains the latest project meta data.
3. Load Generation Templates
The Generation Service first retrieves a list of Generation templates for the project. Each project has its own set of templates, which are usually stored in the ProjectTemplates folder of the project directory, although this can be configured using the ‘Generation Template path’ in the project Config file. Each Generation Template is described by a simple manifest file in the same directory (.mnf extension). E.g.
<?xml version="1.0"?>
<Generation_Template_Manifest>
  <Output_Relative_Location>SSIS_Project</Output_Relative_Location>
  <Template_Display_Name>Extract Procedures</Template_Display_Name>
  <Generation_Engine_Name>XSLT Generation Engine</Generation_Engine_Name>
  <Generate_For_Each>Staging/*.sg</Generate_For_Each>
  <Generation_Result_File_Name_Pattern>%docName%.dtsx</Generation_Result_File_Name_Pattern>
  <Template_Relative_Location>Extract_SSIS.xslt</Template_Relative_Location>
  <OperatesOn>Staging</OperatesOn>
  <Generates_For_Collection>
    <Generates_For_TargetType>Staging</Generates_For_TargetType>
  </Generates_For_Collection>
</Generation_Template_Manifest>
The manifest, amongst other things, contains the path to the actual Generation Template and designates which Generation Provider should be used to execute it.
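As an illustration only, the sketch below reads a template manifest of the shape shown above using Python's standard library. Dimodelo Data Warehouse Studio itself is not written in Python, and the function name read_template_manifest is hypothetical; the element names come from the example manifest.

```python
import xml.etree.ElementTree as ET

# Sample manifest text, matching the element names used in the article.
MANIFEST_XML = """<?xml version="1.0"?>
<Generation_Template_Manifest>
  <Output_Relative_Location>SSIS_Project</Output_Relative_Location>
  <Template_Display_Name>Extract Procedures</Template_Display_Name>
  <Generation_Engine_Name>XSLT Generation Engine</Generation_Engine_Name>
  <Generate_For_Each>Staging/*.sg</Generate_For_Each>
  <Generation_Result_File_Name_Pattern>%docName%.dtsx</Generation_Result_File_Name_Pattern>
  <Template_Relative_Location>Extract_SSIS.xslt</Template_Relative_Location>
  <OperatesOn>Staging</OperatesOn>
  <Generates_For_Collection>
    <Generates_For_TargetType>Staging</Generates_For_TargetType>
  </Generates_For_Collection>
</Generation_Template_Manifest>"""


def read_template_manifest(xml_text):
    """Return the simple (leaf) manifest settings as a dict of tag -> text.

    Hypothetical helper: nested elements such as Generates_For_Collection
    are skipped, and surrounding whitespace is trimmed from each value.
    """
    root = ET.fromstring(xml_text)
    return {
        child.tag: (child.text or "").strip()
        for child in root
        if len(child) == 0  # keep only elements without children
    }


manifest = read_template_manifest(MANIFEST_XML)
print(manifest["Template_Relative_Location"])  # path to the template file
print(manifest["Generation_Engine_Name"])      # provider that should run it
```

The two printed values are the ones the paragraph above calls out: where the template lives and which Generation Provider executes it.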
4. Call Generation Providers
The Generation Service calls the appropriate Generation Provider for each Generation Template, passing the Template, Template Manifest, Dimodelo_Meta_Data.xml and the active Project Configuration file. Each target environment will have its own Config file.
The Generation Service looks for Generation Providers at the ‘Generation Provider Path’ config variable. Generation Providers are described by simple manifest files in the same directory. E.g.
<?xml version="1.0" encoding="utf-8" ?>
<Generation_Engine_Manifest>
  <Generation_Engine_Name>XSLT Generation Engine</Generation_Engine_Name>
  <Generation_Engine_Class>com.dimodelo.generate.XSLTGenerationEngineProvider</Generation_Engine_Class>
  <Generation_Engine_Assembly>XSLTGenerationEngineProvider.dll</Generation_Engine_Assembly>
</Generation_Engine_Manifest>
The manifest contains the Generation Provider name, class and the dll that contains the class. This information is used by the service to invoke the provider.
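To make the lookup concrete, here is a rough Python sketch of how a service might match the engine name in a template manifest to a provider manifest. This is an assumption about the matching logic, not the product's actual implementation; index_providers and resolve_provider are hypothetical names, and the real service would go on to load the class from the named dll, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Provider manifest text, using the element names from the article.
PROVIDER_MANIFESTS = [
    """<Generation_Engine_Manifest>
  <Generation_Engine_Name>XSLT Generation Engine</Generation_Engine_Name>
  <Generation_Engine_Class>com.dimodelo.generate.XSLTGenerationEngineProvider</Generation_Engine_Class>
  <Generation_Engine_Assembly>XSLTGenerationEngineProvider.dll</Generation_Engine_Assembly>
</Generation_Engine_Manifest>""",
]


def index_providers(manifest_texts):
    """Map each provider name to its (class name, assembly file) pair."""
    providers = {}
    for text in manifest_texts:
        root = ET.fromstring(text)
        name = root.findtext("Generation_Engine_Name", default="").strip()
        class_name = root.findtext("Generation_Engine_Class", default="").strip()
        assembly = root.findtext("Generation_Engine_Assembly", default="").strip()
        providers[name] = (class_name, assembly)
    return providers


def resolve_provider(providers, engine_name):
    """Find the provider requested by a template manifest.

    The name is trimmed first (an assumed, defensive choice) so that
    stray whitespace in a manifest does not break the match.
    """
    key = engine_name.strip()
    if key not in providers:
        raise LookupError(f"No Generation Provider named {key!r}")
    return providers[key]


providers = index_providers(PROVIDER_MANIFESTS)
class_name, assembly = resolve_provider(providers, "XSLT Generation Engine")
print(class_name, assembly)
```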
5. Generate Code
The internal workings of each Generation Provider are specific to that provider. The only requirement is that it implements the prescribed Generation Provider interface and knows how to execute the given template. Currently there is an SSIS Package Generation Provider, a generic XSLT Generation Provider, and an SSIS Project Generation Provider.
A typical process flow of a provider follows:
Retrieve each of the files in the project that match the file pattern in the <Generate_For_Each> tag of the Template Manifest.
For each matching file
    Execute the Template, passing an identifier (usually the Unique Id of the Staging Table, Dimension or Fact)
    Add the output to a Generation Result array. The array contains an output document name and an output string which contains the result of the transformation. The document name is defined by the document name pattern in the <Generation_Result_File_Name_Pattern> tag of the Template Manifest.
End For
Return the Result array to the Generation Service.
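The flow above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: run_provider and execute_template are hypothetical names, %docName% is assumed to expand to the matched file's name without its extension, and the matched file path stands in for the Unique Id that a real provider would pass to the template.

```python
from pathlib import Path


def run_provider(project_dir, generate_for_each, name_pattern, execute_template):
    """Run a template once per matching project file.

    project_dir        -- root folder of the project
    generate_for_each  -- glob pattern from <Generate_For_Each>, e.g. "Staging/*.sg"
    name_pattern       -- pattern from <Generation_Result_File_Name_Pattern>
    execute_template   -- callable standing in for the provider's template
                          engine; takes the matched file, returns the output text
    Returns a list of (output document name, output string) pairs.
    """
    results = []
    for source in sorted(Path(project_dir).glob(generate_for_each)):
        # Assumption: %docName% is the source file name without extension.
        doc_name = name_pattern.replace("%docName%", source.stem)
        output = execute_template(source)
        results.append((doc_name, output))
    return results
```

With a project containing Staging/Customer.sg and the pattern %docName%.dtsx, this sketch would return one result named Customer.dtsx, paired with whatever text execute_template produced for that file.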
6. Save the Resulting Code
The Generation Service takes the Result array returned by the Generation Provider and saves each output item to a file named after the document name provided in the array. To determine the folder in which to write the file, the Generation Service combines the ‘Generation Output Path’ configuration value in the project Config file with the <Output_Relative_Location> of the Template manifest.
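A minimal Python sketch of this save step follows. The function name save_results is hypothetical, and the choices to create missing folders and write UTF-8 are assumptions made for the example rather than documented behaviour of the Generation Service.

```python
from pathlib import Path


def save_results(generation_output_path, output_relative_location, results):
    """Write each (document name, output string) pair to disk.

    The target folder is the 'Generation Output Path' config value joined
    with the manifest's <Output_Relative_Location>. Returns the paths written.
    """
    target_dir = Path(generation_output_path) / output_relative_location
    target_dir.mkdir(parents=True, exist_ok=True)  # assumed: create if missing
    written = []
    for doc_name, content in results:
        target = target_dir / doc_name
        target.write_text(content, encoding="utf-8")  # assumed encoding
        written.append(target)
    return written
```

For example, with an output path of C:\Generated (an invented value), a relative location of SSIS_Project, and a result named Customer.dtsx, the file would land at C:\Generated\SSIS_Project\Customer.dtsx.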
Dimodelo Config Files
A Dimodelo Data Warehouse Studio project can contain multiple Dimodelo Config files in the Config folder of the project. The intention is that each target environment will have its own Dimodelo Config file. The Dimodelo Config files contain the following information:
- Connection strings for Source systems, and the Staging and Data Warehouse databases.
- Where the generated code is written to.
- Where code is deployed to.
- Other custom Meta Data.
- Where Dimodelo Data Warehouse Studio finds Generation Templates, Generation Providers, Deploy Providers and Batch Task Providers.
Conclusion
Dimodelo Data Warehouse Studio has a very flexible and extensible generation architecture. While the predefined Generation Templates will suit most situations and skill levels, advanced users who want to implement their own code framework can create Generation Templates and Generation Providers to suit their needs. Shortly we will post an article describing how to create your own generation template for generating SSIS packages.