My name is DJ Maley and I’m going to be contributing to this blog more from a Microsoft BI perspective and how to utilize their BI stack for your organization. Some of these posts will be dealing with the reporting tools that integrate very well with SharePoint and others will be dealing with strictly more BI related stuff from a data warehousing perspective. This first post is dealing with a common problem of how to setup your SSIS ETL packages so you can edit them without having to directly edit the package by using package configurations.
Microsoft provides multiple ways to store information about properties and objects associated with your SSIS ETL packages through package configurations. This provides a way to edit objects inside the packages, without actually opening them up and saving them. It also enables a way to dynamically change objects depending on the environment and settings used to help migrating ETL through the development process. Using the wizard, there are 5 general configuration types we can choose from:
- XML configuration file
- Environment variable
- Registry entry
- Parent package variable
- SQL Server
I’m going to explain a little bit of each one and tell you one of my preferred ways for setting up your packages to enable seamless migration between environments.
XML Configuration file
- This stores any of your package information in the XML format in a physical file. The package will need access to this file, so ensure that you place it either on a common drive between the different environments or that you use an environment variable to point to where the file will be located.
- You can store a property of an object in an environment variable.
- Note: You can only store one property at a time with each Environment Variables, Registry Entries, and Parent Package variables.
- Storing the property within a registry entry for configuration
Parent Package Variable
- This relies on a master package calling the child package and passing a variable to it
- This stores any of your package information in an SSIS Configuration table. The user executing the package will need access to the database to read the configuration information.
Preferred methods for moving between different environments.
When setting up typical ETL migration environments, one typically has a Development box, QA, UAT, and Production. Sometimes these can be combined (such as Dev/QA or QA/UAT) but we need to build an infrastructure for moving packages between the environments without having to edit the packages. This is my preferred approach for setting up an easy to use infrastructure for ETL.
SQL Server + Indirect XML configuration file:
- Note: most of these steps are for getting started, if you already have your XML file and your environment variable pointing towards it, then you can skip most of these steps until the end.
- Primary connection – First we setup a primary connection for connecting to our SSIS Configuration table. When first setting this up, we will be on our development machine, so we will want to connect to the development environment database that will be hosting our SSIS Configuration table. For this example, our server will be called DevServer and our database will be called ETL_Admin.
- Next rename the connection. This connection can be named “Primary” or “SSIS_Config” or whatever you prefer, but the primary purpose of this connection is to just make sure it’s connecting to the SSIS Configuration table.
- After this, we are going to enable package configurations by selecting the checkbox
- Now we are going to add a configuration, and make the configuration type an XML configuration file. Specify a directory where you are going to access it. You can put it wherever because it will probably change on different environments and this is what the environment variable will be for later. For my example, I’m putting it in C:\SSIS_Config\ and naming it SSIS_Config.dtsConfig.
Now scroll up to the Connection Managers category and expand your Primary connection properties. This is where you will select all of the different properties you want to store in your connection such as ConnectionString. Now name your connection and click finish.
- This step stored all of the properties in the file you specified in the last step, now we are going to point to that file, and change it for all of our environments.
Next create an environment variable on all machines and servers that will be working with the development/migration process. We can name this ETL_Primary, or SSIS_Config, or whatever you need to make sure you know what it is when looking for it. You are going to change the value of this environment variable depending on where you are storing your XML file. For example, I may have a different structure on the actual servers that is different from my machine, so it could look like this:
- My machine: C:\SSIS_Config\SSIS_Config.dtsConfig
- Dev Server: E:\SSIS_Config\SSIS_Config.dtsConfig
- QA Server: E:\SSIS_Config\SSIS_Config.dtsConfig
The next step involves copying this XML file to the different servers, and changing the configuration to fit that environment. So I will copy my SSIS_Config.dtsConfig verbatim from my machine to the Dev server, since it’s pointing at the same location. But for QA we may change the Data Source part of our connection string to point to:
- For QA: QAServer
- UAT: UATServer
- Prod: ProductionServer
- You can also use IP addresses as well
- Now after placing the XML configuration files with the respective changes into each environment, and after creating an environment variable in each environment to point towards it, we need to restart the machines for the environment variables to show up.
- After the machines have been restarted, we can go back into our Package Configurations Organizer, and edit the previous configuration we made. Now that we have an environment variable to point to, we can choose it from the drop down below.
- We now have a dynamic primary connection that will find the correct SSIS Configuration table depending on which environment it is running in. Now we can start using SQL Server connections for storing all of the rest of our configuration information, and we only have to worry about migrating changes between environments through SQL scripts without having to edit packages directly.
For a breakdown of how this is happening, I made a diagram below.
Test.dtsx wants to load into the DataWarehouse database.
When running it in Development, it wants to use the server DevServer.
When running it in QA, it wants to use the server QAServer.