Auditing in SSIS can be a real headache. In this two-part blog series I am going to try to make it a little easier. You can also download my white paper and sample files on auditing SSIS here, and you can see this post and more at SQL University.
SQL Server Integration Services (SSIS) is a powerful tool used to move data. Once you have created several SSIS packages and scheduled them to run in your production system, you will inevitably have failures at some point. Having a robust auditing framework will make troubleshooting your packages and tracking their performance much easier.
Imagine you have dozens of SSIS packages that run in SQL Agent jobs throughout the night in your production system. When you arrive at work the next morning, you find that data is missing from reports that are critical to the business. It is your job to find out why. First you need to identify the package that loads this data. Next you need to find the error (if there is one) or any other issue with the package or packages.
Figure 2
The first stop for most SSIS troubleshooters is the SQL Agent job log. In this example, all of the packages ran with no errors last night. You can see this job history in Figure 2. What now? If you have logging or some form of detailed auditing on your packages, you can track down the issue. In this example, the package loaded no rows because of a WHERE clause in a Data Flow source, so there was no error to find. This problem would be obvious if you had auditing on the row counts of your Data Flows.
Figure 3
With packages spread across multiple servers, having all of the auditing data in one location makes tracking SSIS package issues and performance easier. A centralized auditing framework logs the run-time information of packages from each of your servers into a central database that holds all of the auditing data. In Figure 3 you can see an example of this type of setup. Several methods exist for auditing SSIS, including the native logging feature and custom logging frameworks.
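As a rough sketch of what such a central store might look like, the table below captures the kind of run-time information described above. The table and column names here are illustrative assumptions, not part of any native SSIS feature:

```sql
-- Hypothetical central audit table; all names are assumptions for illustration.
CREATE TABLE dbo.PackageAudit
(
    AuditID      INT IDENTITY(1,1) PRIMARY KEY,
    ServerName   NVARCHAR(128) NOT NULL,  -- which server ran the package
    PackageName  NVARCHAR(260) NOT NULL,
    StartTime    DATETIME      NOT NULL,
    EndTime      DATETIME      NULL,
    RowsLoaded   INT           NULL,      -- row counts captured from the Data Flows
    ErrorMessage NVARCHAR(MAX) NULL       -- populated when a package fails
);
```

A table like this would let you answer both troubleshooting questions (which package failed, with what error) and the zero-row scenario above (RowsLoaded = 0 with no error) from one place.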
1.1 Logging
The native logging feature in SSIS can write information about the package run to several locations. To open the logging screen, right-click in the control flow of an SSIS package and select Logging. The logging menu can be seen in Figure 4.
Figure 4
Logging to Files
Once in the Logging window, place a check next to the package name in the top left. Select the SSIS log provider for Text files and click Add. Then place a check next to the newly added log provider. Under Configuration, select a file name where the logging information will be saved. See Figure 5 for an example of this setup.
Figure 5
Under the Details tab you can select which events you would like to log. Select the following events: OnError, OnWarning, OnPostExecute, and OnPreExecute. These events also call any tasks you have added to the corresponding event handler window. For example, when a task runs in the Control Flow, the OnPreExecute event is raised; if you have placed an Execute SQL Task in the OnPreExecute event handler window, that Execute SQL Task will run. If the Control Flow task causes an error, the OnError event is raised and executes any tasks under the OnError event handler. These are the most common events logged for packages. They tell you about errors, warnings, start times, and stop times.
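For example, an Execute SQL Task placed in the OnPreExecute event handler might run a statement like the following to record the start of each task. The audit table here is hypothetical, and the `?` placeholders would be mapped to the System::PackageName, System::SourceName, and System::StartTime variables on the task's Parameter Mapping tab:

```sql
-- Hypothetical audit insert run by an Execute SQL Task in the OnPreExecute
-- event handler; table and column names are illustrative assumptions.
INSERT INTO dbo.PackageAudit (PackageName, TaskName, StartTime)
VALUES (?, ?, ?);
```

A matching Execute SQL Task in OnPostExecute or OnError could then update the same row with an end time or error message.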
The Advanced button at the bottom of the Details tab allows you to select which pieces of information are collected for each event. Leave the advanced settings at their defaults for this example.
Figure 6
Once you have configured text file logging for the package, run the package once by right-clicking the package name in Solution Explorer and selecting Execute Package, and then open the log file.
Note: There is a Reset package in the solution that you can run to truncate the tables and move the file back.
In Figure 7 you can see a small extract from the log file. This file is approximately 550 lines of text, which is a lot of information about a small package, and much of it is repeated. Some of this repetition is due to how the event handlers are fired in SSIS: some events fire multiple times, once for the package and once for each task. This makes the log cumbersome to read and makes it hard to find the important information you are looking for in a logging solution.
Figure 7
Since this log file is a CSV, you can open it in Excel. A major problem occurs when trying to read through the error messages in Excel: some SSIS errors contain commas in their descriptions, which breaks the error message across separate columns. Trying to load this file into a table or some other data-driven tool would be problematic at best. If you want to log the information to a table, you should select the SQL Server log provider in the logging options in the first place.
Logging to SQL Server
Logging to a SQL Server table gives you the querying power of the database engine. To set this up, go back to the logging menu by right-clicking in the control flow of the package and selecting Logging. Delete the text file logging if it still exists. Once in the Logging window, place a check next to the package name in the top left. Select the SSIS log provider for SQL Server and click Add. Then place a check next to the newly added log provider. Under Configuration, select the database where the logging information will be saved. This example will log to a database named Testing.
The same logging details and options exist as in the text file logging example above. Select the following events: OnError, OnWarning, OnPostExecute, and OnPreExecute. Now run the package. After the package has completed, open SQL Server Management Studio (SSMS) and run the following query in the Testing database selected in the logging options.
SELECT * FROM dbo.sysssislog
This will return the logging data. The sysssislog table is found in the System Tables folder of the database. Now that the data is in a table, you have much more control over how to display it. The issue still exists, however, that the messages (which are the information you need) are duplicated: each message is shown once for the package and once for the task that raised it. To control how often the data is written to the table, you will need to build a custom solution in the event handlers of the package.
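Short of a custom solution, one way to cut through the duplication is to query only the events you care about and collapse the repeated messages. A minimal example, assuming the default sysssislog columns:

```sql
-- Show each distinct error or warning message once,
-- along with the event type and the source that raised it.
SELECT DISTINCT event, source, message
FROM dbo.sysssislog
WHERE event IN ('OnError', 'OnWarning');
```

This will not tell you everything a custom event handler framework can (such as row counts), but it makes the native log far easier to scan after a failed run.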