The format of sysssislog in SSIS 2005 and 2008 isn't great for human consumption, so what follows is an attempt to reformat it for easier digestion. It isn't perfect - and here's hoping that the new Denali (the next version of SQL Server) server-side toolset will improve the situation it as much as I think it will.
Raw sysssislog Table Listing
My difficulty is with the nature of the logging - it "bubbles up" the same message to each containing task. The result is that the same message is reported multiple times in the log, which doesn't make it very concise (for reading). There's no doubt that it does make it easier to see everything that happened to or within a specific task - but that's not typically how I read the logs. Issuing a simple query such as:SELECT id, event, starttime, source, message FROM sysssislog ORDER BY id
Results in a rowset like this (click to enlarge):
My difficulty with the above is that (in this case) there are two rows per "message" - and that's just a simple demo package. More typically, I've got five or more nested containers, resulting in a lot of repetition.
The Distilled Query
My desire was to distill the log records such that there was only one row per "message" instead of several. You can't simply group by the message, of course. You have to group only adjacent entries and also (preferably) still show the involvement of each level of the Task hierarchy, because that context information is still valuable.
What I've arrived at is the following, which will distill the last 1000 entries in the log into something less wordy. It's still not ideal - tasks executing in parallel get mashed together. Those of you who are experts in T-SQL and windowing functions will no doubt be able to optimize this. Please post a comment if you do!WITH
LOG_SUBSET AS
(
SELECT TOP 1000 id, REPLACE(event, 'User:', '') AS event, starttime, source, message
FROM sysssislog
ORDER BY id DESC
),
ORDERED_LOG_SUBSET AS
(
SELECT TOP 1000 id, event, starttime, source, message
FROM LOG_SUBSET
ORDER BY id
),
EVENTTYPED_LOG AS
(
SELECT id, event, starttime, source, message,
HASHBYTES('SHA1', CAST(starttime AS CHAR(20)) + event + message) AS eventtype_group_key
FROM ORDERED_LOG_SUBSET
),
CONTIGUOUS_EVENT_GROUPED_LOG AS
(
SELECT id, event, starttime, source, message, eventtype_group_key,
ROW_NUMBER() OVER (ORDER BY id) AS s1,
ROW_NUMBER() OVER (PARTITION BY eventtype_group_key ORDER BY id) AS s2,
ROW_NUMBER() OVER (ORDER BY id) - ROW_NUMBER() OVER (PARTITION BY eventtype_group_key ORDER BY id) AS sequence
FROM EVENTTYPED_LOG
),
GROUPED_LOG AS
(
SELECT id, event, starttime, source, message, eventtype_group_key, s1, s2, sequence,
HASHBYTES('SHA1', CAST(eventtype_group_key AS CHAR(36)) + CAST(sequence AS CHAR(10))) AS event_group
FROM CONTIGUOUS_EVENT_GROUPED_LOG
)SELECT MIN(id) AS id, event, starttime, message,
SUBSTRING(REPLACE(REPLACE(
(
SELECT ' < ' + source
FROM GROUPED_LOG AS Y
WHERE X.event_group = Y.event_group
GROUP BY source
ORDER BY MIN(id) DESC
FOR XML PATH('')
),
'<', '<'), '&', '&'), 3, 2000) AS source_list
FROM GROUPED_LOG AS X
GROUP BY event, starttime, message, event_group
ORDER BY MIN(id)
A sample of the resulting rowset (click to enlarge):
It's not fantastic, but it does improve my diagnostic tracing a little bit.
Thanks to information from Boneist and Tom Cooper on grouping, and Anith Sen on pivoting.