Shredding XML in SSIS

  • Here's my scenario. I have a table that looks something like this:

    CREATE TABLE [dbo].[Logs](

    [rowGUID] [uniqueidentifier] ROWGUIDCOL NOT NULL,

    [lngClassID] [int] NOT NULL,

    [DateTimeUTC] [varchar](20) NOT NULL,

    [lngSessionID] [int] NOT NULL,

    [ClientName] [varchar](255) NOT NULL,

    [ClientIP] [varchar](255) NOT NULL,

    [imgInfo] [image] NULL)

    That last column (imgInfo) actually contains XML that looks something like what follows this paragraph. I need to capture the XML log events as children of the Logs record above and store them in a separate table. Is there an easy way to do this with SSIS 2016? Presumably, I could write some sort of C# CLR function to shred this but I can't help but wonder if there is a simpler way.

    This is a representative sample. I'm going to need to do this a lot in my overall ETL solution for data produced by several apps and some tables with multiple XML columns. I'm looking for a strategy...any help is appreciated.

    <pllog>

    <log>

    <datetime>20151021183108</datetime>

    <ms>900</ms>

    <type>Info</type>

    <action>RES ONE Workspace initializing</action>

    <result>Version 9.10.0.1</result>

    </log>

    <log>

    <datetime>20151021183108</datetime>

    <ms>900</ms>

    <type>Info</type>

    <action>Session GUID</action>

    <result>{6E332C48-207F-4619-A25C-2B8716A19E31}</result>

    </log>

    <log>

    <datetime>20151021183109</datetime>

    <ms>134</ms>

    <type>Info</type>

    <action>TS client name</action>

    <result>LAPTOP-HEV</result>

    </log>

    </pllog>

    "Beliefs" get in the way of learning.

  • Robert Frasca (11/15/2016)


    Here's my scenario. I have a table that looks something like this:

    CREATE TABLE [dbo].[Logs](

    [rowGUID] [uniqueidentifier] ROWGUIDCOL NOT NULL,

    [lngClassID] [int] NOT NULL,

    [DateTimeUTC] [varchar](20) NOT NULL,

    [lngSessionID] [int] NOT NULL,

    [ClientName] [varchar](255) NOT NULL,

    [ClientIP] [varchar](255) NOT NULL,

    [imgInfo] [image] NULL)

    That last column (imgInfo) actually contains XML that looks something like what follows this paragraph. I need to capture the XML log events as children of the Logs record above and store them in a separate table. Is there an easy way to do this with SSIS 2016? Presumably, I could write some sort of C# CLR function to shred this but I can't help but wonder if there is a simpler way.

    This is a representative sample. I'm going to need to do this a lot in my overall ETL solution for data produced by several apps and some tables with multiple XML columns. I'm looking for a strategy...any help is appreciated.

    <pllog>

    <log>

    <datetime>20151021183108</datetime>

    <ms>900</ms>

    <type>Info</type>

    <action>RES ONE Workspace initializing</action>

    <result>Version 9.10.0.1</result>

    </log>

    <log>

    <datetime>20151021183108</datetime>

    <ms>900</ms>

    <type>Info</type>

    <action>Session GUID</action>

    <result>{6E332C48-207F-4619-A25C-2B8716A19E31}</result>

    </log>

    <log>

    <datetime>20151021183109</datetime>

    <ms>134</ms>

    <type>Info</type>

    <action>TS client name</action>

    <result>LAPTOP-HEV</result>

    </log>

    </pllog>

    As you already have the data stored in a SQL Server table, may I ask why you are considering using SSIS for this, rather than using straight XQuery in T-SQL?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I haven't ruled anything out but I'm not sure how Xquery can help me when I don't know what is in the XML at the time I'm trying to shred it. I also don't have much experience with Xquery so it's entirely possible that it IS the best solution in which case, that's the direction I'll go.

    As I said, the example I provided is a representative sample of logged events, <log> tags, but I do know that some of the other XML columns I'm going to need to shred may have more than one type of record, each with it's own set of columns.

    "Beliefs" get in the way of learning.

  • Robert Frasca (11/15/2016)


    I haven't ruled anything out but I'm not sure how Xquery can help me when I don't know what is in the XML at the time I'm trying to shred it. I also don't have much experience with Xquery so it's entirely possible that it IS the best solution in which case, that's the direction I'll go.

    As I said, the example I provided is a representative sample of logged events, <log> tags, but I do know that some of the other XML columns I'm going to need to shred may have more than one type of record, each with it's own set of columns.

    OK, but if you don't know what's going to be in the XML, what logic will you use to determine what to do with it, once it has been shredded?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • My objective is to shred it and store the data in tables dynamically. If that isn't possible then I guess I'll be forced to hard code for certain tags.

    "Beliefs" get in the way of learning.

  • Robert Frasca (11/15/2016)


    My objective is to shred it and store the data in tables dynamically. If that isn't possible then I guess I'll be forced to hard code for certain tags.

    OK, can we refine what you mean by 'dynamically'?

    Do you mean that you will have a child table containing a generic VARCHAR(8000) (or whatever) column, called XMLStuff (or whatever), which will contain multiple rows like

    ('ms: 900'), ('type: Info') etc etc?

    Or do you mean that you wish to dynamically create new columns (MS, Info, etc etc) in an existing table, as part of your ETL process?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Create a new table based on the node, with new columns based on the elements dynamically. In this example, a new table called Log with columns: datetime, ms, type, action, result.

    "Beliefs" get in the way of learning.

  • Quick XML XQuery shredding example

    😎

    USE TEEST;

    GO

    SET NOCOUNT ON;

    DECLARE @TXML XML = '<pllog>

    <log>

    <datetime>20151021183108</datetime>

    <ms>900</ms>

    <type>Info</type>

    <action>RES ONE Workspace initializing</action>

    <result>Version 9.10.0.1</result>

    </log>

    <log>

    <datetime>20151021183108</datetime>

    <ms>900</ms>

    <type>Info</type>

    <action>Session GUID</action>

    <result>{6E332C48-207F-4619-A25C-2B8716A19E31}</result>

    </log>

    <log>

    <datetime>20151021183109</datetime>

    <ms>134</ms>

    <type>Info</type>

    <action>TS client name</action>

    <result>LAPTOP-HEV</result>

    </log>

    </pllog>

    '

    SELECT

    XLOG.DATA.value('(datetime/text())[1]' ,'VARCHAR(50)' ) AS [datetime]

    ,XLOG.DATA.value('(ms/text())[1]' ,'INT' ) AS [ms]

    ,XLOG.DATA.value('(type/text())[1]' ,'VARCHAR(50)' ) AS [type]

    ,XLOG.DATA.value('(action/text())[1]' ,'VARCHAR(2000)') AS [action]

    ,XLOG.DATA.value('(result/text())[1]' ,'VARCHAR(100)' ) AS [result]

    FROM @TXML.nodes('pllog/log') XLOG(DATA)

    Output

    datetime ms type action result

    --------------- ---- ------ ------------------------------- ---------------------------------------

    2015102118310 900 Info RES ONE Workspace initializing Version 9.10.0.1

    2015102118310 900 Info Session GUID {6E332C48-207F-4619-A25C-2B8716A19E31}

    2015102118310 134 Info TS client name LAPTOP-HEV

  • Robert Frasca (11/15/2016)


    Create a new table based on the node, with new columns based on the elements dynamically. In this example, a new table called Log with columns: datetime, ms, type, action, result.

    May I ask how you would determine the datatypes to use for the columns?

    This is really quite difficult. Building permanent tables at runtime is not a course of action I would recommend at all. And think about how you will keep in sync with version control ...

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Eirikur Eiriksson (11/15/2016)


    Quick XML XQuery shredding example

    😎

    USE TEEST;

    GO

    SET NOCOUNT ON;

    DECLARE @TXML XML = '<pllog>

    <log>

    <datetime>20151021183108</datetime>

    <ms>900</ms>

    <type>Info</type>

    <action>RES ONE Workspace initializing</action>

    <result>Version 9.10.0.1</result>

    </log>

    <log>

    <datetime>20151021183108</datetime>

    <ms>900</ms>

    <type>Info</type>

    <action>Session GUID</action>

    <result>{6E332C48-207F-4619-A25C-2B8716A19E31}</result>

    </log>

    <log>

    <datetime>20151021183109</datetime>

    <ms>134</ms>

    <type>Info</type>

    <action>TS client name</action>

    <result>LAPTOP-HEV</result>

    </log>

    </pllog>

    '

    SELECT

    XLOG.DATA.value('(datetime/text())[1]' ,'VARCHAR(50)' ) AS [datetime]

    ,XLOG.DATA.value('(ms/text())[1]' ,'INT' ) AS [ms]

    ,XLOG.DATA.value('(type/text())[1]' ,'VARCHAR(50)' ) AS [type]

    ,XLOG.DATA.value('(action/text())[1]' ,'VARCHAR(2000)') AS [action]

    ,XLOG.DATA.value('(result/text())[1]' ,'VARCHAR(100)' ) AS [result]

    FROM @TXML.nodes('pllog/log') XLOG(DATA)

    Output

    datetime ms type action result

    --------------- ---- ------ ------------------------------- ---------------------------------------

    2015102118310 900 Info RES ONE Workspace initializing Version 9.10.0.1

    2015102118310 900 Info Session GUID {6E332C48-207F-4619-A25C-2B8716A19E31}

    2015102118310 134 Info TS client name LAPTOP-HEV

    Thanks for the example. I think I can find a way to leverage this for examples where I'm familiar with the Node layout as in the example I provided. It's a start...

    "Beliefs" get in the way of learning.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply