Shredding XML

  • Hi all,

    I'm trying to process some custom xml which I included in attachment, I have it like a single file and Id' like to break it up to <testcase> level, each <testcase> can have multiple <step_number>s.... with <action> and <expected results>.

    Do you think it's possible, really appreciate your help. I tried do simple XML query with xml field but hit some namespace problem.... + remember that's it's single field with all multi level/records. I don't mind to have them in single row, I will pivot them later...

    DECLARE @xml XML =

    ( SELECT * FROM OPENROWSET(BULK 'C:\Users\mtango\Downloads\xml.txt', SINGLE_BLOB) AS data)

    --SELECT @xml c1 INTO ttt

    --select c1 FROM ttt

    SELECT

    xmlField.value('(testsuite/testcase/name)[1]', 'nvarchar(max)') as name

    ,xmlField.value('(testsuite/testcase/summary)[1]', 'nvarchar(max)') as summary

    --....etc

    FROM ttt

  • Not sure what your problem is. I was able to run your code, with two modificiations. The column should be c1, and since the name is an attribute that should be @name.

    But I don't know what sort of result you are looking for.

    This is what I ran:

    DECLARE @xml XML =

    ( SELECT * FROM OPENROWSET(BULK 'C:\temp\test.xml', SINGLE_BLOB) AS data)

    SELECT @xml c1 INTO ttt

    select c1 FROM ttt

    SELECT

    c1.value('(testsuite/testcase/@name)[1]', 'nvarchar(max)') as name

    ,c1.value('(testsuite/testcase/summary)[1]', 'nvarchar(max)') as summary

    FROM ttt

    go

    DROP TABLE ttt

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Thanks, Erland for those fixes.

    I ran your version OK, but got nothing (NULL + NULL).

    my final task to produce something like this, but for first ran name + summary will be fine too. In presented sample we have 2 testcases.

    | name | summary | step_number| expectedresults |

    | Edit cured address during checkout| Editing cured address during ad hoc checkout | 1 | pass should be successfully logged |

    | Edit cured address during checkout| Editing cured address during ad hoc checkout | 2 | Quick shop page should be displayed |

  • And the XML document made it to the table?

    And more importantly, did you ran the code that I posted, with the XML file you submitted as C:\Temp\test.xml? Or did you translate my code to your actual table?

    I ran my test on SQL 2012, but I don't think the version of SQL Server should matter.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Sorry, Erland

    it works as designed, not sure what was wrong on my first run.

    Thanks much.

    Mario

    Now I'll try do some pivoting to look it right one testcase = 1 row, so all teststeps will belong to right testsuite.

  • Thanks again Erland for you help and jumpstart,

    I finished writing loop sp and dynamic sql that will go over that xml to produce what I need

    SELECT

    c1.value('(testsuite/testcase/@internalid)[1]', 'nvarchar(max)') as id,

    c1.value('(testsuite/testcase/@name)[1]', 'nvarchar(max)') as name

    ,c1.value('(testsuite/testcase/summary)[1]', 'nvarchar(max)') as summary

    ,c1.value('(testsuite/testcase/importance)[1]', 'nvarchar(max)') as importance

    ,c1.value('(testsuite/testcase/steps/step/step_number)[1]', 'nvarchar(max)') as step

    ,c1.value('(testsuite/testcase/steps/step/actions)[1]', 'nvarchar(max)') as stepaction

    ,c1.value('(testsuite/testcase/steps/step/expectedresults)[1]', 'nvarchar(max)') as expectedResult

    --,c1.value('(testsuite/testcase/@name)[2]', 'nvarchar(max)') as name2

    -- ,c1.value('(testsuite/testcase/summary)[2]', 'nvarchar(max)') as summary2

    FROM t2

  • Does anybody knows is it possible to break this sample file (attached in fist step) by whole node = testcase ?

    Thanks

    Mario

  • Is this what you are looking for:

    DECLARE @xml XML =

    ( SELECT * FROM OPENROWSET(BULK 'C:\temp\test.xml', SINGLE_BLOB) AS data)

    SELECT @xml c1 INTO ttt

    select c1 FROM ttt

    SELECT

    c.value('@internalid', 'nvarchar(max)') as id,

    c.value('@name', 'nvarchar(max)') as name

    ,c.value('(summary/text())[1]', 'nvarchar(max)') as summary

    ,c.value('(importance/text())[1]', 'nvarchar(max)') as importance

    ,c.value('(steps/step/step_number/text())[1]', 'nvarchar(max)') as step

    ,c.value('(steps/step/actions/text())[1]', 'nvarchar(max)') as stepaction

    ,c.value('(steps/step/expectedresults/text())[1]', 'nvarchar(max)') as expectedResult

    FROM ttt

    CROSS APPLY c1.nodes('/testsuite/testcase') AS T(c)

    go

    DROP TABLE ttt

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • big tack

  • Now expanded my test into xml with multiple <testsuit>s and ran into small issue, I have same attribute name <name> for <testsuit> node and <testcase> node so produce merged output from both on this statement:

    ....

    c.value('@name', 'nvarchar(max)') as name

    ....

    --- like on this excerpt

    <testsuite name="Edit cured">

    <node_order>0</node_order>

    <details />

    <testcase internalid="111" name="Edit cured address during checkout">

    <node_order>1</node_order>

    <externalid>2036</externalid>

    <version>1</version>

    .....

    Is there any way to put logic to process them separately

  • Indeed you can. You can have multiple CROSS APPLY on nodes that relate to each other:

    SELECT

    testsuite.c.value('@name', 'nvarchar(20)') as testsuitename,

    tescase.c.value('@internalid', 'int') as testcaseid,

    testcase.c.value('@name', 'nvarchar(20)') as testcasename

    FROM ttt

    CROSS APPLY c1.nodes('/testsuite') AS testuite(c)

    CROSS APPLY testsuite.c.nodes('testcase') AS testcase(c)

    I have changed the data types as you should be more judicious which types you have. nvarchar(MAX) has a higher overhead than regular nvarchar.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Erland Sommarskog (5/5/2015)


    Indeed you can. You can have multiple CROSS APPLY on nodes that relate to each other:

    SELECT

    testsuite.c.value('@name', 'nvarchar(20)') as testsuitename,

    tescase.c.value('@internalid', 'int') as testcaseid,

    testcase.c.value('@name', 'nvarchar(20)') as testcasename

    FROM ttt

    CROSS APPLY c1.nodes('/testsuite') AS testuite(c)

    CROSS APPLY testsuite.c.nodes('testcase') AS testcase(c)

    I have changed the data types as you should be more judicious which types you have. nvarchar(MAX) has a higher overhead than regular nvarchar.

    Minor correction to your code:

    SELECT

    testsuite.c.value('@name', 'nvarchar(20)') as testsuitename,

    testcase.c.value('@internalid', 'int') as testcaseid,

    testcase.c.value('@name', 'nvarchar(20)') as testcasename

    FROM ttt

    CROSS APPLY c1.nodes('/testsuite') AS testsuite(c)

    CROSS APPLY testsuite.c.nodes('testcase') AS testcase(c)

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Thanks much Erland and Alan !!!!!

    Best

    Mario

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply