How do you map to rows inside different branches of same XML document using OPENXML rowset function?

  • I'm able to map to one hierarchy of an XML document at a time, but I need to retrieve data sitting in other hierarchies of the same document.

    In the XML below

    1. Root is Joblog

    2. Header node contains server, name, start_time, type, log_name

    3. Media_drive_and_media_info node contains it's own set of children of which I am interested in (for example) media_mount_date.

    But, with the below query, only header node info is retrieved

    USE Sandbox

    GO

    declare @dochandle as int;

    declare @xmldocument as nvarchar(max);

    set @xmldocument = '

    <joblog>

    <job_log_version version="2.0" />

    <header>

    <filler />

    <server>Job server: BUMMERMMS0

    </server>

    <name>Job name: BUMMERBUILDS\DAILYBUILDS D:-Normal Weekly Backup Policy-Weekly - FULL To LTO3

    </name>

    <start_time>Job started: Wednesday, September 04, 2013 at 9:17:28 AM

    </start_time>

    <type>Job type: Backup

    </type>

    <log_name>Job Log: BEX_BUMMERMMS0_00004.xml

    </log_name>

    <filler />

    </header>

    <media_mount_date>Drive and media mount requested: 9/4/2013 9:17:28 AM</media_mount_date>

    <media_drive_and_media_info>

    <media_mount_date>Drive and media information from media mount: 9/4/2013 9:17:38 AM</media_mount_date>

    <drive_name>Drive Name: BUMMERMMS0-B2D</drive_name>

    <media_label>Media Label: B2D006436</media_label>

    <media_guid>Media GUID: {0c9fa835-f3dc-42fe-a7e6-9894842919c9}</media_guid>

    <media_overwrite_date>Overwrite Protected Until: 12/30/9999 4:00:00 PM</media_overwrite_date>

    <media_append_date>Appendable Until: 12/30/9999 4:00:00 PM</media_append_date>

    <media_set_target>Targeted Media Set Name: WEEKLY - FULL</media_set_target>

    </media_drive_and_media_info>

    </joblog>';

    EXEC sp_xml_preparedocument @dochandle OUTPUT, @xmldocument;

    SELECT

    server,

    name,

    start_time,

    type,

    log_name,

    media_mount_date

    FROM OPENXML(@dochandle, 'joblog/header', 1)

    WITH

    (

    server [varchar](20) 'server',

    name [varchar](300) 'name',

    start_time [varchar](100) 'start_time',

    type varchar(20) 'type',

    log_name varchar(100) 'log_name',

    media_mount_date varchar(100) 'media_mount_date'

    )

    EXEC sp_xml_removedocument @dochandle

    GO

    To get both Header and Media_drive_and_media_info info I have tried CROSS APPLY without success:

    SELECT

    server,

    name,

    start_time,

    type,

    log_name,

    media_mount_date

    FROM OPENXML(@dochandle, 'joblog/header', 1)

    CROSS APPLY OPENXML (@dochandle, 'joblog', 1)

    WITH

    (

    server [varchar](20) 'server',

    name [varchar](300) 'name',

    start_time [varchar](100) 'start_time',

    type varchar(20) 'type',

    log_name varchar(100) 'log_name',

    media_mount_date varchar(100) 'media_mount_date'

    )

    The CROSS APPLY attempt retrieved only the media_mount_date with NULLs for all preceding columns.

    Can you help me rewrite the query so that I can I traverse up and down the nodes hierarchies to get ALL data extracted?

    Thanks.

    --Quote me

  • Here's a slightly different approach without the OPENXML approach (I, personally, find OPENXML more complicted and it seems to be less efficient in most cases):

    SELECT

    TempXML1.Node1.value('(server/text())[1]', 'varchar(50)') AS [server],

    TempXML1.Node1.value('(name/text())[1]', 'varchar(50)') AS [name],

    TempXML1.Node1.value('(start_time/text())[1]', 'varchar(50)') AS [start_time],

    TempXML1.Node1.value('(type/text())[1]', 'varchar(50)') AS [type],

    TempXML1.Node1.value('(log_name/text())[1]', 'varchar(50)') AS [log_name],

    TempXML2.Node2.value('(text())[1]', 'varchar(50)') AS myname ,

    TempXML3.Node3.value('(media_mount_date/text())[1]', 'varchar(50)') AS media_mount_date,

    TempXML3.Node3.value('(drive_name/text())[1]', 'varchar(50)') AS drive_name,

    TempXML3.Node3.value('(media_label/text())[1]', 'varchar(50)') AS media_label,

    TempXML3.Node3.value('(media_guid/text())[1]', 'varchar(50)') AS media_guid,

    TempXML3.Node3.value('(media_overwrite_date/text())[1]', 'varchar(50)') AS media_overwrite_date,

    TempXML3.Node3.value('(media_append_date/text())[1]', 'varchar(50)') AS media_append_date,

    TempXML3.Node3.value('(media_set_target/text())[1]', 'varchar(50)') AS media_set_target

    FROM @xmldocument.nodes('/joblog') TempXML (Node)

    OUTER APPLY TempXML.Node.nodes('header') AS TempXML1(Node1)

    OUTER APPLY TempXML.Node.nodes('media_mount_date') AS TempXML2(Node2)

    OUTER APPLY TempXML.Node.nodes('media_drive_and_media_info') AS TempXML3(Node3)



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • This is what you are looking for

    EXEC sp_xml_preparedocument @dochandle OUTPUT, @xmldocument;

    SELECT

    server,

    name,

    start_time,

    type,

    log_name,

    media_mount_date,

    drive_name,

    media_label,

    media_guid,

    media_overwrite_date,

    media_append_date,

    media_set_target

    FROM OPENXML(@dochandle, 'joblog', 1)

    WITH

    (

    server [varchar](20) 'header/server/text()',

    name [varchar](300) 'header/name/text()',

    start_time [varchar](100) 'header/start_time/text()',

    type varchar(20) 'header/type/text()',

    log_name varchar(100) 'header/log_name/text()',

    media_mount_date varchar(100) 'media_mount_date/text()',

    -- some others you may be interested in...

    drive_name varchar(100) 'media_drive_and_media_info/drive_name/text()',

    media_label varchar(100) 'media_drive_and_media_info/media_label/text()',

    media_guid varchar(100) 'media_drive_and_media_info/media_guid/text()',

    media_overwrite_date varchar(100) 'media_drive_and_media_info/media_overwrite_date/text()',

    media_append_date varchar(100) 'media_drive_and_media_info/media_append_date/text()',

    media_set_target varchar(100) 'media_drive_and_media_info/media_set_target/text()'

    )

    EXEC sp_xml_removedocument @dochandle;

    Note that I included the "text()" node... This is a good "best practice" for when your are dealing exclusively with atomic text values.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • LutzM (10/7/2013)


    Here's a slightly different approach without the OPENXML approach (I, personally, find OPENXML more complicted and it seems to be less efficient in most cases):

    SELECT

    TempXML1.Node1.value('(server/text())[1]', 'varchar(50)') AS [server],

    TempXML1.Node1.value('(name/text())[1]', 'varchar(50)') AS [name],

    TempXML1.Node1.value('(start_time/text())[1]', 'varchar(50)') AS [start_time],

    TempXML1.Node1.value('(type/text())[1]', 'varchar(50)') AS [type],

    TempXML1.Node1.value('(log_name/text())[1]', 'varchar(50)') AS [log_name],

    TempXML2.Node2.value('(text())[1]', 'varchar(50)') AS myname ,

    TempXML3.Node3.value('(media_mount_date/text())[1]', 'varchar(50)') AS media_mount_date,

    TempXML3.Node3.value('(drive_name/text())[1]', 'varchar(50)') AS drive_name,

    TempXML3.Node3.value('(media_label/text())[1]', 'varchar(50)') AS media_label,

    TempXML3.Node3.value('(media_guid/text())[1]', 'varchar(50)') AS media_guid,

    TempXML3.Node3.value('(media_overwrite_date/text())[1]', 'varchar(50)') AS media_overwrite_date,

    TempXML3.Node3.value('(media_append_date/text())[1]', 'varchar(50)') AS media_append_date,

    TempXML3.Node3.value('(media_set_target/text())[1]', 'varchar(50)') AS media_set_target

    FROM @xmldocument.nodes('/joblog') TempXML (Node)

    OUTER APPLY TempXML.Node.nodes('header') AS TempXML1(Node1)

    OUTER APPLY TempXML.Node.nodes('media_mount_date') AS TempXML2(Node2)

    OUTER APPLY TempXML.Node.nodes('media_drive_and_media_info') AS TempXML3(Node3)

    No need for all those OUTER APPLIES, you could just do this:

    SELECT

    TempXML.Node.value('(header/server/text())[1]', 'varchar(50)') AS [server1],

    TempXML.Node.value('(header/server/text())[1]', 'varchar(50)') AS [server],

    TempXML.Node.value('(header/name/text())[1]', 'varchar(50)') AS [name],

    TempXML.Node.value('(header/start_time/text())[1]', 'varchar(50)') AS [start_time],

    TempXML.Node.value('(header/type/text())[1]', 'varchar(50)') AS [type],

    TempXML.Node.value('(header/log_name/text())[1]', 'varchar(50)') AS [log_name],

    TempXML.Node.value('(media_mount_date/text())[1]', 'varchar(50)') AS [media_mount_date],

    TempXML.Node.value('(media_drive_and_media_info/drive_name/text())[1]', 'varchar(50)') AS drive_name,

    TempXML.Node.value('(media_drive_and_media_info/media_label/text())[1]', 'varchar(50)') AS media_label,

    TempXML.Node.value('(media_drive_and_media_info/media_guid/text())[1]', 'varchar(50)') AS media_guid,

    TempXML.Node.value('(media_drive_and_media_info/media_overwrite_date/text())[1]', 'varchar(50)') AS media_overwrite_date,

    TempXML.Node.value('(media_drive_and_media_info/media_append_date/text())[1]', 'varchar(50)') AS media_append_date,

    TempXML.Node.value('(media_drive_and_media_info/media_set_target/text())[1]', 'varchar(50)') AS media_set_target

    FROM @xmldocument.nodes('/joblog') TempXML (Node)

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan and Lutz, very glad for reply. Thank you for showing me syntax that'll do the job. I will work on this tonight.

    --Quote me

  • @alan:

    as long as it is guaranteed there'll always be a single node for each and every node block referenced in the code, I agree that your code is more efficient.

    But how about having an unknown number of media represented by multiple nodes of media_drive_and_media_info?

    In such a scenario your code would just return the first node ignoring all following nodes, finally leading to a "wrong" result set.

    To me it seems like a classic example of "it depends" 😉

    If it can be guaranteed that each node will be unique, I'd recommend your solution. Otherwise I wouldn't. It depends. 😎



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply