XQuery vs. OPENXML

  • I didn't fully understand the implications of the use of XQuery. I just didn't see it as something for use immediately within TSQL. On a recent post on another topic, it was pointed out that my method of using OPENXML could more readily be answered using XQuery. So I decided to try it out. I'm positive I did something wrong. Using the code below, each query ran in about 1ms. Then I decided to look at the execution plans. So I ran the queries together just so I could compare one execution plan to the next. Try it. The XQuery seems to take 95% of the time for the two queries. I would have thought they had parity or even the XQuery would be less resources since it didn't require instantiating a memory space for shreding the XML, etc.

    What did I do wrong in my tests, specifically in the XQuery. I think that's where the issue lays because this is the first time I've used it extensively.

    Just a note, I replaced the normal xml tags with brackets so that the code would be readable.

    Thanks for the help.

    DECLARE @xml AS XML

    DECLARE @iHandle AS INT

    SET @xml = '[ROOT]

    [Customer]

    [CustomerID]VINET[/CustomerID]

    [ContactName]Paul Henriot[/ContactName]

    [Order OrderID="10248" CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00"]

    [OrderDetail ProductID="11" Quantity="12"/]

    [OrderDetail ProductID="42" Quantity="10"/]

    [/Order]

    [/Customer]

    [Customer]

    [CustomerID]LILAS[/CustomerID]

    [ContactName]Carlos Gonzlez[/ContactName]

    [Order OrderID="10283" CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00"]

    [OrderDetail ProductID="72" Quantity="3"/]

    [/Order]

    [/Customer]

    [Customer]

    [CustomerID]VIDA[/CustomerID]

    [ContactName]Juan Valdez[/ContactName]

    [Order OrderID="54321" CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00"]

    [OrderDetail ProductID="72" Quantity="3"/]

    [/Order]

    [/Customer]

    [Customer]

    [CustomerID]LOCA[/CustomerID]

    [ContactName]Benny Urquidez[/ContactName]

    [Order OrderID="12345" CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00"]

    [OrderDetail ProductID="72" Quantity="3"/]

    [/Order]

    [/Customer]

    [Customer]

    [CustomerID]LAMBA[/CustomerID]

    [ContactName]Juan Cole[/ContactName]

    [Order OrderID="55555" CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00"]

    [OrderDetail ProductID="72" Quantity="3"/]

    [/Order]

    [/Customer]

    [Customer]

    [CustomerID]ALPHA[/CustomerID]

    [ContactName]Victor Hanson[/ContactName]

    [Order OrderID="22222" CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00"]

    [OrderDetail ProductID="72" Quantity="3"/]

    [/Order]

    [/Customer]

    [Customer]

    [CustomerID]BETA[/CustomerID]

    [ContactName]Glenn Reynolds[/ContactName]

    [Order OrderID="11111" CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00"]

    [OrderDetail ProductID="72" Quantity="3"/]

    [/Order]

    [/Customer]

    [/ROOT]'

    EXEC sp_xml_preparedocument @iHandle OUTPUT, @xml

    SELECT * FROM OPENXML(@iHandle, '/ROOT/Customer/Order',2)

    WITH (ContactName VARCHAR(50) '@OrderID')

    exec sp_xml_removedocument @iHandle

    SELECT Ord.value('(@OrderID)','int')

    FROM @xml.nodes('/ROOT/Customer/Order') AS x(Ord)

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Please, a syntax check on the XQuery. It's my first time out of the gate with this and I'm not sure I'm avoiding stupidity successfully.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Bump.

    Surely someone has a comment?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • The following seems to work faster ( ~69% compared to 95%) even though it still isn't what i expected to see. My guess is that the query cost calculation does not seem to include memory allocation and shredding when using sp_xml_preparedocument?.

    SELECT Ord.value('(@OrderID)','int') FROM @xml.nodes('/ROOT/Customer/Order[@OrderID]') AS x(Ord)

    Note that this will omit order elements that didn't have OrderID attribute(which should never happen:)).

  • Thanks.

    Much closer to 50/50 which was my initial assumption of worst case (boy was I shocked). I think I'll expand the tests a bit & start looking at actual memory & I/O costs, etc.

    Thanks again for the help.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Well - i'm not sure I follow. XQUERY is a facility used to perform queries off XML data (whether it's stored in SQL Server, a free-standing file, XML stream, etc...)

    OPENXML allows for the opening of a XML file and use it as a data set in SQL Server.

    Two niggly issues:

    1.I'm not sure you're using XQUERY in EITHER example. You're not filtering anything, so what you're doing is technically an XPATH query (meaning - return every node of a certain kind, etc...). You haven't filtered anything.

    2. The 2 syntaxes are comparing apples and oranges. The OpenXML opens up a single file and allows you to use is as a data set. The xml.NODES syntax is designed to allow you to run queries against XML columns in a table. So - if you kept 200 xml documents in a single table as separate rows, you could run one query against ALL of them in on shot, and get the results back. Using OPENXML - you'd need a loop, some place to store the results of each run , etc.... VERY different animal in other words.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • All true.

    The reason I'm chasing this down is because on another thread, someone was asking how to parse an XML file out into it's component parts for loading into a table. I went with OPENXML. I was corrected by someone pointing out that I could, more easily, use XQuery. So, I start to experiment with it. At first blush, it looked better, a little cumbersome to use (XPath syntax is weird), but no worse than OPENXML. But the performance difference was a bit of a shock. This especially because while we use OPENXML, if not frequently, regularly, and it's always the slow point for performance. At this point, XQuery is making OPENXML look a bit speedy. At least for this application.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • The thing I'm finding interesting which seems to be throwing off the numbers is that the sp_xml_preparedocument and sp_xml_removedocument aren't part of the execution plan/effort. The second syntax has an implicit prepare (which according to what I'm seeing is 98% of the effort in query # 2).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I noticed something interesting: look at these two queries:

    SELECT Ord.value('(.)','char(5)') FROM @xml.nodes('/ROOT/Customer/CustomerID') AS x(Ord)

    SELECT Ord.value('(.)','char(5)') FROM @xml.nodes('/ROOT/Customer/CustomerID[text()]') AS x(Ord)

    The latter outperforms the former eventhough both return same # of rows. The only reason i can think of this is that the latter has a filter in xpath expr. This probably explains why one of the previous queries that used @OrderID in the xpath for the nodes gave better numbers. So use a filter where feasible.

  • I finally got around to testing some more. I created three copies of an XML structure with almost identical data (I changed some of the key values between the three sets). I then created three different queries. One that was regular OPENXML, one that was XQuery and one that was XQuery with filters. These scripts did inserts to tablels, shredding the XML out to normalized structures. I ran them several times using Profiler to capture performance.

    OPENXML ran in about 150ms

    XQuery with filters ran in about 200ms

    XQuery without filters ran in about 250ms

    I'm still surprised by this outcome. It's a small XML set, so I'd think the XPath queries would run faster than OPENXML but it doesn't seem to be true.

    I'm going to write it all up as an article & submit it to Steve. Then, people either replicate my tests or point out where I went wrong.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Before doing so I would recommend you review :

    Referencing parent nodes in XQuery is EXPENSSIVE

    Cheers,


    * Noel

  • Excellent. Thanks. I'll work it in (if I still do it).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • HI, im currently working on importing an XML document using

    exec sp_xml_preparedocument @idoc output, @xml_in

    what i want to do is to be able to pass the @xml_in a file path name like so

    create proc test_xml ( @xml_in text)

    as begin

    declare @idoc int

    exec sp_xml_preparedocument @idoc output, @xml_in

    end

    exec test_xml 'C:\WORK_FOLDER\CUSTOMER_WORK\somefile.xml'

    when i run the exec i get an error

    The XML parse error 0xc00ce556 occurred on line number 1, near the XML text "C:\WORK_FOLDER\CUSTOMER_WORK\somefile.xml".

    is it possible to pass an xml file location????

    ***The first step is always the hardest *******

  • This should be a new post, not an addition to a post from 2007 on a somewhat different subject. But, to answer your question, no, that's not what that procedure does. The parameter is the XML, not a file path. You would need to pull the XML into the database engine first. OpenRowset can do that, using the Bulk options.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks will try that

    ***The first step is always the hardest *******

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply