SQL: Modify XML values identified through cross apply

  • I've got a data issue with some values stored in an XML column in a database. I've reproduced the problem as the following example:

    Setup Script:

    create table XMLTest

    (

    [XML] xml

    )

    --A row with two duff entries

    insert XMLTest values ('

    <root>

    <item>

    <flag>false</flag>

    <frac>0.5</frac>

    </item>

    <item>

    <flag>false</flag>

    <frac>0</frac>

    </item>

    <item>

    <flag>false</flag>

    <frac>0.5</frac>

    </item>

    <item>

    <flag>true</flag>

    <frac>0.5</frac>

    </item>

    </root>

    ')

    In the XML portion the incorrect entries are those with <flag>false</flag> and <frac>0.5</frac> as the value of flag should be true for non-zero frac values.

    The following SQL identifies the XML item nodes that require update:

    select

    i.query('.')

    from

    XMLTest

    cross apply xml.nodes('root/item[flag="false" and frac > 0]') x(i)

    I want to do an update to correct these nodes, but I don't see how to modify the item elements identified by a cross apply. I saw the update as looking something like this:

    update t

    set

    x.i.modify('replace value of (flag/text())[1] with "true"')

    from

    XMLTest t

    cross apply xml.nodes('root/item[flag="false" and frac > 0]') x(i)

    However this isn't working: I get the error "Incorrect syntax near 'modify'".

    Can this be done through this method?

    I know an alternative would be to do a string replace on the xml column, but I don't like that as being a bit unsubtle (and I'm not confident it wouldn't break things in my real-word problem)

  • I'm not sure if there's an easy way to do it using XML DML, but here's a slightly different approach: shred the xml and rebuild it

    ; WITH cte as

    (

    SELECT

    T.c.value('frac[1]','decimal(5,1)') frac

    FROM XMLTest

    CROSS APPLY

    [XML].nodes('root/item') T(c)

    )

    update XMLTest

    SET [XML] =

    (SELECT CASE WHEN frac = 0 THEN 'false' ELSE 'true' END AS 'flag',

    frac AS 'frac'

    FROM cte FOR XML PATH ('item'), ROOT('root') )



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • That's a nice idea and I'll have a think about it, however my real world example has these problem nodes buried quite deeply in a large xml document stored stored in the xml column.

  • jonegerton (7/22/2011)


    That's a nice idea and I'll have a think about it, however my real world example has these problem nodes buried quite deeply in a large xml document stored stored in the xml column.

    What do you consider a large XML file and how many levels are we talking about? Would it be possible to provide a basic example of the structure?

    Maybe there are other folks around with a different idea that might work better....



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • The problem is that "replace value of" can only operate on one node at a time, so you need to either look at FLWOR http://msdn.microsoft.com/en-us/library/ms190945.aspx to rebuild the xml (ugh!) or use a loop.

    This is one way to "loop" it:

    SELECT 'Begin' -- this sets the intial rowcount to 1

    WHILE @@ROWCOUNT>0

    UPDATE XMLTest

    SET [XML].modify('replace value of (root/item[flag="false" and frac>0]/flag/text())[1] with "true"')

    WHERE [XML].exist('root/item[flag="false" and frac>0]')=1

    SELECT XMLTest.[XML]

    FROM XMLTest

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Nice solution, MM!

    Would it be possible to set the flag to either "true" or "false" depending on the value of frac or would this require a second update?



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • LutzM (7/23/2011)


    Nice solution, MM!

    Would it be possible to set the flag to either "true" or "false" depending on the value of frac or would this require a second update?

    Thanks, and yes you can do a conditional replace using

    replace value of (...) with ( if ( condition ) then "value1" else "value2" )

    but my initial tests show that it kills the query performance - and I mean KILLS it in this case because you are operating at the document level, so any conditions are full searches of the xml hierarchy again.

    There may be ways to do it without that problem, but I don't know them.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • mister.magoo (7/23/2011)


    ...

    Thanks, and yes you can do a conditional replace using

    replace value of (...) with ( if ( condition ) then "value1" else "value2" )

    but my initial tests show that it kills the query performance - and I mean KILLS it in this case because you are operating at the document level, so any conditions are full searches of the xml hierarchy again.

    There may be ways to do it without that problem, but I don't know them.

    I know of the conditional replace in general but I'm not sure whether it can be applied here...

    Maybe something like this (borrowing heavily from your very fine code)? It seems to work...

    SELECT 'Begin' -- this sets the intial rowcount to 1

    WHILE @@ROWCOUNT>0

    UPDATE XMLTest

    SET [XML].modify('replace value of (root/item[flag="false" and frac>0 or flag="true" and frac=0]/flag/text())[1] with

    ( if

    (root/item/flag/text()[1] ="false" and frac>0) then "true" else "false"

    )')

    WHERE [XML].exist('root/item[flag="false" and frac>0 or flag="true" and frac=0]')=1



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Hi Lutz,

    I personally think your if clause is not covering all the bases...I would do it like this:

    SELECT 'Begin' -- this sets the intial rowcount to 1

    WHILE @@ROWCOUNT>0

    UPDATE XMLTest

    SET [XML].modify('

    replace value of (root/item[(flag="false" and frac>0) or (flag="true" and frac=0)]/flag/text())[1]

    with

    (

    if ((root/item[(flag="false" and frac>0) or (flag="true" and frac=0)]/frac)[1] = "0") then

    "false"

    else

    "true"

    )')

    WHERE [XML].exist('root/item[(flag="false" and frac>0) or (flag="true" and frac=0)]')=1

    SELECT XMLTest.[XML]

    FROM XMLTest

    And it turns out it doesn't kill performance as much as I thought previously - I had made a typo that meant it was an infinite loop! - but it is still pretty bad - taking 14 seconds to process 9000 rows on my PC.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • What do you consider a large XML file and how many levels are we talking about? Would it be possible to provide a basic example of the structure?

    The XML I supplied is much simplified as the realworld version could be hundreds of lines. The problem nodes I have are 6 levels in, and could occur half a dozen times in a single xml field.

  • @mm.

    I think the answer you propose is the best I've seen so far.

    I like the suggestion of using FLWOR for this - may not be nicer but an interested for the purposes of learning. An ideas on how that might work?

  • jonegerton (7/25/2011)


    @MM.

    I think the answer you propose is the best I've seen so far.

    I like the suggestion of using FLWOR for this - may not be nicer but an interested for the purposes of learning. An ideas on how that might work?

    Sure, but the actual implementation would depend so much on the XML you are handling....

    Here is a very simple sample based on the root-item-flag/frac model...

    UPDATE dbo.XMLTest

    SET [XML] = [XML].query('

    for $r in root

    return

    <root>

    {

    for $f in $r/item

    return

    <item>

    <flag>

    {

    if (string($f/frac[1])="0")

    then "false"

    else "true"

    }

    </flag>

    <frac>{string($f/frac[1])}</frac>

    </item>

    }

    </root>

    ')

    And for this simple example it is actually more than twice as quick as the previous .modify method.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • I see - I can't use FLWOR just to substitute the problem area - I have to recreate from scratch? - As I mentioned earlier my actual problem XML is much bigger.

  • It really depends on the actual XML structure whether you can do what you want with FLWOR.

    As I said, this is a simple example, you can do more complicated work with it, but without a sample of what you are trying to do I can't really help any more.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

    Viewing 14 posts - 1 through 13 (of 13 total)

    You must be logged in to reply to this topic. Login to reply