Help with a "simple" XML example, please

Question

Help with a "simple" XML example, please

Forum Etiquette: How to post Reporting Services problems

[/url]

Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
[/url]

How to Post Performance Problems - by Gail Shaw
[/url]

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 1

Evil Kraig F (11/9/2012)
Jeff Moden (11/9/2012)
Thanks for the help.
I looked up what the 3rd operand stands for in OPENXML and it appears to control whether you're using element or attribute based XML. In 2k5 BOL, the supposedly allowed values are 0, 1, 2, and 8. What does "3" mean?
Also, how do you know when to add the dots or not?
AARGH. Typo from stolen code that stayed behind. The '2' is correct for element centric mapping. I have absolutely no idea why the original had a 3, nor does the original post explain it. It's a useless attribute. I thought I'd edited that out too. Apologies for the confusion, they're both supposed to be 2's.
The reason you need the '.' is whenever you have the same element repeated within a node and expect multiple rows from it. Basically, without the '.', the repeated elements are treated as an error, basically the same column name twice. With the '.', you're telling the XML parser that these are multiple rows of the same element name. Normally, you'd be one node up with the DueDate being an attribute (or series of attributes/elements) belonging to the client(s), and have it as a specific element.
All the mapping does is inform the parser that this isn't an error, they're supposed to repeat.

BWAAAA-HAAA!!! Holy shades of DOS paths! 😛 I get it. Thanks, Craig.

I also would like to second Mr. Magoo above. If anyone deserves a bit of help on an issue around here, you're on the short list.

Again, truly humbled that someone like you would say such a thing. Thank you very much for the help and all that you've done to help other folks. You and Magoo are certainly dedicated to the community and that put's you on a short list of my own.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 2

mister.magoo (11/9/2012)
Ha!, you were just pretending not to know all along! 😀

BWAAA-HAAA!!! I guess that means that I did something right here but I really thought I was missing something because it just seemed too bloody complex for the simple results I was trying to achieve. I was pretty sure I was missing something because, like the old adage goes, "you don't know what you don't know."

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 3

Eugene Elutin (11/9/2012)
Given XML sample is a bit strange.

I'll be the first to agree with THAT! 😀 My boss is the one that wrote it. I like him but he's fallen in love with XML for simple configuration controls such as the XML I've posted. I'll take a nice EAV for such controls over XML any day especially since it wouldn't require me to update hard-coded dates in a string literal on steroids. :sick:

Having <Payees> parent node kind of suggest that it may contain multiple <Payee> nodes, but <DueDates> are outside of <Payee>. So it does look like "cross join" : set of Due Dates is relevant for the set of Payees. Is it right?
Or you will have only one single Payee in Payees (the parent node may be just added for convenience of client/server coding).

The anticipation for multiple payees is all mine. I try to bullet-proof code by considering some of the future requirements that aren't apparent to may at design time.

Also, you have presented single XML value in a table. Does it mean that you need to process multiple records at time? Or you need to process one single XML value (eg. input parameter in stored proc)
Why is it important?
Because when you work with XML in T-SQL it is better to stick with minimal required (optimal) processing. Creating "generic" solution may significantly effect performance.

In this case, think of it as XML RBAR that will be used for batch configuration control. However, I'll also try this with multiple rows in a table because I can see a use for that.

As a side bar, thanks for helping. You're another one of those people that have gone out of their way to help others and it's a pleasure to "know" you.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

GSquared SSC Guru Points: 260824 More actions · Answer 4

Jeff, if it helps, here's a version that looks a little more "SQLish". It does essentially the same thing as Magoo's code, but uses subqueries as an intermediate that more closely follows the usual SQL Select paradigm in format and feel.

SELECT PayeeXML.value('(/Payee/EDICompanyQualifier/text())[1]', 'char(2)') AS EDICompanyQualifier,

PayeeXML.value('(/Payee/EDICompanyCode/text())[1]', 'int') AS EDICompanyCode,

DueDateXML.value('(/DueDate/text())[1]', 'datetime') AS DueDate

FROM (SELECT R.Payees.query('.') AS PayeeXML

FROM #MyHead

CROSS APPLY SomeXML.nodes('RootA/Payees/Payee') AS R (Payees)) AS Payees -- Standard Subquery 1

CROSS JOIN (SELECT R.DueDates.query('.') AS DueDateXML

FROM #MyHead

CROSS APPLY SomeXML.nodes('RootA/DueDates/DueDate') AS R (DueDates)) AS DueDates;-- Standard Subquery 2

If you actually have more than one RootA node per entry (well-formed XML shouldn't have that, it should have a single root node), then you'd need to query that first (like Magoo's does), and then parse each of those into it's own sub-query (Cross Apply). But well-formed XML shouldn't require that step. Doesn't mean you won't have to deal with poorly-formed XML.

Just posting this as a bridge concept between what you're used to seeing in SQL queries and what you're digging into now with XQuery. I've found bridging concepts like this help a lot of DBAs begin to wrap their heads around the weird world of XML after years of SQL.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 5

mister.magoo (11/9/2012)
Ha!, you were just pretending not to know all along! 😀

As a bit of a side bar, I think I'm really bad at XML and, for some of the personal reasons I mentioned previously, have a little hatred for XML. That predisposed hatred has actually gotten in the way of me learning it like I probably should.

The funny part is, hatred can turn on innovation. I've actually written some T-SQL that will write the necessary XML to automatically and without any prior knowledge of the structure, flatten well formed element, attribute, and hybrid XML by interrogating the underlying Adjacency List formed in the "edge" table that OPENXML makes. It's one of those things that I thought XML would have a "command" for doing but never found a decent one. The code it makes is only slightly more complex than the code I posted.

Matt Miller found an "exception" that the code doesn't handle that I still need to fix or I'd post it right here.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

mister.magoo SSC-Forever Points: 47174 More actions · Answer 6

Jeff Moden (11/9/2012)
mister.magoo (11/9/2012)
Ha!, you were just pretending not to know all along! 😀
As a bit of a side bar, I think I'm really bad at XML and, for some of the personal reasons I mentioned previously, have a little hatred for XML. That predisposed hatred has actually gotten in the way of me learning it like I probably should.

Same with me and C# 😉

The funny part is, hatred can turn on innovation. I've actually written some T-SQL that will write the necessary XML to automatically and without any prior knowledge of the structure, flatten well formed element, attribute, and hybrid XML by interrogating the underlying Adjacency List formed in the "edge" table that OPENXML makes. It's one of those things that I thought XML would have a "command" for doing but never found a decent one. The code it makes is only slightly more complex than the code I posted.
Matt Miller found an "exception" that the code doesn't handle that I still need to fix or I'd post it right here.

Now, that sounds very interesting, but wait, what's that I hear....it's the hoardes of CLR enthusiasts waiting to pounce on you 😛

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 7

mister.magoo (11/9/2012)
Same with me and C# 😉
...{snip}...
Now, that sounds very interesting, but wait, what's that I hear....it's the hoardes of CLR enthusiasts waiting to pounce on you 😛

You're in good company. I don't even know how to spell "C#". 😀 I gave up on almost every form of procedural code way back in 2002 when the person I hired to replace me as Dev Manager (a terrible choice on my part) told me to write a "Caution" field with White lettering on a pale Yellow background. It's kind of a shame because I used to write some really cool stuff.

Oddly enough, I strongly embraced the idea of CLR when it first came out. I still think CLR is a great idea but have been soured against it by certain front-end developers that use it for silly stuff instead of taking 10 minutes to learn something in T-SQL. Don't know if you ever heard the story or not but I had a developer write a CLR to do a "modulus" function because he didn't think that T-SQL had one.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 8

GSquared (11/9/2012)
Jeff, if it helps, here's a version that looks a little more "SQLish". It does essentially the same thing as Magoo's code, but uses subqueries as an intermediate that more closely follows the usual SQL Select paradigm in format and feel.
SELECT PayeeXML.value('(/Payee/EDICompanyQualifier/text())[1]', 'char(2)') AS EDICompanyQualifier,
PayeeXML.value('(/Payee/EDICompanyCode/text())[1]', 'int') AS EDICompanyCode,
DueDateXML.value('(/DueDate/text())[1]', 'datetime') AS DueDate
FROM (SELECT R.Payees.query('.') AS PayeeXML
FROM #MyHead
CROSS APPLY SomeXML.nodes('RootA/Payees/Payee') AS R (Payees)) AS Payees -- Standard Subquery 1
CROSS JOIN (SELECT R.DueDates.query('.') AS DueDateXML
FROM #MyHead
CROSS APPLY SomeXML.nodes('RootA/DueDates/DueDate') AS R (DueDates)) AS DueDates;-- Standard Subquery 2
If you actually have more than one RootA node per entry (well-formed XML shouldn't have that, it should have a single root node), then you'd need to query that first (like Magoo's does), and then parse each of those into it's own sub-query (Cross Apply). But well-formed XML shouldn't require that step. Doesn't mean you won't have to deal with poorly-formed XML.
Just posting this as a bridge concept between what you're used to seeing in SQL queries and what you're digging into now with XQuery. I've found bridging concepts like this help a lot of DBAs begin to wrap their heads around the weird world of XML after years of SQL.

Man, the good folks are coming out of the woodwork! Thanks for the help, Gus!

I wonder what the performance implications would be on this since it's actually dipping the table twice. I'll add it to the list of things I need to test on this particular project.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

mister.magoo SSC-Forever Points: 47174 More actions · Answer 9

Jeff Moden (11/9/2012)
mister.magoo (11/9/2012)
Same with me and C# 😉
...{snip}...
Now, that sounds very interesting, but wait, what's that I hear....it's the hoardes of CLR enthusiasts waiting to pounce on you 😛
You're in good company. I don't even know how to spell "C#". 😀 I gave up on almost every form of procedural code way back in 2002 when the person I hired to replace me as Dev Manager (a terrible choice on my part) told me to write a "Caution" field with White lettering on a pale Yellow background. It's kind of a shame because I used to write some really cool stuff.
Oddly enough, I strongly embraced the idea of CLR when it first came out. I still think CLR is a great idea but have been soured against it by certain front-end developers that use it for silly stuff instead of taking 10 minutes to learn something in T-SQL. Don't know if you ever heard the story or not but I had a developer write a CLR to do a "modulus" function because he didn't think that T-SQL had one.

I know how that feels, once had a developer who re-wrote several stock functions (badly) because he couldn't be bothered researching the existing ones... but you can't blame the tool that it was written in, just the tool that wrote it 😛

For my part, I haven't learnt C# yet because I am sure it will be "revised" over and over to make it less "icky" - once it is, I might learn.

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

GSquared SSC Guru Points: 260824 More actions · Answer 10

Jeff Moden (11/9/2012)
GSquared (11/9/2012)
Jeff, if it helps, here's a version that looks a little more "SQLish". It does essentially the same thing as Magoo's code, but uses subqueries as an intermediate that more closely follows the usual SQL Select paradigm in format and feel.
SELECT PayeeXML.value('(/Payee/EDICompanyQualifier/text())[1]', 'char(2)') AS EDICompanyQualifier,
PayeeXML.value('(/Payee/EDICompanyCode/text())[1]', 'int') AS EDICompanyCode,
DueDateXML.value('(/DueDate/text())[1]', 'datetime') AS DueDate
FROM (SELECT R.Payees.query('.') AS PayeeXML
FROM #MyHead
CROSS APPLY SomeXML.nodes('RootA/Payees/Payee') AS R (Payees)) AS Payees -- Standard Subquery 1
CROSS JOIN (SELECT R.DueDates.query('.') AS DueDateXML
FROM #MyHead
CROSS APPLY SomeXML.nodes('RootA/DueDates/DueDate') AS R (DueDates)) AS DueDates;-- Standard Subquery 2
If you actually have more than one RootA node per entry (well-formed XML shouldn't have that, it should have a single root node), then you'd need to query that first (like Magoo's does), and then parse each of those into it's own sub-query (Cross Apply). But well-formed XML shouldn't require that step. Doesn't mean you won't have to deal with poorly-formed XML.
Just posting this as a bridge concept between what you're used to seeing in SQL queries and what you're digging into now with XQuery. I've found bridging concepts like this help a lot of DBAs begin to wrap their heads around the weird world of XML after years of SQL.
Man, the good folks are coming out of the woodwork! Thanks for the help, Gus!
I wonder what the performance implications would be on this since it's actually dipping the table twice. I'll add it to the list of things I need to test on this particular project.

Performance-wise, I'd expect Magoo's version to be faster. As mentioned, this is just a "how to read it" type of exercise.

When I was first teaching myself XML, it was even worse than when I started teaching myself relational databases and T-SQL. The documentation for XQuery was horrible. It's slightly better now, but still not good enough to really get a new person going.

Once you get it, though, SQL Server's flavor of XML and XQuery can be very, very useful tools. One of my favorite uses of the XML data type is when a business is moving into a new area of endeavor, and doesn't yet have a solid grasp on the business rules they need in the data. Add a "CustomData" column, XML datatype, to a table, and you can plug pretty much anything into it in terms of extended columns, as the business unit begins to work through their actual data needs.

You can add fields to an online form, and collect the new data, without having to add columns to a table in a production environment. Adds tremendous flexibility, which is often the single biggest need when doing some new line of business.

For example, the company I currently work for had a need to collect name-and-address type data online. Some of the data requirements were knowable right from the start, some were guessable, and some were "I guess we'll figure that out as we go". The known columns, like "email", and "do you want to opt-in for our newsletter", and "what's your prefered language", and "country" (using ISO3 codes for language and country), I built as columns with solid data types. But even NameFirst and NameLast becomes complicated when dealing with certain cultures in this world. For a famous historical example, how do you split "Hercule-Savinien de Cyrano de Bergerac" (French writer and swordsman) into "first and last name, please"? How about "Captain Sir Richard Francis Burton", what do you put in the "Title" column for that one? And don't even get me started on international mailing addresses and phone numbers! Uhg!

Till we could answer questions about what would be the most common data supplied, it was easier to have a CustomData XML column, allow some free-form text on the front end, and put the data semi-structured in the XML.

Then we reviewed it repeatedly as our sample-set got big enough, and normalized it and added rules to it, and moved data out of the XML into regularly defined columns.

As it collected, the managers and such involved made decisions about additional data to collect, and it was simplicity itself to add fields to the sites without any sort of database refactor.

The speed-to-market was huge, without sacrificing the ability to standardize and normalize later.

There are other ways to do the same thing, but none are so flexible, so easy to implement, and allow as much standardization later on.

The managers and execs of the business were very pleased with how fast and easy it was to get the site up. Major cudos for those involved. And no buyer's remorse later as it was live.

The major drawbacks were disk space used (XML is heavier on that than properly normalized relational data), and slower queries (though not as slow as you might think - XML indexes are pretty good these days). Those drawbacks fade as you move data out of the XML, once you know what the best format actually is.

That's just one clever use of XML in SQL Server. I've seen many others.

But it does take a learning curve to get comfortable with it. Definitely. Till then, it just frustrates. Once you get over the hump on the learning, it's still a little arcane at times, but you can get some pretty solid benefits out of it with minimal effort.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon