August 24, 2011 at 11:52 pm
Hi,
I have an xml data
'<Products>
<Product>
<UserId>174</UserId>
<ISSN>1553-7358</ISSN>
<Authors><Author name="Abu Lerche"/><Author name="Jianmin cristy"/><Author name="Rohan Yuste"/></Authors>
<Volume>7</Volume>
</Product>
<Product>
<UserId>176</UserId>
<ISSN>1097-6256</ISSN>
<Authors><Author name="Booker Malik"/><Author name="Mehnab Devid"/><Author name="Yarik Metha"/></Authors>
<Volume>5</Volume>
</Product>
</Products>'
like this.
I want to store this data into a table like this :
UserIdISSNAuthorsVolume
1741553-7358Abu Lerche 7
1741553-7358Jianmin cristy7
1741553-7358Rohan Yuste 7
1761097-6256Booker Malik5
1761097-6256Mehnab Devid5
1761097-6256Yarik Metha 5
Please help me
August 25, 2011 at 12:13 am
Something along those lines?
DECLARE @xml XML
SELECT @xml='<Products>
<Product>
<UserId>174</UserId>
<ISSN>1553-7358</ISSN>
<Authors>
<Author name="Abu Lerche"/><Author name="Jianmin cristy"/><Author name="Rohan Yuste"/></Authors>
<Volume>7</Volume>
</Product>
<Product>
<UserId>176</UserId>
<ISSN>1097-6256</ISSN>
<Authors><Author name="Booker Malik"/><Author name="Mehnab Devid"/><Author name="Yarik Metha"/></Authors>
<Volume>5</Volume>
</Product>
</Products>'
SELECT
T.c.value('UserId[1]','INT') as UserId,
T.c.value('ISSN[1]','VARCHAR(30)') as ISSN,
U.v.value('@name[1]','VARCHAR(30)') as Authors,
T.c.value('Volume[1]','INT') as Volume
FROM @xml.nodes('Products/Product') T(c)
CROSS APPLY T.c.nodes('Authors/Author') U(v)
August 25, 2011 at 3:09 am
Thanks a lot!!!!
August 25, 2011 at 4:41 am
This is very impressive, but there is one thing that I don’t understand. I would expect that the second nodes operator will get the path Product/Authors/Author because the first nodes operator produced 2 XMLs that each one of them started with the tag Product. In reality the second nodes operator got the path Authors/Author and it worked. When I played with it and tried working with the path Product/Authors/Author, I didn’t get any results. The results show that I’m wrong, but I have to admit that I don’t understand why. Can you pleas explain it?
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
August 25, 2011 at 5:08 am
Let's take the query apart:
It basically can be divided into two parts:
Part 1:
SELECT
T.c.value('UserId[1]','INT') as UserId,
T.c.value('ISSN[1]','VARCHAR(30)') as ISSN,
T.c.value('Volume[1]','INT') as Volume
FROM @xml.nodes('Products/Product') T(c)
This will basically extract UserId a.s.o. of each Product node:
UserIdISSNVolume
1741553-73587
1761097-62565
The second part uses the CROSS APPLY syntax, which does exactly what it sounds like: it applies the Authors/Author node against each Products/Product. It's similar to a conditional cross join.
SELECT
U.v.value('@name[1]','VARCHAR(30)') as Authors
FROM @xml.nodes('Products/Product') T(c)
CROSS APPLY T.c.nodes('Authors/Author') U(v)
As you can see, the CROSS APPLY is based on the T.c structure, which is a notation for partial XML structures still holding data in an XML structure (it's similar to the Table.column notation, therefore I usually use the T(c) alias). In this case, the "table" T has one "column" c with two "rows" in XML format, one for each Product.
You can imagine the internal structure T(c) like the following table structure where xref is an internal reference to the related Product node:
DECLARE @T TABLE (xref INT, c XML)
INSERT INTO @T
SELECT
1, '<Authors>
<Author name="Abu Lerche"/><Author name="Jianmin cristy"/><Author name="Rohan Yuste"/></Authors>' UNION ALL
SELECT
2,'<Authors><Author name="Booker Malik"/><Author name="Mehnab Devid"/><Author name="Yarik Metha"/></Authors>'
SELECT * FROM @T
When using CROSS APPLY T.c.nodes('Authors/Author'), The data of the Author node(s) will be extracted into another "table" U, "column" v (notation: U(v)). This will be performed against each "row" of "table" T, leading to the expected result.
If you look at it, it becomes obvious that the path Product/Authors/Author cannot be found, hence no results.
As an alternative, you could use
SELECT
T.c.value('../../UserId[1]','INT') as UserId,
T.c.value('../../ISSN[1]','VARCHAR(30)') as ISSN,
T.c.value('@name[1]','VARCHAR(30)') as Authors,
T.c.value('../../Volume[1]','INT') as Volume
FROM @xml.nodes('Products/Product/Authors/Author') T(c)
But this would imply to "climb up" two node levels for each and every name attribute leading to a significant performance degrade.
I hope I didn't confuse you too much... 😉
But if I did, just let me know. 😎
August 25, 2011 at 7:30 am
First of all thank you very much for the detailed post. I appreciate the time and effort that you spent in helping me understand your solution. Your explanation is very good, but there is one point that I don’t understand. The script bellow shows the XML that is used and your query. I’ve also added a new column to the query that returns the XML that was created by the first nodes operator without any change.
DECLARE @xml XML
SELECT @xml=
'<Products>
<Product>
<UserId>174</UserId>
<ISSN>1553-7358</ISSN>
<Authors>
<Author name="Abu Lerche"/>
<Author name="Jianmin cristy"/>
<Author name="Rohan Yuste"/>
</Authors>
<Volume>7</Volume>
</Product>
<Product>
<UserId>176</UserId>
<ISSN>1097-6256</ISSN>
<Authors>
<Author name="Booker Malik"/>
<Author name="Mehnab Devid"/>
<Author name="Yarik Metha"/>
</Authors>
<Volume>5</Volume>
</Product>
</Products>'
--Last line in the select clause was added by me to show the what each
--line from T(c) should look like. As expected it is an XML with the
--elenmt Product as its root element
SELECT
T.c.value('UserId[1]','INT') as UserId,
T.c.value('ISSN[1]','VARCHAR(30)') as ISSN,
U.v.value('@name[1]','VARCHAR(30)') as Authors,
T.c.value('Volume[1]','INT') as Volume,
T.c.query('.') as ProductOfFirstNoeds --the new column
FROM @xml.nodes('Products/Product') T(c)
CROSS APPLY T.c.nodes('Authors/Author') U(v)
The fallowing xml is the exact copy that of one of the XMLs that I got in the new column. As expected it’s root element is Product
<Product>
<UserId>174</UserId>
<ISSN>1553-7358</ISSN>
<Authors>
<Author name="Abu Lerche" />
<Author name="Jianmin cristy" />
<Author name="Rohan Yuste" />
</Authors>
<Volume>7</Volume>
</Product>
Now I’ll take this XML and use it in 2 quires with node operator. In the first query I’ll use the path that you used in your query (Authors/Author). In the second query I’ll use the path that I thought that would work, but it didn’t work in your query:
--Declaring XML and using an XML that I got from the previous
--script (that we got into T(c) from nodes operator)
declare @xml xml
set @xml =
'<Product>
<UserId>174</UserId>
<ISSN>1553-7358</ISSN>
<Authors>
<Author name="Abu Lerche" />
<Author name="Jianmin cristy" />
<Author name="Rohan Yuste" />
</Authors>
<Volume>7</Volume>
</Product>'
--Using the path that you used in your query. In this query it doesn’t work (I get an empty recordset)
select U.v.query('.')
from @xml.nodes('Authors/Author') U(v)
--Using the full path that includes the Product element.
--It works with a this query, but doen’t work with your query
select U.v.query('.')
from @xml.nodes('Product/Authors/Author') U(v)
In short I don’t understand howcome the path that you used in the second node method works and the path that is used in your query.
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
August 25, 2011 at 7:44 am
The very simple answer would be:
Since I don't have to reference the Product node when getting the value for a UserId (-> T.c.value('UserId[1]','INT') as UserId insteand of T.c.value('Product[1]/UserId[1]','INT') as UserId), I don't have to reference the node when using CROSS APPLY either.
I think the XQuery logic simply removes the root node internally. But that's guessing from this point on.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply