May 2, 2012 at 2:01 am
I have a table as below.
CREATE TABLE myResultTable(
RowId IDENTITY(10000,1) NOT NULL,
ItemId INT,
ItemResult [text] NULL,
PRIMARY KEY(Id)
)
ItemResult column will have XML data. XML Schema will be based on ItemId. That means, for each ItemId, ItemResult will follow specific XMLSchema.
For eg, the table may contain rows as below.
--------------------------------------------------
RowId ItemId ItemResult
---------------------------------------------------
1 1 <Book Id='1'><Name>BookName</Name></Books>
2 2 <Author Id='1'><Name>AuthorName</Name></Author>
3 1 <Book Id='2'><Name>BookName1</Name></Book>
4 2 <Author Id='2'><Name>AuthorName1</Name></Author>
----------------------------------------------------
There can be n number of ItemIds, that i dont know during real time.
Now my requirement is to write a generic function or stored procedure, which should get ItemId as input and converts all corresponding ItemResults into a table. Given ItemId as 2, I should see table representing all Author information. How would I achieve this?
May 2, 2012 at 2:19 am
If you would like a generic function to flatten the Xml down to a table regardless of the schema of the Xml, then I would look at using something this http://beyondrelational.com/modules/2/blogs/28/posts/10495/xquery-lab-58-select-from-xml.aspx as a starting point and adapting it to suit your needs.
I've used it myself in the past and it works well.
May 2, 2012 at 3:34 am
Thanks for your reply. This looks promising and I will give it a try.
Will there be any performance hit if there are millions of rows. Since I am writing this for reporting, we cannot wait for more than few seconds. Do you have any recommendation to do in a better way?
May 2, 2012 at 3:57 am
Shredding millions of blobs of xml to a flat table structure will take a lot of resources which ever way you do it and if this is something that you would need to do on a regular basis then I would reconsider the design of your tables and store the data in a relational format rather than as xml blobs. Is it possible to shred the Xml to a flat table structure during the insert operation of that blob?
May 2, 2012 at 4:58 am
As I said earlier, I don't know what each items are about and their XML schema. So it is highly impossible for me to create flat table structure.
May 3, 2012 at 8:10 am
Ok, I was thinking more along the lines of a generic relational table structure rather than a specific structure. If you look at the output of the XMLTable function then that output is generic and could be the basis of a way that you store the data that is used for reporting. I don't know if you have an OLTP application which uses/populates the table you originally posted that requires the xml as they are, but you could have a process that shreds data from that table into a different set of generic tables that are then used for reporting.
Because Xml by its very nature is highly structured, there is no real generic "convert-xml-to-flat-table" function. You can use the xml function nodes() to shred the xml to a flat table but that does require you to know the xml schema beforehand.
If the xml blobs that are being saved are always as simple as your original post then in conjunction with the XMLTable function, you could pivot the returned data into a single row flat table. But this isn't ideal as you could lose the context of the data completely if the xml blobs are any more structured.
I created this generic script earlier today (2008 R2) as a means to take any simple xml document, uses the XMLTable function (from earlier post) to shred it and then pivot the results , but this has some limitations and wouldn't recommend it.
DECLARE @Cols VARCHAR(MAX) = ''
DECLARE @Xml AS XML = CAST('<Author Id="1"><Name>AuthorName</Name><Name>AuthorName</Name></Author>' AS XML)
DECLARE @SQL NVARCHAR(MAX)
--Get a list of the columns from the shreded xml document. need to replace certain characters
SELECT @Cols = @Cols + '[' + REPLACE(REPLACE(XPath, '[', '('), ']', ')') + '],'
FROM dbo.XMLTable(@Xml)
WHERE Value IS NOT NULL
SET @Cols = LEFT(@Cols, LEN(@Cols) -1)
--Generate SQL statement to pivot the rows from XMLTable function in a single row table
SELECT @SQL = 'SELECT ' + @Cols + '
FROM
(
SELECT REPLACE(REPLACE(XPath, ''['', ''(''), '']'', '')'') AS XPath, Value
FROM dbo.XMLTable(@Xml)
WHERE Value IS NOT NULL
) x
PIVOT
(MAX(Value) FOR xpath IN (' + @Cols +')) y'
EXEC sys.sp_executesql @SQL, N'@Xml XML', @Xml = @Xml
Other than that, I'm struggling for ideas and hope that others from the community may be able to post their thoughts.
July 28, 2016 at 4:00 am
If you have deeply nested XML documents (or json, html, sql) with recursive nodes (node 'folder' in node 'folder' in node 'folder') of a mixed complex type without an XSD/DTD schema you can use the tool eXtractorONE (eXtractor.ONE). No programming needed, no Xquery, no XSLT, nearly zero configuration. No limit on size. Just point to the folder with XML documents, select your target database and run it.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply