May 31, 2013 at 1:13 am
I tried to parse a huge xml file using sql server (OPENROWSET) into a database table, but got an error that says duplicate attributes (author). The author column varies for each article, ranging from one author to upto six authors. The columns I am really interested in are author, title and journal. Below is a snippet of the xml data. Any suggestions will be highly appreciated. Thanks in advance.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
<article mdate="2002-01-03" key="persons/CoddD74">
<author>E. F. Codd</author>
<author>C. J. Date</author>
<title>Interactive Support for Non-Programmers: The Relational and Network Approaches.</title>
<journal>IBM Research Report, San Jose, California</journal>
<volume>RJ1400</volume>
<month>June</month>
<year>1974</year>
</article>
</dblp>
May 31, 2013 at 2:54 am
Hi... Have you tried something like below.. This works fine with your example xml:
DECLARE @xml XML
SELECT @xml = CONVERT(XML, (SELECT * FROM OPENROWSET(BULK N'c:\testxml.xml', SINGLE_BLOB) AS xmldocument), 2)
SELECT t.c.value('(./text())[1]', 'varchar(30)')
FROM @xml.nodes('//author') AS T(c)
May 31, 2013 at 6:02 am
Thanks for the response, it worked, but I would like the output to be in the formatt below. All the authors of a particular article should be together in a column, separate by comma, and the title as well as the journal should occupy the next two columns. The authors of a particular article, title and journal should be just one record. Thanks in advance for your suggestions.
Author (s)
E F Cod, Patric A V Hall, Markus Tresch
May 31, 2013 at 7:13 am
I wrote a blog on something similar a while back: http://www.olcot.co.uk/sql-blogs/using-xquery-to-create-a-single-xml-node-with-comma-separated-string-from-multiple-xml-nodes
here is one of a few ways to comma separate the node author for you:
DECLARE @xml XML
SELECT @xml = CONVERT(XML, (SELECT * FROM OPENROWSET(BULK N'c:\testxml.xml', SINGLE_BLOB) AS xmldocument), 2)
SELECT t.c.query('for $x in author
return
if (not( ($x) is (author[last()])[1] )) then
concat($x/text()[1], ",")
else
string($x/text()[1])').value('.', 'varchar(200)')
, t.c.value('(title/text())[1]', 'varchar(150)') AS 'title'
, t.c.value('(journal/text())[1]', 'varchar(100)') AS 'journal'
FROM @xml.nodes('//article') AS T(c)
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply