November 1, 2010 at 6:44 am
I was sent a script by our vendor to update some data from an XML file:
declare @xml_data xml
declare @invoices table (tran_id int)
select @xml_data = casts(bulkcolumn as xml) from openrowset(bulk, n'some_file.xml', single_blob) as inv
insert @invoices select mytable.mycol.value('.', 'INT') from @xml.data.nodes('/DocumentElement/Table/Tran_ID') as mytable(mycol)
some_file.xml is 330 megabytes.
I kicked this off 12 hours ago and it is still executing. Previously the same code was used for a much (MUCH) smaller file and still took about an hour.
Can anyone comment on why the performance of this is so bad?
I can perform the same operation by using regex on the source XML file in textpad to turn it into a simple columnar list and then doing a regular bulk insert. This method takes a couple of minutes total. Our vendor can't provide that as "a solution" though, since the solution they provide must be able to be executed without requiring that manual step of reformatting the file through regex.
November 1, 2010 at 10:01 am
I would store that xml file in a table with an xml column and an identity col.
Add an XML index to that table. The insert should perform much better.
Another option would be using OPENXML without casting the data to xml.
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply