October 15, 2014 at 9:59 am
I am bulk inserting a whole bunch of xml files into a sql table using OPENROWSET. The problem is that this process is really really slow. Any idea if this can be sped up?
October 15, 2014 at 10:06 am
peacesells (10/15/2014)
I am bulk inserting a whole bunch of xml files into a sql table using OPENROWSET. The problem is that this process is really really slow. Any idea if this can be sped up?
Quick question, what version and edition of SQL server are you on?
😎
October 15, 2014 at 2:44 pm
Eirikur Eiriksson (10/15/2014)
peacesells (10/15/2014)
I am bulk inserting a whole bunch of xml files into a sql table using OPENROWSET. The problem is that this process is really really slow. Any idea if this can be sped up?Quick question, what version and edition of SQL server are you on?
😎
Microsoft SQL Server 2012 , Enterprise
October 15, 2014 at 10:21 pm
Done this on exactly the same version/edition and the performance is brilliant so no worries there. One thing to check though is the configuration of tempdb, rule of thumb, recommend having the number of files the same as the cpu cores available for the instance and the files sizes fixed and equal.
😎
There are few other factors which can affect the performance, you'll have to provide some more information here;-) I'll start with some questions, don't limit the answers to the questions if you think I'm missing anything important.
1) How large are the files?
2) Any file larger than 2Gb?
3) Are the files loaded into a staging table before shredding?
4) Is the XML typed or untyped?
5) Is the XML structure simple-medium-complex?
6) Is the BulkColumn used directly in any XQuery?
7) Any XML Indices?
8) Are the files on a local drive (server)?
9) Has the file-system performance been checked?
October 16, 2014 at 11:11 am
Eirikur Eiriksson (10/15/2014)
Done this on exactly the same version/edition and the performance is brilliant so no worries there. One thing to check though is the configuration of tempdb, rule of thumb, recommend having the number of files the same as the cpu cores available for the instance and the files sizes fixed and equal.😎
There are few other factors which can affect the performance, you'll have to provide some more information here;-) I'll start with some questions, don't limit the answers to the questions if you think I'm missing anything important.
1) How large are the files?
2) Any file larger than 2Gb?
3) Are the files loaded into a staging table before shredding?
4) Is the XML typed or untyped?
5) Is the XML structure simple-medium-complex?
6) Is the BulkColumn used directly in any XQuery?
7) Any XML Indices?
8) Are the files on a local drive (server)?
9) Has the file-system performance been checked?
1. The files are not big around 300 KB on average.
2. No
3.Yes. Loading into the staging is what is taking a long time right now. The shredding would take as long if not longer.
4.Mostly untyped
5.The structure is complex
6.Yes, we get that column and shred it into sql tables.
7.Not yet (need suggestions on that one as well)
8.No on the network.
9.No idea and i don't know what that is.
Thank you, your suggestions would be much appreciated.
October 16, 2014 at 12:18 pm
First suggestion is to add one step where the files are directly loaded into a staging table as an XML blob, without any shredding or processing. This will give you the actual performance of the file share minus the write performance of the SQL Server. In my experience, 5-10Mb/sec at least. Using the BulkColumn directly has very adverse affects on the performance so lets try to avoid that one.
From that initial staging table, do the shredding, if you share some more details here we can help you optimize that step.
On the performance side of things, what you can expect is an overhead of around 50ms for initialization in addition to 5-10Mb/sec transfer rate for each file, somehow there seams to be a barrier there although using SSD type subsystems I've managed to get this much much higher. The shredding part normally is where the low hanging fruits resides so I'm looking forward to analyse your approach and hopefully improve it.
To sum up, can you post some XML examples and the code you are using to shred those?
😎
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply