November 25, 2013 at 10:55 am
Hi everybody,
I have discovered a strange performance issue. I have a package (although the behaviour is similar for my other packages) that runs dramatically slower on the server (through an Execute Package Task) than when run directly in SSDT.
I have been looking at SQL Server Profiler, and the slowdown seems to happen in a Data Flow task containing an OLEDB source, a simple Derived Column (stagetime), and an OLEDB Destination. The Profiler trace contains a lot of pairs of SQL:BatchStarting/SQL:BatchCompleted events with an 'insert bulk ....' statement. When run directly the data transfer takes 20 seconds, but when run through the Execute Package Task it takes almost 40 minutes; i.e. more than a factor of 100.
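In case it helps anyone reproduce what I'm seeing: I used the Profiler GUI, but an Extended Events session along these lines should capture the same batch events (the session and file names here are just placeholders I made up):

-- Sketch only: capture the same SQL:BatchStarting/SQL:BatchCompleted pattern server-side.
CREATE EVENT SESSION [ssis_batch_timing] ON SERVER
ADD EVENT sqlserver.sql_batch_starting
    (ACTION (sqlserver.sql_text, sqlserver.client_app_name)),
ADD EVENT sqlserver.sql_batch_completed
    (ACTION (sqlserver.sql_text, sqlserver.client_app_name))
ADD TARGET package0.event_file (SET filename = N'ssis_batch_timing.xel');

ALTER EVENT SESSION [ssis_batch_timing] ON SERVER STATE = START;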
I do not have much experience troubleshooting this sort of thing, so please bear with me.
Any pointers in the right direction are much appreciated.
Thanks in advance, René
November 25, 2013 at 2:34 pm
There's something strange going on. It isn't normal for the Execute Package Task to slow packages down. Are you executing other packages in parallel?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 26, 2013 at 12:39 am
Hi again.
@Koen:
I'm running the Execute Package Task with 'Execute Task' through SSDT, so I think it shouldn't be running in parallel.
More info on the problem:
I have, with help from the sharp eyes of a colleague, discovered that the simple data flow task (OLEDB Source -> Derived Column -> OLEDB Destination) behaves differently when run on the server and when run in SSDT. When run directly through SSDT, each of the bulk inserts takes 18-37 ms, and the next one starts 3-20 ms afterwards (fewer than 30.000 rows moved in 15 sec in total). When the package is run through Execute Package, each step again takes about 16-25 ms (the variance actually seems smaller), but the next step starts about 5 seconds (YES, seconds) later, which seems to be the major difference here. This copy process takes about 40 minutes for fewer than 30.000 rows.
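One check I'm considering while the slow copy runs is whether the loading session is actually waiting on the server or just sitting idle between batches. A rough DMV query (the program_name filters are only guesses at how the SSIS connection identifies itself):

-- Sketch: is the loading session waiting on the server, or idle between batches?
SELECT s.session_id, s.status, s.program_name, s.last_request_end_time,
       r.status AS request_status, r.wait_type, r.wait_time
FROM sys.dm_exec_sessions AS s
LEFT JOIN sys.dm_exec_requests AS r ON r.session_id = s.session_id
WHERE s.program_name LIKE '%SSIS%'   -- guess: adjust to the actual application name
   OR s.program_name LIKE '%DTS%';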
I hope someone can point me in the right direction.
Thanks in advance, René
November 26, 2013 at 12:46 am
I'm not 100% following the set-up of your package.
but the next step starts about 5 seconds
What is the "next step"? Another transformation in the data flow? Another package called by Execute Package Task?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 26, 2013 at 2:10 am
Hi Koen,
Sorry about my unclear additional information. I'll go back a bit, just to be sure we are on the same page.
I have a package, pkgA, which contains (among other things, but they do not behave differently) a Data Flow task. This data flow task is quite simple: OLEDB Source -> Derived Column transformation (simply getdate()) -> OLEDB Destination.
I run this package using two different approaches:
- Directly through SSDT (Execute package in solution view). The package completes in about 20 seconds.
- Deploy pkgA to the SSIS server, and create another package, pkgB, containing one task: an Execute Package Task referencing the package on the SSIS server. Running pkgB takes about 40 minutes to complete.
I did a little googling and found that SQL Server Profiler might be my friend, so I started a simple trace. I discovered that the OLEDB Source resulted in one SQL:BatchStarting ('SELECT * FROM TableA'), and the OLEDB Destination resulted in many SQL:BatchStarting/SQL:BatchCompleted events with 'INSERT BULK TableB(Col1 type1,col2 type2,...' (TextData column in SQL Server Profiler). With this I discovered the different patterns between the two execution approaches.
The patterns are:
- When run directly through SSDT, each of the bulk inserts takes 18-37 ms, and the next batch starts 3-20 ms afterwards (fewer than 30.000 rows moved in 15 sec in total).
- When the package is run through Execute Package, each of the bulk insert steps again takes about 16-25 ms (the variance actually seems smaller), but the next batch starts about 5 seconds (YES, seconds) later, which seems to be the major difference here. This copy process takes about 40 minutes for fewer than 30.000 rows (see the sketch after this list for one way to measure those gaps from the saved trace).
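For completeness, here is a rough sketch of one way to measure those gaps, assuming the Profiler trace was saved to a .trc file (the path below is made up, and LAG needs SQL Server 2012 or later):

-- Gap between the end of one 'insert bulk' batch and the start of the next.
SELECT StartTime,
       EndTime,
       DATEDIFF(ms, LAG(EndTime) OVER (ORDER BY StartTime), StartTime) AS gap_ms,
       CAST(TextData AS nvarchar(max)) AS batch_text
FROM sys.fn_trace_gettable(N'C:\traces\ssis_copy.trc', DEFAULT)
WHERE CAST(TextData AS nvarchar(max)) LIKE N'insert bulk%'
ORDER BY StartTime;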
Hope this clarifies my setup
Thanks for your time so far, René
November 26, 2013 at 2:13 am
What if you run pkgA directly on the server, without pkgB starting it?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 26, 2013 at 2:57 am
Hi Koen,
When running it with 'Execute Package Utility' on the server, it finishes in 15 seconds, so similar to doing it in SSDT.
Shouldn't it be equivalent to executing it through an Execute Package Task?
TIA, René
November 26, 2013 at 3:34 am
It should be equivalent, yes.
What option is selected in the Execute Package Task for ExecuteOutOfProcess?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 26, 2013 at 4:25 am
ExecuteOutOfProcess is set to false (don't remember whether I have changed this at some point)
November 26, 2013 at 4:36 am
René Manggaard (11/26/2013)
ExecuteOutOfProcess is set to false (don't remember whether I have changed this at some point)
False is the default, so you didn't change anything 🙂
Are you using the project or package deployment model?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 26, 2013 at 4:38 am
package deployment model
November 26, 2013 at 4:41 am
René Manggaard (11/26/2013)
package deployment model
OK, this rules out anything different introduced by the project deployment model.
What are your data flow settings for the buffer sizes and what are the settings in the OLE DB Destination for the fast load?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 26, 2013 at 4:51 am
Data Flow Task:
DefaultBufferMaxRows: 10000
DefaultBufferSize: 10.485.760
OLEDB Destination:
AccessMode: OpenRowset Using FastLoad From Variable
FastLoadMaxInsertCommitSize: 2.147.483.647
FastLoadOptions: TABLOCK,CHECK_CONSTRAINTS
November 26, 2013 at 4:57 am
René Manggaard (11/26/2013)
Data Flow Task:
DefaultBufferMaxRows: 10000
DefaultBufferSize: 10.485.760
OLEDB Destination:
AccessMode: OpenRowset Using FastLoad From Variable
FastLoadMaxInsertCommitSize: 2.147.483.647
FastLoadOptions: TABLOCK,CHECK_CONSTRAINTS
What if you triple DefaultBufferSize and set the DefaultBufferMaxRows to 40000?
I'm curious if we can cheat SSIS into sending only one batch to the destination, thus eliminating those 5-second waits between batches.
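Roughly, SSIS puts MIN(DefaultBufferMaxRows, DefaultBufferSize / estimated row width) rows in a buffer, so with 40000 max rows and a tripled buffer size your ~30.000 rows might fit in a single buffer and hence a single batch. A quick back-of-the-envelope check (the row width is invented; it depends on your table's columns):

-- Sketch of the buffer arithmetic; 200 bytes per row is a made-up example width.
DECLARE @DefaultBufferMaxRows int    = 40000;
DECLARE @DefaultBufferSize    bigint = 3 * 10485760;  -- tripled from 10.485.760
DECLARE @EstimatedRowWidth    int    = 200;           -- hypothetical bytes per row

SELECT CASE WHEN @DefaultBufferSize / @EstimatedRowWidth < @DefaultBufferMaxRows
            THEN @DefaultBufferSize / @EstimatedRowWidth
            ELSE @DefaultBufferMaxRows
       END AS estimated_rows_per_buffer;  -- 40000 here, enough for the whole load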
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 26, 2013 at 7:32 am
Hi Koen,
It does not seem to do the trick; there are still 5 seconds between the batches.
I tried creating a couple of new packages with the same structure, but they show the same execution-time pattern.
I tried running the master package on the server, and it only takes about 20 seconds, so it could be something to do with the Execute Package Task running inside SSDT. Just a wild guess...
René