November 3, 2011 at 12:26 pm
Jack Corbett (11/3/2011)
Stefan Krzywicki (11/3/2011)
ricer (11/3/2011)
Is the GUID your clustered index key as well?If so you are probably seeing lots of page splits when you are doing inserts.
Best way of speeding that up would be to change the clustered key to a different column or even using identity column so all inserts gets appended to the end.
What did you set the fillfactor on the clustered index on those tables?
Yup, it is. There's no way to make that change. I'm constrained by the structure of the rest of the system.
I left it at the default.
You could try changing the FILL FACTOR on the clustered index. You might also want to check wait stats to see if there are IO waits because of page splits. Take snapshots before and after and do a diff.
I'll add it to the list, thanks!
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
November 3, 2011 at 12:27 pm
Brandie Tarvin (11/3/2011)
I forgot to mention something.If push comes to shove and you can't use an Execute SQL Task, and absolutely must have a Data Flow Task instead, use a stored procedure as your OLE DB data source instead of a table. It is still a lot faster than pulling your data from a table, view, or plain query in the Source Transformation.
(Change the data access mode to SQL command, and call the proc with EXECUTE <myprocname>)
Interesting. I'll keep it in mind. Fortunately, there shouldn't be anything keeping me from changing to to an Execute SQL Task.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
November 3, 2011 at 12:56 pm
I had a similar question on Andy Leonard's latest Stairway article. I thought SSIS was specifically designed to streamline the ETL process, but it seems staging tables and updates in SQL are much, much faster. Kind of confuses me as to the purpose/advantage of building a package that does anything more than just loads a table from a file.
---------------------------------------------------------
How best to post your question[/url]
How to post performance problems[/url]
Tally Table:What it is and how it replaces a loop[/url]
"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
November 3, 2011 at 1:03 pm
jcrawf02 (11/3/2011)
I had a similar question on Andy Leonard's latest Stairway article. I thought SSIS was specifically designed to streamline the ETL process, but it seems staging tables and updates in SQL are much, much faster. Kind of confuses me as to the purpose/advantage of building a package that does anything more than just loads a table from a file.
The thing that SSIS can do well that T-SQL cannot is transform the data between hetrogenius (sp) sources and destinations. Also, divert error records and process them uniquely (it's hard do that in SSMS or a regular Agent job). Lastly, SSIS allows the merging of data from several different SQL Servers without having to use actual Linked Server objects.
I'm still not up on the 2k8 T-SQL advancements, but in 2005 at least the ability to do Fuzzy Matching / Lookups was Da Bomb (sort of. Hated the learning curve.)
SSIS is not meant to play with SQL Server only data. It's meant for bigger things. Does that help?
November 3, 2011 at 1:05 pm
jcrawf02 (11/3/2011)
I had a similar question on Andy Leonard's latest Stairway article. I thought SSIS was specifically designed to streamline the ETL process, but it seems staging tables and updates in SQL are much, much faster. Kind of confuses me as to the purpose/advantage of building a package that does anything more than just loads a table from a file.
I think some transformations are supposed to be faster in SSIS as well. I was given the impression that SSIS was the way to go based on that Microsoft white paper bragging about how fast they could load a terrabyte. I'm looking forward to doing tests to see what the speed difference is between the load types. I should be able to do it by Monday, I hope.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
November 3, 2011 at 1:16 pm
Stefan Krzywicki (11/3/2011)
jcrawf02 (11/3/2011)
I had a similar question on Andy Leonard's latest Stairway article. I thought SSIS was specifically designed to streamline the ETL process, but it seems staging tables and updates in SQL are much, much faster. Kind of confuses me as to the purpose/advantage of building a package that does anything more than just loads a table from a file.I think some transformations are supposed to be faster in SSIS as well. I was given the impression that SSIS was the way to go based on that Microsoft white paper bragging about how fast they could load a terrabyte. I'm looking forward to doing tests to see what the speed difference is between the load types. I should be able to do it by Monday, I hope.
A large part of the speed in loading a TB in 30 minutes was optimization of the disk. I seem to recall part of it also was breaking the workload up into smaller chunks - which might have some application in your load.
November 3, 2011 at 1:23 pm
November 3, 2011 at 1:32 pm
Greg Edwards-268690 (11/3/2011)
And if you haven't stumbled upon this already, you may want to have a peek.
Thanks, I'll give it a look.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
November 3, 2011 at 2:00 pm
jcrawf02 (11/3/2011)
I had a similar question on Andy Leonard's latest Stairway article. I thought SSIS was specifically designed to streamline the ETL process, but it seems staging tables and updates in SQL are much, much faster. Kind of confuses me as to the purpose/advantage of building a package that does anything more than just loads a table from a file.
For inserts, it's faster. For updates and deletes, it's not. The problem is the built in components in SSIS RBAR the changes into the tables. It's not built like the engine is. Transformation, transport, and delivery, yes. Modification of existing, not so much.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
November 3, 2011 at 2:05 pm
Evil Kraig F (11/3/2011)
jcrawf02 (11/3/2011)
I had a similar question on Andy Leonard's latest Stairway article. I thought SSIS was specifically designed to streamline the ETL process, but it seems staging tables and updates in SQL are much, much faster. Kind of confuses me as to the purpose/advantage of building a package that does anything more than just loads a table from a file.For inserts, it's faster. For updates and deletes, it's not. The problem is the built in components in SSIS RBAR the changes into the tables. It's not built like the engine is. Transformation, transport, and delivery, yes. Modification of existing, not so much.
Sorry, I'm a little confused by the phrasing of your response. "For inserts it's faster", which is faster?
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
November 3, 2011 at 3:48 pm
Stefan Krzywicki (11/3/2011)
Evil Kraig F (11/3/2011)
jcrawf02 (11/3/2011)
I had a similar question on Andy Leonard's latest Stairway article. I thought SSIS was specifically designed to streamline the ETL process, but it seems staging tables and updates in SQL are much, much faster. Kind of confuses me as to the purpose/advantage of building a package that does anything more than just loads a table from a file.For inserts, it's faster. For updates and deletes, it's not. The problem is the built in components in SSIS RBAR the changes into the tables. It's not built like the engine is. Transformation, transport, and delivery, yes. Modification of existing, not so much.
Sorry, I'm a little confused by the phrasing of your response. "For inserts it's faster", which is faster?
Heh, sorry, SSIS is typically faster for Inserts (though equivalent to BULK INSERT). It's horrendous for changes to existing records. Addendum: This assumes you have enough memory and are running the package on that server and... etc etc. I agree with most of the above, if you aren't going cross-server and don't need a particular tool in SSIS that isn't available in T-SQL, stay in T-SQL. 😀
I should learn not to post when I'm typing quickly.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
November 4, 2011 at 6:35 am
Evil Kraig F (11/3/2011)
I should learn not to post when I'm typing quickly.
Or when you have the lint ball monster from H3CK watching over your shoulder with it's evil googly eyes. 😀
November 4, 2011 at 6:48 am
Limiting the batch size to the same as the row commit seems to have helped a bit. I should have an environment set up over the weekend to try some of the other suggestions, like switching to T-SQL for the last step.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
November 4, 2011 at 5:14 pm
Stefan Krzywicki (11/3/2011)
Brandie Tarvin (11/3/2011)
I forgot to mention something.If push comes to shove and you can't use an Execute SQL Task, and absolutely must have a Data Flow Task instead, use a stored procedure as your OLE DB data source instead of a table. It is still a lot faster than pulling your data from a table, view, or plain query in the Source Transformation.
(Change the data access mode to SQL command, and call the proc with EXECUTE <myprocname>)
Interesting. I'll keep it in mind. Fortunately, there shouldn't be anything keeping me from changing to to an Execute SQL Task.
There is NO big difference in performance between reading directly from a table or using stored procedure. Why should it be at least 1% faster? It still has to read the whole table either from cache or disk. View, table, sp or ad-hoc query - it doesn't matter until you don't use any parameters.
November 4, 2011 at 5:22 pm
alexander.suprun (11/4/2011)
Stefan Krzywicki (11/3/2011)
Brandie Tarvin (11/3/2011)
I forgot to mention something.If push comes to shove and you can't use an Execute SQL Task, and absolutely must have a Data Flow Task instead, use a stored procedure as your OLE DB data source instead of a table. It is still a lot faster than pulling your data from a table, view, or plain query in the Source Transformation.
(Change the data access mode to SQL command, and call the proc with EXECUTE <myprocname>)
Interesting. I'll keep it in mind. Fortunately, there shouldn't be anything keeping me from changing to to an Execute SQL Task.
There is NO big difference in performance between reading directly from a table or using stored procedure. Why should it be at least 1% faster? It still has to read the whole table either from cache or disk. View, table, sp or ad-hoc query - it doesn't matter until you don't use any parameters.
You know, I remember hearing at one of the SQL Server seminars I've gone to that using the SSIS read from a specified table is slower than creating the query, but I don't remember why. I can believe that the stored procedure would be faster than either simply because it could have a stored query plan and the other methods would not.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
Viewing 15 posts - 16 through 30 (of 40 total)
You must be logged in to reply to this topic. Login to reply