March 6, 2008 at 7:14 am
Nisha (3/6/2008)
Thanks a ton, Jeff... I'm so relieved. I thought I was gonna have to rewrite my import process, which took me ages to write in the first place... what with all the variations in requirements time and again.
One last question, please... what should I be looking into now for optimization?
My Product Development Manager says the process takes way too much time... like 0.5-2.5 mins (depending on the number of records).
I don't really know what more I could do, although I'm sure there's plenty that could be done. Any suggestions as to where I should start?
Or should I attach the code here?
It's 1600 lines long :pinch: I doubt anyone would be able to spare enough time to go through it :doze:
It's all in the packaging.... You're right about the 1600 lines. Just try to find the things you think are running slow (you can put in little tests to tell you which points are fast and which are slow), and dole those out in bite-size pieces. I think you will find fairly substantial help if you do it that way. :)
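For example, here's a minimal sketch of those "little tests" (the step comments are placeholders, not the actual import code): bracket each section of the procedure with GETDATE() checkpoints and print the elapsed milliseconds.

DECLARE @StepStart DATETIME

-- checkpoint before the first chunk of work
SET @StepStart = GETDATE()
-- ... step 1 of the import goes here ...
PRINT 'Step 1: ' + CAST(DATEDIFF(ms, @StepStart, GETDATE()) AS VARCHAR(10)) + ' ms'

-- checkpoint before the next chunk
SET @StepStart = GETDATE()
-- ... step 2 of the import goes here ...
PRINT 'Step 2: ' + CAST(DATEDIFF(ms, @StepStart, GETDATE()) AS VARCHAR(10)) + ' ms'

Run the process once with those in place, and the PRINT output in the Messages tab will tell you which pieces are worth posting here.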
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
March 6, 2008 at 7:26 am
Nisha, 2000 rows is nothing. Use Select/Into. Look at my previous post with the test code. Select/Into will load 2000 rows in less than 15 milliseconds. JUST in case you don't know what a millisecond is... take 1 second... divide it into 1000 pieces... now take 15 of those pieces... that's not very long. I'm not sure why you need 30 temp tables to process 2000 rows, but if the biggest of the tables is only 2000 rows, you have no worries.
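For anyone who doesn't want to dig back through the thread for the test code, the basic pattern is just this (the table and column names here are made up for illustration, not from the earlier post):

-- Creates and loads the temp table in one minimally-logged statement...
SELECT s.RecordID, s.RecordValue
  INTO #Staging
  FROM dbo.ImportSource s

-- ...instead of a CREATE TABLE #Staging followed by a fully-logged INSERT.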
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2008 at 7:34 am
Jacob Luebbers (3/6/2008)
Interesting Jeff - just compared your test query (1m rows) with a slightly modified version that CREATEs #MyHead first and INSERTs into it (both running in TempDB). This is on a busy server, so the times are inaccurate, however:
SELECT... INTO... < 5 s
CREATE... INSERT... > 10 s
Adding a TABLOCK hint on the second version didn't seem to help the runtime measurably. Can you (or anyone else) comment on the reasons for this difference?
To your last comment - 100% agree, given these results. The answer to "should I use SELECT... INTO... or CREATE... INSERT...?" will depend on the frequency of execution rather than on a hard and fast rule.
Regards,
Jacob
Yes... what happens when you insert into a table? Correct... it writes to the table and to the transaction log. What happens when you do a SELECT/INTO? The same thing as a BULK INSERT done correctly... the transaction log records little more than the fact that you loaded N rows. Done correctly, SELECT/INTO doesn't write much to the transaction log, and that's the biggest difference performance-wise. There are a few other things SELECT INTO avoids... an INSERT first writes to a working table (think of the "inserted" table in triggers...). SELECT INTO doesn't, unless you tell it to. In fact, you have to explicitly arrange for triggers to fire if you actually want a trigger on the target table to fire as part of a SELECT INTO; normally, they won't fire for a SELECT INTO.
So, overall, SELECT INTO has about 1/3 the overhead when compared to an INSERT.
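If anyone wants to reproduce the difference on their own box, here's a rough harness in the same spirit as Jacob's test (the million-row cross join and the #MyHead names are my guess at it, not his actual code):

SET STATISTICS TIME ON

-- Method 1: SELECT/INTO creates and loads the table in one minimally-logged statement
SELECT TOP 1000000 IDENTITY(INT, 1, 1) AS RowNum
  INTO #MyHead1
  FROM master.dbo.syscolumns sc1
 CROSS JOIN master.dbo.syscolumns sc2

-- Method 2: CREATE the table first, then do a fully-logged INSERT
CREATE TABLE #MyHead2 (RowNum INT NOT NULL)

INSERT INTO #MyHead2 (RowNum)
SELECT TOP 1000000 sc1.colid
  FROM master.dbo.syscolumns sc1
 CROSS JOIN master.dbo.syscolumns sc2

SET STATISTICS TIME OFF

DROP TABLE #MyHead1
DROP TABLE #MyHead2

The elapsed times reported by SET STATISTICS TIME in the Messages tab make the difference obvious without any extra instrumentation.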
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2008 at 7:37 am
Jeff Moden (3/6/2008)
Nisha, 2000 rows is nothing. Use Select/Into. Look at my previous post with the test code. Select/Into will load 2000 rows in less than 15 milliseconds. JUST in case you don't know what a millisecond is... take 1 second... divide it into 1000 pieces... now take 15 of those pieces... that's not very long. I'm not sure why you need 30 temp tables to process 2000 rows, but if the biggest of the tables is only 2000 rows, you have no worries.
Thanks, Jeff. I did look at the post. And yes, I see the difference!!!
Understood what the milliseconds are all about too 😉
By the way, I need all those temp tables for:
WorkOrders, Surveys, Locations, Providers, Agency, Address, Contract, Ad, Crew, Employee, ANI... and so on..
March 6, 2008 at 7:54 am
Ok... was just making sure. You should have no problems with the temp tables.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2008 at 9:03 am
Thanks a ton, Jeff... I'm so relieved.
I thought I was gonna have to rewrite my import process, which took me ages to write in the first place... what with all the variations in requirements time and again.
One last question, please... what should I be looking into now for optimization?
My Product Development Manager says the process takes way too much time... like 0.5-2.5 mins (depending on the number of records).
I don't really know what more I could do, although I'm sure there's plenty that could be done. Any suggestions as to where I should start?
Or should I attach the code here?
It's 1600 lines long :pinch: I doubt anyone would be able to spare enough time to go through it :doze:
March 6, 2008 at 9:50 am
Thanks for the insight Jeff. I've done some re-checking of my old assumptions, and it turns out the TempDB locking contention from SELECT... INTO... was only a major issue in 6.5; 7.0 onwards was much more granular in its behaviour. And the first version of SQL Server I ever used was 7.0, so there's no excuse :blush:
Though I still wouldn't do it for long-running INSERTs or at high frequency - at least not into TempDB. Maybe I'm just paranoid, but personally, if you use SELECT... INTO..., I'd create a staging DB in SIMPLE recovery mode for these inserts rather than use TempDB.
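In practical terms that's just a one-time setup along these lines (the name "Staging" and the defaults are arbitrary, pick whatever suits your environment):

-- A scratch database that stays in SIMPLE recovery for ETL work
CREATE DATABASE Staging

ALTER DATABASE Staging SET RECOVERY SIMPLE

-- ...then do the heavy SELECT... INTO... work there instead of in TempDB, e.g.:
-- SELECT ... INTO Staging.dbo.SomeLoadTable FROM ...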
Sorry for steering you wrong, Nisha, with some old, no-longer-relevant "advice" 😛
Regards,
Jacob
March 6, 2008 at 12:54 pm
Jeff Moden (3/6/2008)
Grant Fritchey (3/5/2008)
What this is doing is using the temp table in place of a join or joins. That's bad.
Ummm... I gotta say, "Not always". I've used temp tables in exactly the fashion you described to break up monstrous single SELECTs with multiple "haywire" joins in a "Divide and Conquer" fashion. The end result was that I was able to convert, for example, a 30-minute run into a 6-second run... and, no, that's not a typo. 🙂
Damn, did I come off absolute... Of course, it depends. Sorry about that. I've seen places where it works well too. The problem is, I see so many places where it's done badly that I default to that position.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 6, 2008 at 2:45 pm
Jacob Luebbers (3/6/2008)
Thanks for the insight Jeff. I've done some re-checking of my old assumptions, and it turns out the TempDB locking contention from SELECT... INTO... was only a major issue in 6.5; 7.0 onwards was much more granular in its behaviour. And the first version of SQL Server I ever used was 7.0, so there's no excuse :blush:
Though I still wouldn't do it for long-running INSERTs or at high frequency - at least not into TempDB. Maybe I'm just paranoid, but personally, if you use SELECT... INTO..., I'd create a staging DB in SIMPLE recovery mode for these inserts rather than use TempDB.
Sorry for steering you wrong, Nisha, with some old, no-longer-relevant "advice" 😛
Regards,
Jacob
Oooohhhh... that's right... I forgot all about the legend of 6.5. You are correct... using tempdb in 6.5 was a really testy situation... one wrong move in tempdb, like using a SELECT/INTO, would bring a server right to its knees! Those legends live on and have been carried forward to the later versions, so much so that there are young DBAs who have never even used SQL Server 2000, never mind 6.5, who have still heard the legend and will not allow the use of TempDB. Yeah, I forgot all about that...
Jacob, I really appreciate you coming back like you did... I've been trying to teach my guys that you earn a hell of a lot more respect if you come back and say "Guess what? I was wrong... here's what I found out." Well done, Jacob!
Yes, I agree... if you're going to do a lot of ETL, a separate DB with SIMPLE recovery is certainly in order. In fact, if you really do a lot of ETL, it may justify a separate box altogether. Obviously, it doesn't have to be on steroids, but a separate box with a single- or two-user license may be well worth it.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2008 at 3:30 pm
Cheers Jeff! There are quite a few articles around that simply say "don't SELECT... INTO... a temp table" without qualifying it with "but only really a problem in 6.5", so it's one of those Mary Shelley-esque urban legends that persists without anyone really knowing why or double-checking the facts. A couple even on this site (http://www.sqlservercentral.com/articles/Performance+Tuning/temptabl/662/ and http://www.sqlservercentral.com/articles/Basic+Querying/temptablesinsqlserver/1279/) - I know I have always blindly followed that advice until now.
Thanks as always for the sound testing methodology!
Regards,
Jacob
March 6, 2008 at 4:37 pm
Jacob Luebbers (3/6/2008)
Cheers Jeff! There are quite a few articles around that simply say "don't SELECT... INTO... a temp table" without qualifying it with "but only really a problem in 6.5", so it's one of those Mary Shelley-esque urban legends that persists without anyone really knowing why or double-checking the facts. A couple even on this site (http://www.sqlservercentral.com/articles/Performance+Tuning/temptabl/662/ and http://www.sqlservercentral.com/articles/Basic+Querying/temptablesinsqlserver/1279/) - I know I have always blindly followed that advice until now.
Thanks as always for the sound testing methodology!
Regards,
Jacob
I call it the "Five Monkeys Syndrome"... supposedly, they've actually done such a test...
http://www.safetycenter.navy.mil/Articles/a-m/monkeys.htm
And thank you for the compliment about testing. One good test is worth a thousand theories...
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2008 at 5:07 pm
Jeff Moden (3/6/2008)
Yes, I agree... if you're going to do a lot of ETL, a separate DB with SIMPLE recovery is certainly in order. In fact, if you really do a lot of ETL, it may justify a separate box, altogether. Obviously, it doesn't have to be on steriods, but a separate box with a single or two user license may be well worth it.
Amen to that one... It's amazing how much less of a problem it is when your ETL takes a few more minutes, because it's on its own box and not "bothering anything else"...
Besides - if you run it out of disk space or kill your tempdb with a "bad ETL", you can just reboot the box. Amazing how much cleaner that gets.... :)
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
March 6, 2008 at 6:03 pm
Grant Fritchey (3/6/2008)
Jeff Moden (3/6/2008)
Grant Fritchey (3/5/2008)
What this is doing is using the temp table in place of a join or joins. That's bad.
Ummm... I gotta say, "Not always". I've used temp tables in exactly the fashion you described to break up monstrous single SELECTs with multiple "haywire" joins in a "Divide and Conquer" fashion. The end result was that I was able to convert, for example, a 30-minute run into a 6-second run... and, no, that's not a typo. 🙂
Damn, did I come off absolute... Of course, it depends. Sorry about that. I've seen places where it works well too. The problem is, I see so many places where it's done badly that I default to that position.
Heh... thanks Grant... to be even more succinct, it's not the method that's bad, it's the dummies that don't use the method correctly. 😀
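For anyone finding this thread later, here's a minimal sketch of the "Divide and Conquer" pattern being discussed (the tables, columns, and date below are invented for illustration): pre-filter the expensive rows into a temp table, index it, and join to that instead of wading through one giant multi-join SELECT.

-- Step 1: isolate just the rows you actually need from the big table
SELECT o.OrderID, o.CustomerID, o.OrderTotal
  INTO #RecentOrders
  FROM dbo.Orders o
 WHERE o.OrderDate >= '20080101'

-- A key on the temp table helps the joins that follow
CREATE CLUSTERED INDEX IX_RecentOrders ON #RecentOrders (CustomerID)

-- Step 2: the formerly "haywire" join now runs against a small, indexed set
SELECT c.CustomerName, r.OrderID, r.OrderTotal
  FROM #RecentOrders r
  JOIN dbo.Customers c ON c.CustomerID = r.CustomerID

DROP TABLE #RecentOrders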
--Jeff Moden
Change is inevitable... Change for the better is not.