March 11, 2009 at 6:56 am
A tally table is a table that simply has a run of consecutive numbers. For more information, please read this article:
http://www.sqlservercentral.com/articles/TSQL/62867/
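A minimal sketch of the idea, in case it helps (the dbo.Tally name and N column below are just common conventions, not requirements):

-- Build a tally table of 10,000 sequential integers.
SELECT TOP 10000
       IDENTITY(INT, 1, 1) AS N
  INTO dbo.Tally
  FROM master.sys.all_columns ac1
 CROSS JOIN master.sys.all_columns ac2;

ALTER TABLE dbo.Tally
  ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N);

-- Example use: split a comma-delimited string without a loop.
DECLARE @List VARCHAR(8000);
SET @List = ',' + 'a,b,c' + ',';  -- wrap the list in delimiters

SELECT SUBSTRING(@List, t.N + 1,
                 CHARINDEX(',', @List, t.N + 1) - t.N - 1) AS Item
  FROM dbo.Tally t
 WHERE t.N < LEN(@List)
   AND SUBSTRING(@List, t.N, 1) = ',';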
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
:w00t:
Posting Best Practices
Numbers / Tally Tables
March 11, 2009 at 9:15 am
Chris #2 is right. I've begun to rewrite some multi-statement functions as inline table functions, and the difference in performance for large batches is astounding. You're using that function a half dozen times, so even if Jeff's solution helps, there will still be room for improvement if you can rewrite the function.
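Here's a quick sketch of the kind of rewrite I mean (function names made up; the VALUES row constructor assumes SQL Server 2008 or later):

-- Multi-statement TVF: rows land in a table variable first, so the
-- optimizer sees an opaque result with no useful statistics.
CREATE FUNCTION dbo.GetDigits_Multi ()
RETURNS @Digits TABLE (Digit INT NOT NULL)
AS
BEGIN
    INSERT INTO @Digits (Digit)
    SELECT v.Digit
      FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v (Digit);
    RETURN;
END;
GO

-- Inline TVF: a single RETURN (SELECT ...). The optimizer expands it
-- into the calling query like a parameterized view.
CREATE FUNCTION dbo.GetDigits_Inline ()
RETURNS TABLE
AS
RETURN
(
    SELECT v.Digit
      FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v (Digit)
);
GO

Usage is identical either way, but joining the inline version against a large batch typically produces a far better plan.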
Jeff, I understand about pre-aggregating, but why is it so important to do it with a temp table, rather than a derived table? Just so you can create indexes?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 11, 2009 at 12:55 pm
I'm with Gail and Jeff and Bob.
If I were going to write a stored procedure from scratch to do what you're doing I would break it down into pieces and I wouldn't use a function in the WHERE clause (although in some cases the inline function may perform better than an INNER JOIN). I probably wouldn't even do a UNION, but 2 separate inserts into a temp table.
I would create temp tables using the de-stringing functions first. These would be for the IN/NOT IN found in the WHERE clause.
Then pre-aggregate the first part, as Jeff suggested, into a temp table, joining to the first temp tables (the ones created by the de-stringing function) with an INNER JOIN instead of the IN and a LEFT JOIN instead of the NOT IN. Then insert another set of records for the UNION (a rough sketch follows below).
You now have your complete recordset that you can play around with.
Then I'd play around with the LEFT JOIN (NOT IN) and see if NOT EXISTS in the WHERE clause would work faster. A little more than half of the time it does perform faster.
I've found that in almost every case breaking something down into smaller temp tables is faster than doing a monster query that does things in one fell swoop.
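Here's a rough sketch of what I mean (every object name below, including dbo.SplitList, dbo.Sales, dbo.Budget, and the variables, is just a stand-in for whatever the real query uses):

DECLARE @IncludeList VARCHAR(8000), @ExcludeList VARCHAR(8000);
SET @IncludeList = '1000,2000,3000';
SET @ExcludeList = '2500';

-- Step 1: de-string the IN / NOT IN lists into temp tables once.
SELECT Item AS AccountCode INTO #Include FROM dbo.SplitList(@IncludeList);
SELECT Item AS AccountCode INTO #Exclude FROM dbo.SplitList(@ExcludeList);

-- Step 2: pre-aggregate, with an INNER JOIN instead of IN and a
-- LEFT JOIN plus IS NULL test instead of NOT IN.
SELECT s.AccountCode, SUM(s.Amount) AS Amount
  INTO #Agg
  FROM dbo.Sales s
 INNER JOIN #Include i ON i.AccountCode = s.AccountCode
  LEFT JOIN #Exclude e ON e.AccountCode = s.AccountCode
 WHERE e.AccountCode IS NULL
 GROUP BY s.AccountCode;

-- Step 3: the second half of the UNION becomes a second INSERT.
INSERT INTO #Agg (AccountCode, Amount)
SELECT b.AccountCode, SUM(b.Amount)
  FROM dbo.Budget b
 INNER JOIN #Include i ON i.AccountCode = b.AccountCode
 GROUP BY b.AccountCode;

-- The alternative to the LEFT JOIN that's worth benchmarking:
-- WHERE NOT EXISTS (SELECT 1 FROM #Exclude e
--                    WHERE e.AccountCode = s.AccountCode)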
Todd Fifield
P.S. Be wary of the NOLOCK. I actually had a report that would periodically return 2 rows instead of 1 for an item in a high transaction environment. I'm not sure how this happened, but when I got rid of the NOLOCK on the heavily inserted/updated table, the problem never came back.
March 11, 2009 at 1:37 pm
tfifield (3/11/2009)
P.S. Be wary of the NOLOCK. I actually had a report that would periodically return 2 rows instead of 1 for an item in a high transaction environment. I'm not sure how this happened, but when I got rid of the NOLOCK on the heavily inserted/updated table, the problem never came back.
http://www.sqlservercentral.com/Forums/FindPost673605.aspx
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 11, 2009 at 2:12 pm
A lot of our developers just LOVE "NOLOCK".
If you ask them what it means, they usually respond that it means you want to go fast. Dirty reads? They don't understand the concept.
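If a demo would help explain it to them, something like this (dbo.Account is a made-up table) shows the dirty read in two SSMS windows:

-- Window 1: change a row but leave the transaction open.
BEGIN TRANSACTION;
UPDATE dbo.Account SET Balance = Balance - 500 WHERE AccountID = 1;

-- Window 2: NOLOCK (i.e., READ UNCOMMITTED) happily reads the
-- uncommitted change.
SELECT Balance FROM dbo.Account WITH (NOLOCK) WHERE AccountID = 1;

-- Window 1: roll back. Window 2 has now reported a balance that
-- never officially existed.
ROLLBACK TRANSACTION;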
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 11, 2009 at 4:56 pm
Gail,
Your post on NOLOCK has cleared up one of the big mysteries of the universe for me. Thanks.
Todd Fifield
March 11, 2009 at 9:28 pm
Bob Hovious (3/11/2009)
Jeff, I understand about pre-aggregating, but why is it so important to do it with a temp table, rather than a derived table? Just so you can create indexes?
No... the reason is to reduce the row count that you need to join against. An aggregated derived table, much like a view, will still be a part of the query, and the joins will be done on the underlying tables, not on the result set of the aggregation. The result is comparatively blinding speed.
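A quick sketch of the difference (dbo.Header and dbo.Detail are made-up names):

-- Derived table: the aggregation is still part of the outer query, so
-- the join is resolved against the full set of underlying detail rows.
SELECT h.HeaderID, h.Description, d.Total
  FROM dbo.Header h
 INNER JOIN (SELECT HeaderID, SUM(Amount) AS Total
               FROM dbo.Detail
              GROUP BY HeaderID) d
         ON d.HeaderID = h.HeaderID;

-- Temp table: materialize the aggregate first, so the outer query
-- joins against the much smaller pre-aggregated row set.
SELECT HeaderID, SUM(Amount) AS Total
  INTO #Agg
  FROM dbo.Detail
 GROUP BY HeaderID;

SELECT h.HeaderID, h.Description, a.Total
  FROM dbo.Header h
 INNER JOIN #Agg a ON a.HeaderID = h.HeaderID;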
--Jeff Moden
Change is inevitable... Change for the better is not.
March 11, 2009 at 11:56 pm
Hi,
I worked out what my problem is!!!!
It wasn't my temp table; apparently the index on that runs quite fast. The problems started when I hooked it into the chart of accounts table. Currently my temp table connects to the chart of accounts table via a surrogate key that belongs to chart of accounts. The clustered index is on some sort of account code, and there is a nonclustered index against the surrogate key. I am trying to work out what I can do to make it faster.
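Roughly, the layout looks like this (names changed):

CREATE TABLE dbo.ChartOfAccounts
(
    AccountKey  INT IDENTITY(1,1) NOT NULL,  -- surrogate key
    AccountCode VARCHAR(20) NOT NULL,
    AccountName VARCHAR(100) NOT NULL
);

CREATE CLUSTERED INDEX IX_COA_AccountCode
    ON dbo.ChartOfAccounts (AccountCode);

CREATE NONCLUSTERED INDEX IX_COA_AccountKey
    ON dbo.ChartOfAccounts (AccountKey);

One thing I could test: if the join only needs a couple of columns, a covering index on the surrogate key (with the selected columns in an INCLUDE clause) might avoid the key lookups into the clustered index.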
March 12, 2009 at 8:07 am
Many thanks, Jeff. Good to know. In the past (for really big sets) I usually created indexed temp tables and then loaded them, so the join would be optimal. But I was wondering if I was wasting time by not simply doing derived tables.
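For reference, the pattern I mean (names made up):

-- Define the temp table and its clustered index up front, then load it,
-- so the subsequent join works against an ordered, indexed row set.
CREATE TABLE #BigSet
(
    KeyCol INT NOT NULL,
    Amount MONEY NOT NULL
);

CREATE CLUSTERED INDEX IX_BigSet_KeyCol ON #BigSet (KeyCol);

INSERT INTO #BigSet (KeyCol, Amount)
SELECT KeyCol, SUM(Amount)
  FROM dbo.SourceTable
 GROUP BY KeyCol;

(Sometimes it's faster to build the index after the load instead; that's worth testing case by case.)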
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 12, 2009 at 9:13 am
I worked out what my problem is!!!!
I am trying to work out what I can do to make it faster
Great 🙂 Thanks for letting us all know.
Even after you decide what you can and can't do about that surrogate key, don't overlook the other tips you received in this thread. Jeff's aggregation suggestion and the tips about inline functions are worth testing out for this and future projects.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 12, 2009 at 2:32 pm
I break into a sweat just imagining I am the one stuck with debugging this mammoth of an SQL statement...
Note that the divide and conquer comments come from people with multi-thousand points (how in hell did they manage to get so many, anyway?). Such people have seen a lot of water go under the bridge, and still they would not code like this.
Three other advantages to the old divide and conquer approach:
Breaking it down into simpler steps makes it easier to swallow (just like swallowing an elephant: one byte at a time). A short comment as to what each step is intended for would be nice.
Also, when developing a T-Rex like that, intermediate steps make it easy to insert print statements in between to trace where it goes wrong. And, conversely, to gain confidence that the results obtained make sense.
Finally, you could output the current datetime after each such step, making it that much easier to find the performance-killer step (see the sketch below). And since that step is simpler, a remedy is easier to implement and verify for effectiveness.
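Something like this between steps (the step comments are placeholders; RAISERROR ... WITH NOWAIT is a variant of PRINT that flushes to the Messages tab immediately instead of waiting for the buffer):

DECLARE @msg VARCHAR(100);

-- ... step 1: load the temp tables ...
SET @msg = 'Step 1 done: ' + CONVERT(VARCHAR(23), GETDATE(), 121);
RAISERROR(@msg, 0, 1) WITH NOWAIT;

-- ... step 2: pre-aggregate ...
SET @msg = 'Step 2 done: ' + CONVERT(VARCHAR(23), GETDATE(), 121);
RAISERROR(@msg, 0, 1) WITH NOWAIT;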
I do not dispute that there is a business need for the information, just that, faced with code like this and a tight deadline to fix it, I would probably rebel. Maybe that kind of situation would be reserved for punishment/revenge on some poor slob who has displeased the boss...
Just out of curiosity, how many coder-hours were required to make this work and verify the validity of the results?
March 12, 2009 at 3:17 pm
too many....
The problem is that I inherited these from someone else, who assured me that the problematic table was fine. I did not assume anything was wrong with it; little did I know...
I'd say about a week now.
March 12, 2009 at 3:29 pm
so...
you did pee in your boss' cornflakes ...
March 12, 2009 at 3:31 pm
Man, had I known ...
I would have done a dump in it as well
March 12, 2009 at 3:37 pm
:laugh: Now that's funny. :laugh: A good way to end a boring day converting data from a foreign RDBMS and having to wait 20 minutes for the sp to complete, then finding out how to fix my mistakes.
(I can see how a funny guy like you can get in trouble with authority - Naw, just kidding.) :smooooth: