"Normal" tables vs local temporary (i.e. #) tables

  • I'm working on an application based on a SQL 2005 database. Part of the functionality involves doing some fairly complicated and convoluted calculations which are carried out in the database by a number of separate SPs.

    As a technique for passing data between these SPs, we've implemented a number of additional tables in the database. These are generally a) emptied (for safety), b) filled, c) used and finally d) emptied again. (Note that "used" includes updates to the data in these tables as well as to the tables which hold the final results.) Each table has a SPID column so that the rows being used by concurrent processes can be identified. (i.e. We have "WHERE SPID = @@SPID" everywhere.)
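
    For illustration, each of these working tables follows roughly this pattern (the table and column names here are invented, not our real ones):

    -- Hypothetical sketch of one intermediate working table and its lifecycle.
    CREATE TABLE dbo.CalcWorking
    (
        SPID   int            NOT NULL,  -- @@SPID of the session using the rows
        ItemID int            NOT NULL,
        Amount decimal(19, 4) NOT NULL
    );

    -- a) empty (for safety)
    DELETE FROM dbo.CalcWorking WHERE SPID = @@SPID;

    -- b) fill
    INSERT INTO dbo.CalcWorking (SPID, ItemID, Amount)
    SELECT @@SPID, ItemID, Amount
    FROM dbo.SourceData;               -- hypothetical source table

    -- c) use (updates, joins into the tables holding the final results, etc.)
    UPDATE w
    SET Amount = w.Amount * 1.1
    FROM dbo.CalcWorking AS w
    WHERE w.SPID = @@SPID;

    -- d) empty again
    DELETE FROM dbo.CalcWorking WHERE SPID = @@SPID;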

    Inevitably, we're getting deadlocking problems under medium to heavy load. We had hoped that the SPID column would allow us to avoid this. We are therefore considering getting rid of these "normal" tables and instead using # tables of the same schema (but without the SPID column, of course). This would surely (?) eliminate the deadlocking we've been seeing, but I'm worried about other types of performance problems that might result, given that # tables are created in tempdb. The number of rows used by any single process is generally quite low, though (e.g. up to 100 perhaps).

    Does anybody have experience of this sort of thing and any advice to offer? At the moment, we're just considering giving it a go to see what happens.

    Thanks in advance. Julian.

  • In my opinion, it is a good idea to use temp tables instead of permanent tables to avoid the deadlocking issue.

    Make sure that tempdb has enough data files, all of the same size and on separate physical disks, to avoid allocation contention.

  • Based on what you describe, yeah, you're better off with temp tables. That said, how were the physical tables indexed? Did you put a clustered index on the SPID column? There's a good chance that with the right kind of clustered index you might eliminate the issue.
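
    For example (the table and column names are invented), the kind of clustered index I mean would be something like this, so that each session's rows are stored together:

    -- Hypothetical: cluster on SPID first so a session's rows sit on the same pages,
    -- with a second column to make the key more selective.
    CREATE CLUSTERED INDEX CIX_CalcWorking_SPID
        ON dbo.CalcWorking (SPID, ItemID);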

    If you do go with temp tables, the concern is tempdb because that is where they will go, even if you have enough memory. So you'd need to make sure you went through standard optimization of the tempdb.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks for your replies.

    We have an index on the SPID column (of course!) but not a clustered one. I was concerned that this might make the situation worse, given that the value of SPID used for each set of inserted data will essentially be random (i.e. it may be lower or higher than previous / existing values). However, I will give this a go, and if that doesn't improve matters greatly, I'll try the temp table approach.

    The problem with this last idea is, as you say, tempdb. Unfortunately, the application is always deployed on servers over which we have little control. When you say "standard optimization of the tempdb", what do you mean? Our customers rarely have additional disks for splitting out tempdb. We can probably get it pre-sized, but that might just about be it!

    Julian.

  • There are several white papers available from Microsoft on how to configure tempdb, but from the sounds of things that might not help much, given how little control you have over the servers. Pre-sizing is good, but on its own it might not be enough.

  • julian.fletcher (7/21/2011)

    We have an index on the SPID column (of course!) but not a clustered one. I was concerned that this might make the situation worse, given that the value of SPID used for each set of inserted data will essentially be random.

    I believe you're correct. Clustering on the SPID could make the problems much worse for inserts because of the relatively low cardinality and the number of page splits that I think clustering the SPID would cause.

    It would be interesting to learn why you need so many stored procs. It almost sounds as if you're processing one row at a time, and that would be a real problem, especially if you're using something like a "sequence table" to create IDs.

    The shift to temp tables would almost certainly eliminate your deadlock problems, but that would probably be just the first step I'd take to get out of the proverbial woods. I'd follow that up with a re-examination of the business requirements and design a less contentious method of accomplishing them. Make no mistake about it, that's usually not an easy thing to do, but it's almost always more than worth it.

    You don't happen to work for ACN, do you?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


  • Why do we need so many SPs? I hope it's because we're doing some fairly complicated calculations, the details of which I won't bore you with. Suffice to say, however, the business requirements are non-negotiable!

    We're certainly not doing any significant amounts of RBAR - pretty much everything is set-based and it all whizzes along at a very satisfactory speed with just the one user. (However, we do have problems with the time taken to compile all those SPs - we get the "first run slow, subsequent runs fast" effect. We have implemented all the suggestions I can find in order to minimise SP recompilation.)

    ACN? No - software company in Oxford. Can't find how to update my profile to that effect.

  • I'll run this up the flagpole - it might not be relevant.

    You said you're doing "complicated and convoluted calculations". Might you be using UDFs for that? If so, you might want to run a trace to see what the query engine is doing under the hood, because you may be getting RBAR without knowing it. Here's an article that does a good job of explaining it: http://www.databasejournal.com/features/mssql/article.php/3845381/T-SQL-Best-Practices--Dont-Use-Scalar-Value-Functions-in-Column-List-or-WHERE-Clauses.htm.

    If that is an issue for you, CLR or inline TVFs might be an alternative. Depends on which calculations you're doing.
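
    As a rough illustration (the function, table and column names below are invented), the difference is between a scalar UDF in the column list and an inline TVF applied with CROSS APPLY:

    -- Hypothetical scalar UDF in the column list: the engine calls it once per
    -- row, which is effectively hidden RBAR.
    SELECT o.OrderID,
           dbo.fn_CalcDiscount(o.OrderID) AS Discount
    FROM dbo.Orders AS o;
    GO

    -- A rough inline TVF alternative: the function body is expanded into the
    -- outer query, so the whole thing stays set-based.
    CREATE FUNCTION dbo.tfn_CalcDiscount (@OrderID int)
    RETURNS TABLE
    AS
    RETURN
    (
        SELECT SUM(od.Quantity * od.UnitPrice) * 0.05 AS Discount
        FROM dbo.OrderDetails AS od
        WHERE od.OrderID = @OrderID
    );
    GO

    SELECT o.OrderID, d.Discount
    FROM dbo.Orders AS o
    CROSS APPLY dbo.tfn_CalcDiscount(o.OrderID) AS d;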

    Rich

  • Yes, we've got plenty of UDFs, but we're slowly eliminating any which access data in tables, and I'm pretty sure that any which do aren't involved in the calculations in question.

  • Update. I made a copy of the original database and then developed the copy to use # tables instead of 'real' tables for much of the calculation. (That took a day or so.) I then ran five simultaneous instances of "do some calculation" on each database.

    Original: 2 completed within about 4 minutes, 3 deadlocked.

    New, 'improved' version: all 5 completed, but took 30 to 40 minutes!

    I haven't had time to work out exactly what's going on in the 'improved' version, but will let you know as and when I find anything interesting or illuminating.

  • Indexes on your temp tables?

    Optimised TempDB setup?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • Yes, all the same as were on the original 'real' tables.

    None at all - any recommendations would be very gratefully received: pre-sizing, autogrowth settings, file location?

  • Pre-size it based on how big it gets in test runs. Use sensible autogrow settings for the size (e.g. don't have a 10GB file growing in 2MB increments). Put the files on a fast set of drives, separate from other data or log files (RAID 10 is best here), and start with the number of data files = 1/4 * the number of CPU cores.
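
    As a very rough sketch (the sizes, growth increments and file path below are placeholders - adjust them to what your test runs show):

    -- Pre-size the existing tempdb files and set sensible autogrow increments.
    ALTER DATABASE tempdb
        MODIFY FILE (NAME = tempdev, SIZE = 4GB, FILEGROWTH = 512MB);

    ALTER DATABASE tempdb
        MODIFY FILE (NAME = templog, SIZE = 1GB, FILEGROWTH = 256MB);

    -- Add further data files (roughly 1 per 4 CPU cores), all the same size.
    ALTER DATABASE tempdb
        ADD FILE (NAME = tempdev2, FILENAME = 'T:\tempdb2.ndf', SIZE = 4GB, FILEGROWTH = 512MB);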

    You have used temp tables (#tbl) not table variables (@tbl)?

    Gail Shaw

  • Yes, most of these were my next steps. Unfortunately, however, I can't get the files on separate drives (let alone fast ones). And yes, I implemented # tables, not @ tables; many are used by separate stored procedures (i.e. A calls B calls C etc., all accessing the same # table) and I also wanted to keep the indexes we had on the original 'normal' tables. (The existing code base already makes use of some # tables and some @ tables.)
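
    (To illustrate the nesting - the procedure, table and column names here are made up - a # table created in an outer procedure, together with its indexes, is visible to the procedures it calls:)

    -- Hypothetical sketch: #CalcWorking created in A is visible inside B.
    CREATE PROCEDURE dbo.B
    AS
        -- Reads and writes the caller's #CalcWorking, using its indexes.
        UPDATE #CalcWorking SET Amount = Amount * 1.1;
    GO

    CREATE PROCEDURE dbo.A
    AS
        CREATE TABLE #CalcWorking (ItemID int NOT NULL, Amount decimal(19, 4) NOT NULL);
        CREATE CLUSTERED INDEX CIX_CalcWorking ON #CalcWorking (ItemID);

        INSERT INTO #CalcWorking (ItemID, Amount) VALUES (1, 100.00);
        EXEC dbo.B;   -- B sees and updates the same #CalcWorking
    GO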

    I'll see if changes to the tempdb improve matters.

    Thanks for your suggestions.
