June 15, 2009 at 3:02 am
Ewald Cress (6/14/2009)
Gail, this is a shot in the dark, but have a look at the relationship between the actual (17134) and estimate (5623). This is roughly 30%, which corresponds to the selectivity estimate for non-equijoins when there are no statistics.
Yeah, but the question is where SQL came up with the 5623 in the first place. At the point where that query is compiling, SQL cannot know exactly how many rows will be in the resultset. If it could, there would be no need for an estimated row count; it would know the actual.
Also, if you change the value in the where clause predicate, the estimate stays the same (despite the plan being recompiled) while the actual rows change.
declare @t table (ID int )
INSERT INTO @t (ID)
SELECT top (100000) a.column_id
from master.sys.columns a cross join master.sys.columns b
-- EstimateRows is 5623, Actual is 17134
select * from @t where ID = 1
option (recompile)
-- EstimateRows is 5623, Actual is 9226
select * from @t where ID = 5
option (recompile)
-- EstimateRows is 5623, Actual is 1977
select * from @t where ID = 25
option (recompile)
No one I've spoken with knows exactly where the estimate is coming from. It's not the 30% of the table that's used with inequalities when an estimate can't be made; that would be 30 000 rows. Someone did point out that the estimates (which change as the total number of rows in the table variable changes) are approximately (total rows)^0.75, though I haven't done the calculations myself to verify that.
The bottom line is that the estimated row count is based on some form of calculation. The only thing the optimiser knows at that point is the total number of rows in the table variable (which it gets from the storage engine, since there are no distribution statistics).
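For what it's worth, a quick back-of-the-envelope check of how close that approximation comes for the 100 000-row example above (just arithmetic, not anything from the optimiser's code):

-- (total rows)^0.75 for the 100 000-row table variable in the example above
SELECT POWER(CAST(100000 AS float), 0.75) AS ApproxEstimate  -- ~5623.4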
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 15, 2009 at 7:08 am
GilaMonster (6/15/2009)...No one I've spoken with knows exactly where the estimate is coming from. It's not the 30% of the table that's used with inequalities when an estimate can't be made; that would be 30 000 rows. Someone did point out that the estimates (which change as the total number of rows in the table variable changes) are approximately (total rows)^0.75, though I haven't done the calculations myself to verify that.
In this case the exponent is closer to 0.886.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
June 15, 2009 at 7:23 am
RBarryYoung (6/15/2009)
GilaMonster (6/15/2009)...No one I've spoken with knows exactly where the estimate is coming from. It's not the 30% of the table that's used with inequalities when an estimate can't be made; that would be 30 000 rows. Someone did point out that the estimates (which change as the total number of rows in the table variable changes) are approximately (total rows)^0.75, though I haven't done the calculations myself to verify that.
In this case the exponent is closer to 0.886.
I did say that I didn't check the calc.
Regardless of how it's calced, it's some magic number somewhere in the code of the optimiser.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 15, 2009 at 7:30 am
I checked it only because you said that you hadn't yet. Sorry, just trying to be helpful. 🙂
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
July 8, 2009 at 10:11 am
Excellent article. I particularly like the "So, what should I use?" section.
I had a case where I had a process that used all table variables, and to see if I could get better performance I switched it to all temp tables. This process had some built-in timing metrics so I could see how each "section" ran. What I found was kind of startling. The first part of the process ran VERY fast with table variables but took a HUGE hit on timing in the later parts. When run with temp tables, the first part's performance was significantly worse than with table variables but was very good in the later parts. What I ended up doing was copying the data from the table variable to a temp table at the point performance started to dump. Even with this copy action the process ran about 4 times faster overall.
When dealing with fairly large record sets I often try it both ways just to see what is REALLY best for the process; it's usually little more than a find/replace operation to switch from one to the other.
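In rough outline, the switch-over looked something like this (object and column names here are made up purely for illustration, not the real process):

DECLARE @Stage TABLE (OrderID int PRIMARY KEY, Amount money)

-- early part of the process: fast against the table variable
INSERT INTO @Stage (OrderID, Amount)
SELECT OrderID, Amount
FROM dbo.SourceOrders
WHERE ProcessedDate IS NULL

-- at the point performance started to dump, copy across to a temp table
-- so the later, heavier joins get real statistics
SELECT OrderID, Amount
INTO #Stage
FROM @Stage

CREATE CLUSTERED INDEX IX_Stage ON #Stage (OrderID)

-- later parts of the process join against #Stage instead of @Stage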
CEWII
July 9, 2009 at 11:59 am
Very Nice Article.
I have one doubt about table variables.
Inside one of my SPs, I am storing values in a table variable to loop through the records.
If the number of records inserted into the table variable is too high (maybe in the millions), does it degrade my SP's performance? Is it better to use a temp table?
Thanks
July 9, 2009 at 12:05 pm
Wish (7/9/2009)
Does it degrade my SP's performance? Is it better to use a temp table?
Probably and maybe.
Test the proc with a table variable, see how it runs. Test the proc with a temp table, see how it runs.
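Something as simple as this around each version will show the difference (dbo.MyProc is just a placeholder for your own procedure name):

SET STATISTICS IO ON
SET STATISTICS TIME ON

EXEC dbo.MyProc  -- run once with the table variable version, once with the temp table version

SET STATISTICS TIME OFF
SET STATISTICS IO OFF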
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 9, 2009 at 5:08 pm
It *could* degrade performance. I would try it and see. As far as I can tell, there is no REAL hard and fast rule on whether to go one way or the other. The fact that a table variable isn't affected by a rollback is convenient for logging tables within a sproc, since my last step usually writes them out to the database log table.
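A quick sketch of that rollback behaviour, if you want to see it for yourself:

DECLARE @log TABLE (Msg varchar(100))
CREATE TABLE #log (Msg varchar(100))

BEGIN TRAN
    INSERT INTO @log (Msg) VALUES ('written inside the transaction')
    INSERT INTO #log (Msg) VALUES ('written inside the transaction')
ROLLBACK

-- the table variable keeps its row; the temp table insert is rolled back
SELECT COUNT(*) AS TableVariableRows FROM @log  -- 1
SELECT COUNT(*) AS TempTableRows FROM #log      -- 0

DROP TABLE #log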
CEWII
July 9, 2009 at 5:24 pm
Wish (7/9/2009)
Inside one of my SPs, I am storing values in a table variable to loop through the records. If the number of records inserted into the table variable is too high (maybe in the millions), does it degrade my SP's performance? Is it better to use a temp table?
Unless you need some behaviour specific to table variables, a temporary table is most often a better choice.
Though it must be asked: why on earth would you have millions of records in a temporary object? :blink:
And what's all this about 'looping'? If this indicates use of a cursor, or cursor-like behaviour, there is almost always a better set-based solution.
I don't want to open a can of worms here, but looping through millions of records in a temporary object sounds like a deeply sub-optimal idea...
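For instance, a row-by-row loop that builds up totals can usually be collapsed into a single statement along these lines (table and column names invented purely for illustration):

-- one set-based pass instead of a loop over each order
UPDATE t
SET t.TotalDue = s.TotalDue
FROM dbo.OrderTotals AS t
JOIN (SELECT OrderID, SUM(LineAmount) AS TotalDue
      FROM dbo.OrderLines
      GROUP BY OrderID) AS s
  ON s.OrderID = t.OrderID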
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
July 10, 2009 at 2:30 am
Very good article! Thanks.
September 23, 2009 at 2:15 pm
I ran some of the speed tests using the SQL code from the article, and I do not see much difference in performance (time-wise) in the queries... but the logical reads are WAY up for the table variable join, and CPU time is almost double for the table variable. I can see how using a table variable in multiple joins can cause problems.
SERVER #1:
-----------
Table 'SalesOrderHeader'. Scan count 5, logical reads 130, physical reads 1, read-ahead reads 43, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#SalesOrderDetail___________________________________________________________________________________________________000000000074'. Scan count 5, logical reads 12330, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 2391 ms, elapsed time = 48931 ms.
select with temporary table: 50,576 ms
********************************************************************************
Table 'SalesOrderHeader'. Scan count 0, logical reads 764850, physical reads 119, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#60CC1BAA'. Scan count 1, logical reads 12330, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4375 ms, elapsed time = 23606 ms.
select with table variable: 23,616 ms
SERVER #2:
-----------
Table 'SalesOrderHeader'. Scan count 5, logical reads 130, physical reads 1, read-ahead reads 43, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#SalesOrderDetail___________________________________________________________________________________________________000000000074'. Scan count 5, logical reads 12330, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 2734 ms, elapsed time = 204196 ms.
select with temporary table: 204,656 ms
********************************************************************************
Table 'SalesOrderHeader'. Scan count 0, logical reads 764850, physical reads 119, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#0BDB394A'. Scan count 1, logical reads 12330, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4687 ms, elapsed time = 207662 ms.
select with table variable: 207,656 ms
SERVER #3:
-----------
Table 'SalesOrderHeader'. Scan count 9, logical reads 130, physical reads 1, read-ahead reads 43, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#SalesOrderDetail___________________________________________________________________________________________________000000000005'. Scan count 9, logical reads 12330, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1866 ms, elapsed time = 5491 ms.
select with temporary table: 5,783 ms
********************************************************************************
Table 'SalesOrderHeader'. Scan count 0, logical reads 764850, physical reads 119, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#173876EA'. Scan count 1, logical reads 12330, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 3120 ms, elapsed time = 7717 ms.
select with table variable: 7,710 ms
Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)
September 23, 2009 at 3:07 pm
As far as logical reads of a table variable go: keep in mind that if it's not too big it will probably reside fully in memory, and the reads are much cheaper. Also keep in mind that you can create a primary key when the table variable is declared, which helps a lot. But either way, do what performs best for YOU.
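For example:

-- primary key (clustered index) declared inline with the table variable
DECLARE @Orders TABLE
(
    OrderID int NOT NULL PRIMARY KEY CLUSTERED,
    Amount money NOT NULL
)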
CEWII
September 23, 2009 at 4:22 pm
Elliott W (9/23/2009)
As far as logical reads of a table variable go: keep in mind that if it's not too big it will probably reside fully in memory, and the reads are much cheaper. Also keep in mind that you can create a primary key when the table variable is declared, which helps a lot. But either way, do what performs best for YOU. CEWII
Just to clarify: if either the table variable or the temporary table is small enough, both can reside in memory.
Keep in mind that when the optimizer looks at the query, since it can't build statistics for table variables, it assumes that there is 1 record. Obviously, this can give you a bad query plan.
Personally, I've noticed that performance with table variables sinks real low when you start joining multiple tables to the query.
But either way, do what performs best for YOU
Absolutely. And test it against a representative data sample that you will be seeing in production.
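For instance, something along these lines (assuming the AdventureWorks tables used in the article) will show the fixed low estimate on the table variable side of the join, if you compare estimated vs. actual rows in the actual execution plan:

DECLARE @soh TABLE (SalesOrderID int PRIMARY KEY)

INSERT INTO @soh (SalesOrderID)
SELECT SalesOrderID
FROM Sales.SalesOrderHeader

-- estimated rows on @soh stay low no matter how many rows it really holds
SELECT d.SalesOrderID, d.OrderQty
FROM @soh AS s
JOIN Sales.SalesOrderDetail AS d ON d.SalesOrderID = s.SalesOrderID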
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
September 23, 2009 at 8:30 pm
WayneS, I bring up an earlier post of mine to illustrate testing it:
I had a case where I had a process that used all table variables, and to see if I could get better performance I switched it to all temp tables. This process had some built-in timing metrics so I could see how each "section" ran. What I found was kind of startling. The first part of the process ran VERY fast with table variables but took a HUGE hit on timing in the later parts. When run with temp tables, the first part's performance was significantly worse than with table variables but was very good in the later parts. What I ended up doing was copying the data from the table variable to a temp table at the point performance started to dump. Even with this copy action the process ran about 4 times faster overall.
When dealing with fairly large record sets I often try it both ways just to see what is REALLY best for the process; it's usually little more than a find/replace operation to switch from one to the other.
I found by testing exactly how to make MY code perform best. There are a fair number of elements in SQL that CAN perform really well for you given a particular set of circumstances and are totally inappropriate in others; the bright line between those cases for temp tables vs. table variables is not particularly, uh, bright.
CEWII
October 3, 2009 at 7:25 am
Elliott W (9/23/2009)
As far as logical reads of a table variable go: keep in mind that if it's not too big it will probably reside fully in memory, and the reads are much cheaper...
There's really no difference between the two in this regard.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi