February 1, 2010 at 10:33 am
Another thing: statistics could be different between the two systems, and that could mean a different query plan is generated in production compared to UAT.
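If you want to rule that out, a quick check like this (just a sketch - run it on both servers and compare) shows when each statistics object was last updated:
-- Compare when statistics were last updated on UAT and production;
-- stale statistics on one side can explain a different plan for the same query.
SELECT OBJECT_NAME(s.object_id) AS table_name,
s.name AS stats_name,
STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
JOIN sys.objects AS o
ON o.object_id = s.object_id
WHERE o.is_ms_shipped = 0
ORDER BY last_updated;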
February 1, 2010 at 1:21 pm
Jeff Moden (1/29/2010)
I forgot about that and that's very true. Have you ever seen code that does that, though?
Another use is in auditing... since a table variable isn't affected by a rollback, you can audit what was attempted, regardless of whether it succeeded.
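For example, something like this (just a quick sketch - the action text and table name are made up): the row recorded in the table variable before the ROLLBACK is still there afterwards, so you can log the attempt even though the work itself was undone.
DECLARE @Audit TABLE
(
attempted_at DATETIME NOT NULL,
attempted_action VARCHAR(100) NOT NULL
);
BEGIN TRANSACTION;
-- record the attempt first
INSERT @Audit (attempted_at, attempted_action)
VALUES (GETDATE(), 'purge of dbo.SomeTable');
-- ... the real work goes here and fails ...
ROLLBACK TRANSACTION;
-- the table variable is unaffected by the rollback, so the attempt survives
SELECT attempted_at, attempted_action
FROM @Audit;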
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 1, 2010 at 9:00 pm
Another good idea. Thanks Wayne. Just curious though... have you ever used it that way?
--Jeff Moden
Change is inevitable... Change for the better is not.
February 2, 2010 at 12:01 am
Rule of Thumb for Normal Stuff: avoid table variables until such time as you can prove that you are having recompilation problems.
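If you need something to prove it with, a quick sketch like this against sys.dm_exec_query_stats will show which cached statements are recompiling the most:
-- plan_generation_num climbs each time a cached statement is recompiled,
-- so high values relative to execution_count point at recompilation problems
SELECT TOP (20)
qs.plan_generation_num,
qs.execution_count,
SUBSTRING(st.text, qs.statement_start_offset / 2 + 1, 100) AS statement_start
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.plan_generation_num DESC;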
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 2, 2010 at 2:47 am
The myth about the Optimizer always treating table variables as if they have one row...!!! :w00t:
So ok yes, if you ask for an 'estimated' plan from SSMS you'll see one row. Cool. This can be extremely useful in some plans with poor cardinality estimates...but that's another story.
Everyone always forgets that at execution time, the engine will know how many rows are in the table variable if anything in the query triggers a recompilation, or if OPTION (RECOMPILE) is used.
Try this:
DECLARE @OneRow
TABLE (row_id BIGINT PRIMARY KEY);
INSERT @OneRow (row_id)
SELECT TOP (150000)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM master.sys.columns C1,
master.sys.columns C2;
SELECT COUNT_BIG(*)
FROM @OneRow OR1
JOIN @OneRow OR2
ON OR1.row_id = OR2.row_id
OPTION (RECOMPILE);
That should give a MERGE JOIN plan. If you add a HASH JOIN hint, you should even get a parallel plan 🙂
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 2, 2010 at 5:45 am
Thanks for the myth busting there, although I will point out the obvious: you get the 'right' plan with temp tables without the need for OPTION (RECOMPILE). 🙂
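For anyone who wants to compare, here is the same test reworked with a real temp table (just a sketch adapted from Paul's script) - no OPTION (RECOMPILE) needed:
CREATE TABLE #OneRow (row_id BIGINT PRIMARY KEY);
INSERT #OneRow (row_id)
SELECT TOP (150000)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM master.sys.columns C1,
master.sys.columns C2;
-- statistics on the temp table give the optimizer the real row count up front
SELECT COUNT_BIG(*)
FROM #OneRow T1
JOIN #OneRow T2
ON T1.row_id = T2.row_id;
DROP TABLE #OneRow;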
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 2, 2010 at 9:02 am
Thanks for all the suggestions that everyone has given me. I have replaced the table variables with temp tables. Now it is running fine in prod in around 1 minute.
Thanks
San
February 2, 2010 at 9:26 am
Jeff Moden (2/1/2010)
Another good idea. Thanks Wayne. Just curious though... have you ever used it that way?
No, but I read about it and realized it would work pretty well... I think I read about it on this site somewhere...
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 2, 2010 at 9:29 am
san43 (2/2/2010)
Thanks for all the suggestions that everyone has given me. I have replaced the table variables with temp tables. Now it is running fine in prod in around 1 minute. Thanks
San
Good to hear that it's working better now.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 2, 2010 at 10:39 am
TheSQLGuru (2/2/2010)
Thanks for the Myth Busting there, although I will point out the obvious in that you get the 'right' plan with temp tables without the need for the OPTION (RECOMPILE). 🙂
True, but then 'real' temp tables also use locks and consume log. Horses for courses. There is no one right answer, and a lot of it comes down to personal preference. Except in this case, and many like it. LOL
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 4, 2010 at 7:07 pm
My personal rule of thumb has been to never use a table variable for more than 500 rows. This posting by the SQL Server team says never use a table variable for more than 100 rows.
http://www.sqlteam.com/article/temporary-tables
Peter
http://seattleworks.com including my blog The SQL Janitor
February 4, 2010 at 8:30 pm
Peter Samson (2/4/2010)
My personal rule of thumb has been to never use a table variable for more than 500 rows. This posting by the SQL Server team says never use a table variable for more than 100 rows.
That is not from the SQL Server team. It is merely a random article, and one which perpetuates the myth that table variables use memory and tempdb differently from regular temporary tables. Both have pros and cons - complex ones - and applying a number-of-rows rule seems overly simplistic.
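If anyone doubts the tempdb part, here is a quick sketch (the variable name is made up): declaring a table variable creates an object in tempdb just as a temporary table does - run it and look at the system-generated name at the top of the list.
-- a table variable gets a system-named user table in tempdb,
-- exactly where a temporary table would go
DECLARE @T TABLE (col1 INTEGER NOT NULL);
SELECT TOP (5)
name,
type_desc,
create_date
FROM tempdb.sys.objects
WHERE type = 'U'
ORDER BY create_date DESC;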
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 4, 2010 at 9:01 pm
I agree with Paul on this one. There is no fixed number of rows at which one works better than the other for every query. However, for any given query there is a tipping point where you will see improved performance by switching from one to the other.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
February 5, 2010 at 5:06 am
The inability to have statistics on a table variable (except for the PK, if you have one) can be a killer even if there is just ONE row in the table variable. Imagine a table with 1 billion rows that you are going to join to. The join column has 99.999% of the values as '123456' and the remaining values are all different. Now you take your one-row table variable and join on that column. If the table variable row has '123456' in it, that nested loop plan will crush your server. You want a hash/merge join there.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 5, 2010 at 5:38 am
TheSQLGuru (2/5/2010)
The inability to have statistics on a table variable (except for the PK, if you have one) can be a killer even if there is just ONE row in the table variable. Imagine a table with 1 billion rows that you are going to join to. The join column has 99.999% of the values as '123456' and the remaining values are all different. Now you take your one-row table variable and join on that column. If the table variable row has '123456' in it, that nested loop plan will crush your server. You want a hash/merge join there.
I wouldn't mind the highly biased setup if it actually demonstrated your point 😛 😀 😛
Your assumption that the QO would choose a loop join does that component a great injustice. Assuming distribution statistics are available for the large table, the QO will optimize the query quite nicely on the basis that there are very few distinct values.
In fact, for a simple SELECT * a hash join *is* produced with a table variable! Trying an aggregate over the values produces a partial aggregate and a hash join! Both run in milliseconds. If you don't want to take my word for it, run the code...
USE tempdb;
GO
CREATE TABLE dbo.Data
(
row_id INTEGER IDENTITY PRIMARY KEY,
value INTEGER NOT NULL
);
INSERT dbo.Data WITH (TABLOCK)
(value)
SELECT TOP (750000)
value = 123456
FROM master.sys.columns C1,
master.sys.columns C2,
master.sys.columns C3;
INSERT dbo.Data
(value)
VALUES (95);
DECLARE @OneRow
TABLE (
row_id INTEGER IDENTITY PRIMARY KEY,
value INTEGER NOT NULL
);
INSERT @OneRow VALUES (123456);
-- Also try MAX(D.value), AVG(D.value) and so on instead of the SELECT star
SELECT *
FROM @OneRow R
JOIN dbo.Data D
ON D.value = R.value;
DROP TABLE dbo.Data;
Once again with feeling: the choice between table variables and temporary tables involves some complex considerations. Rules of thumb generally suck (quite pleased with the imagery there!)
Cheers
Paul
P.S. Answers on a postcard for the business need behind the join described, with a SELECT * output 😀
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi