September 12, 2016 at 5:31 am
Hi, I've got a strange issue where a cached execution plan has been lost following a reboot and the subsequent execution plan is (in my opinion) sub optimal.
A little more detail:
I have a (potentially) large table that needs filtering down based on several complex and unrelated criteria.
Therefore I've created a Temp Table ( CREATE TABLE #Filter (filterID BIGINT NOT NULL PRIMARY KEY) ) and I populate it with the Primary Key value of the large table for each criteria.
The result is a 5000 row table that I join to the large table and thus return only the rows I'm interested in.
5000 rows is about 10% of the larger table in my environment but the larger table is more likely to be many hundreds of thousands if not millions of rows, where as the 5000 is pretty constant in production. In my environment I've experimented with reducing the number of rows in the Temp Table to as low as 10 and it still has no affect.
The creation and population of the Temp Table and final query are all in 1 stored proc.
The first step in the execution plan of the final query used to be the joining of the two tables, causing an CLUSTERED INDEX SEEK on the larger table.
Following the reboot it performs a CLUSTERED INDEX SCAN on the larger table and doesn't join to the Temp Table until it's joined to some other tables that are required in the result set but will not result in reducing the results.
I have tried joining the two tables together using an INNER LOOP JOIN, and that does force it to consider the temp table earlier but it's still doing a CLUSTERED INDEX SCAN on the larger table first.
I have tried moving the Temp Table into the WHERE clause as an IN() constraint but even that still scanned the larger table first.
Statistics are upto date, stored proc has been recompiled, cache has been cleared but it always "ignores" the filter that I'm trying to create with my Temp Table.
Ideally I don't want to rewrite the query because it's been in production unchanged for years, but if I must then I must, but I really want to know why has it stopped using the Temp Table to force the larger table into a Clustered Index Seek.
Any guidance is greatfully received.
Thanks
Giles
September 12, 2016 at 5:42 am
did you try selecting from the temp table, and left joining the larger table?
from your description, it sounds like your first table is the larger table, and you inner join to that;
i think you'll get a better plan if the temp table is first, and the other data joins to it.
Lowell
September 12, 2016 at 6:01 am
Worked a treat, thanks very much.
September 12, 2016 at 6:03 am
We use this method quite a lot for breaking down processing queries running in production, so that the number of rows processed is low enough to avoid locks escalating to a table lock.
Try this.
1. Create the clustered index on the #temp table after populating it.
2. Don't explicitly drop the #temp table
3. Put the #temp table first in the FROM list and use a join hint to encourage a plan which begins with the #temp table joining next to your chosen table, like this:
FROM #temp
INNER loop JOIN BigTable
Note that this doesn't guarantee that #temp will be read first followed by BigTable, it's merely strong encouragement for the optimiser to use this path.
Experiment with the join hint (loop/hash/merge) to establish which type works fastest (minimum reads), and understand why. A nested loops join will benefit from a very specific index on BigTable, hash join is far less particular but will benefit from an index which reduces the number of rows which have to be scanned. If you want to force a merge join without a sort, you will require an index on BigTable which matches the join, and a clustered index on #temp which has the same keys in the same order.
Be wary of data caching, either clear the cache of clean data or take the fastest run of three.
Experiment with which table is joined to #temp.
Use OPTION(RECOMPILE) whilst experimenting. Any changes to the query will recompile it but changes to your temp table might not. You might not need OPTION(RECOMPILE) when you've finished experimenting, test this too.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 12, 2016 at 6:04 am
When you join two tables, the optimizer should know which one is the smaller. It doesn't have to follow the sequence in which the tables were listed in the FROM... JOIN... clauses.
Does it?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
September 12, 2016 at 6:20 am
The Dixie Flatline (9/12/2016)
When you join two tables, the optimizer should know which one is the smaller. It doesn't have to follow the sequence in which the tables were listed in the FROM... JOIN... clauses.Does it?
No, it doesn't have to follow the sequence - even if you employ a join hint like I suggest above. The join hint is just a strong recommendation for the optimiser to start with the #temp table then join to the hinted table. If your suggestion would result in a really rubbish plan, the optimiser will shift those two tables to elsewhere in the plan, but they will still be read in the specified order, and eventually joined by the specified hint - even if a third table comes in between.
The optimiser should know which table is the smaller but temp tables aren't as straightforward as folks might expect.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 12, 2016 at 6:22 am
Don't explicitly drop the #temp table
Hi Chris. Can you explain a little how not dropping the temp table affects things, please?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
September 12, 2016 at 6:27 am
Phil Parkin (9/12/2016)
Don't explicitly drop the #temp table
Hi Chris. Can you explain a little how not dropping the temp table affects things, please?
From memory Phil, if you explicitly drop the table, that's it. If you don't and the temp table is "implicitly" dropped by change of context such as stored procedure completion, then a reusable structure remains in tempdb.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 12, 2016 at 6:44 am
ChrisM@Work (9/12/2016)
Phil Parkin (9/12/2016)
Don't explicitly drop the #temp table
Hi Chris. Can you explain a little how not dropping the temp table affects things, please?
From memory Phil, if you explicitly drop the table, that's it. If you don't and the temp table is "implicitly" dropped by change of context such as stored procedure completion, then a reusable structure remains in tempdb.
Hmm, we need to do some research then. My understanding is that explicit drops no longer actually happen and they are still deferred-dropped.
Also note that stats on temp tables don't behave as we would expect either. Paul White did an awesome article on that (maybe two of them) some time ago that was rather mind-blowing.
http://sqlblog.com/blogs/paul_white/archive/2012/08/17/temporary-object-caching-explained.aspx
http://sqlblog.com/blogs/paul_white/archive/2012/08/15/temporary-tables-in-stored-procedures.aspx
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
September 12, 2016 at 6:57 am
TheSQLGuru (9/12/2016)
ChrisM@Work (9/12/2016)
Phil Parkin (9/12/2016)
Don't explicitly drop the #temp table
Hi Chris. Can you explain a little how not dropping the temp table affects things, please?
From memory Phil, if you explicitly drop the table, that's it. If you don't and the temp table is "implicitly" dropped by change of context such as stored procedure completion, then a reusable structure remains in tempdb.
Hmm, we need to do some research then. My understanding is that explicit drops no longer actually happen and they are still deferred-dropped.
Also note that stats on temp tables don't behave as we would expect either. Paul White did an awesome article on that (maybe two of them) some time ago that was rather mind-blowing.
http://sqlblog.com/blogs/paul_white/archive/2012/08/17/temporary-object-caching-explained.aspx
http://sqlblog.com/blogs/paul_white/archive/2012/08/15/temporary-tables-in-stored-procedures.aspx
Thanks for hopping in Kevin, I'm familiar with both of those articles but I'm only really familiar with the one describing wayward plan reuse (quoted above). Paul specifies a stats update on the temp table, we create the index after populating the table because for most of our processes, a clustered index helps performance (and we always measure).
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 15, 2016 at 2:02 am
TheSQLGuru (9/12/2016)
ChrisM@Work (9/12/2016)
Phil Parkin (9/12/2016)
Don't explicitly drop the #temp table
Hi Chris. Can you explain a little how not dropping the temp table affects things, please?
From memory Phil, if you explicitly drop the table, that's it. If you don't and the temp table is "implicitly" dropped by change of context such as stored procedure completion, then a reusable structure remains in tempdb.
Hmm, we need to do some research then. My understanding is that explicit drops no longer actually happen and they are still deferred-dropped.
Also note that stats on temp tables don't behave as we would expect either. Paul White did an awesome article on that (maybe two of them) some time ago that was rather mind-blowing.
http://sqlblog.com/blogs/paul_white/archive/2012/08/17/temporary-object-caching-explained.aspx
http://sqlblog.com/blogs/paul_white/archive/2012/08/15/temporary-tables-in-stored-procedures.aspx
From Kevin's first link: "Dropping a temporary table in a procedure does not count as DDL, and neither does TRUNCATE TABLE, nor UPDATE STATISTICS. None of these things prevent temporary table caching (so it does not matter whether you explicitly drop a temporary table at the end of a procedure or not)."
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply