November 27, 2014 at 4:38 am
I have 2 tables,
1. Need
2. NeedCategory
Need table has clusted index on needid column and NeedCategory have composite clustered index on needid and categoryid.
Now take a look on following query and execution plan for the query.
SELECT N.NeedId,N.NeedName,N.ProviderName
FROM dbo.Need N
JOIN dbo.NeedCategory NC
ON nc.NeedId = n.NeedId
WHERE IsActive=1
AND CategoryId= 2
ORDER BY NeedName
* Clustered index scan on need table is happens for Isactive= 1.
* Clustered index scan on needcategory table is happens for CategoryId=2
My question is,
1. Why scan happens before the join occurs? if it happens after join then the filter would be lighter. Even if optimizer chooses the scan to execute first.
2. Is there any chance to rearrange the execution plan manually?
Thanks In Advance
November 27, 2014 at 5:02 am
You've probably got no useful indexes for that query, so in the absence of indexes to support the WHERE and JOIN, SQL has to scan the table. The optimiser would have calculated the cheapest plan and this was it.
In this case, both tables were scanned once. The merge join is efficient when the two resultsets are in the order of the join column. The plan you're asking for (nested loop join, scan one, use that to access the other) would have resulted in the Need table being scanned multiple times, which is not efficient.
You can make some changes to the plan with query hints. Unless you know exactly why the plan you got is bad, exactly what the hints are going to do and are absolutely, 100% certain that you know better than the query optimiser, stay away from such hints.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 27, 2014 at 5:05 am
You can' do an index seek on Need because Active is not in the index.
The query optimizer has determined that the most effective plan is to do an index scan and merge join, rather than an index seek and rowid lookup. The query optimizer uses the statistics to estimate the plan. If you think is is wrong you could regenerate the statistics.
If this is a simplified query to post the question, the real query may be a victim of 'parameter sniffing' where the cached plan was based upon some criteria and the actual query is significantly different.
For example, in you above example, if the plan was cached using an ID that returned 800 records from a 1000 record table, the optimised plan would be different than for a query that returned 3 records from the same 1000 record table. If the real life query takes parameters and the data distribution is not equal, then it is not possible to generate a plan that is optimal for all possible values.
November 27, 2014 at 5:07 am
vignesh.ms (11/27/2014)
...1. Why scan happens before the join occurs? if it happens after join then the filter would be lighter. Even if optimizer chooses the scan to execute first....
You have to read the data before you can do anything with it. You can't join first because you have nothing to join.
Grant Fritchey has an excellent eBook "SQL Server Execution Plans" which will answer many of your questions.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
November 27, 2014 at 6:06 am
SELECT N.NeedId,N.NeedName,N.ProviderName
FROM dbo.Need N
JOIN dbo.NeedCategory NC
ON nc.NeedId = n.NeedId
For this query also index scan occurs for the lead table, even though need has CI on need id column
http://www.sqlservercentral.com/Forums/Attachment16536.aspx%5B/img%5D
Why it chooses to go with the Index scan ?
is there any thing I required to configure ?
Suggest me any fine tuning tuts...
November 27, 2014 at 6:10 am
Same answer as I gave you before.
In this case, both tables were scanned once. The merge join is efficient when the two resultsets are in the order of the join column. The plan you're asking for (nested loop join, scan one, use that to access the other) would have resulted in the Need table being read multiple times, which is not efficient.
Which sounds efficient:
Read the Need table once (current plan), this means reading each page once and only once
Read the Need table once for each row in NeedCategory (what you're asking for), each one reading 2-3 pages and returning one row
?
As for books, the one Chris recommended.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 27, 2014 at 10:15 am
You need to have indexes in place that will give the optimizer something to work with. As it is, the only option it has is to scan the tables.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply