September 28, 2011 at 10:08 am
Hi all!
I'm working on wrapping my mind around indexes (reading the wonderful Stairway series), and I've got a question: if I'm using a temp table that will only be written to once, and I want to index it, and it's being JOINed to by three other tables on different keys, would I want ONE index, or THREE?
The reason I ask is, if field order is important in indexes, whichever field I put first will throw the other two out of order, so the index will only help the JOIN based on that first field, and not the other two JOINs... right? It sounds like I would want three different index in this case, with each index starting with the field I'm joining to a different table on, and then adding any other fields I want to be covered.
Thanks in advance!
~Zeb
September 28, 2011 at 10:15 am
You're right. If you have three very different queries (pulling off different columns completely), then you should usually have 3 separate covering indexes to support them.
As with all things though, this depends on the temp table being used, the data inside it, data types, how many columns each query pulls back, how many rows you're expecting back from each, etc...you will need to try it both ways.
September 28, 2011 at 10:24 am
Awesome, thank you! It sounded logical, but I wanted to make sure I wasn't missing something.
Thank you very much!
September 28, 2011 at 10:26 am
Do you have 3 queries that each filter/join on one column, or do you have one query that filters/joins on three columns?
For the former, three indexes, for the latter, one index with 3 columns.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 28, 2011 at 10:28 am
Heh, I was just looking at that!
This is part of my query:
FROM #EBuyers AS B
INNER JOIN #EVendors AS V
ON V.BuyerNumber = B.BuyerNumber
INNER JOIN tblProduct AS P WITH (NOLOCK)
ON P.VendorNumber = V.VendorNumber
INNER JOIN tblWarehouseProduct AS WP WITH (NOLOCK)
ON WP.ProductNumber = P.ProductNumber
AND WP.WarehouseNumber = V.WarehouseNumber
AND WP.ProductAvailabilityFlag = 'A'
You can see that #EVendors joins different tables on BuyerNumber, VendorNumber, and WarehouseNumber. So you're saying ONE index? Why?
September 28, 2011 at 10:35 am
One index. Because SQL is not going to use three different structures and then have to patch them together unless it really has no other choice. It can do index intersections, they're not cheap and SQL may just ignore them and scan the table.
You want one index with (BuyerNumber, VendorNumber, WarehouseNumber). Which order, that you'll have to test. Or create multiple indexes with all three columns in different orders to test which one SQL uses. Just to test, not permanent .
p.s. Do you know what nolock does? It's not a 'run faster' switch...
See - http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx
p.p.s. Filters belong in the WHERE clause, not the FROM clause. (unless doing outer joins and needing the filter before the join)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 28, 2011 at 10:37 am
That would likely be best handled by one index unless you have some really odd where clauses or selects:
CREATE NONCLUSTERED INDEX idxEVendors_Z1 ON #EVendors
(BuyerNumber, VendorNumber, WarehouseNumber)
INCLUDE (Whatever COLUMNS ARE IN your SELECT)
Something along those lines.
September 28, 2011 at 10:45 am
GilaMonster (9/28/2011)
One index. Because SQL is not going to use three different structures and then have to patch them together unless it really has no other choice. It can do index intersections, they're not cheap and SQL may just ignore them and scan the table.You want one index with (BuyerNumber, VendorNumber, WarehouseNumber). Which order, that you'll have to test. Or create multiple indexes with all three columns in different orders to test which one SQL uses. Just to test, not permanent .
p.s. Do you know what nolock does? It's not a 'run faster' switch...
See - http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx
p.p.s. Filters belong in the WHERE clause, not the FROM clause. (unless doing outer joins and needing the filter before the join)
OK, so figure out which field will give the most bang for the buck (or try all three and let SQL figure it out via the execution plan), and use that for this type of joining. Gotcha.
These are datamart tables, loaded once per day, so it's all about not blocking, and I'm not worried about a dirty read
On the filters in the JOIN... dare I ask, does it matter? What is the impact??
(I LOVE getting to ask questions and get answers! Thank you!)
October 4, 2011 at 10:13 pm
This is a question for Gail:
p.p.s. Filters belong in the WHERE clause, not the FROM clause. (unless doing outer joins and needing the filter before the join)
I was always instructed to place as most "criteria" in the JOINS rather than in the WHERE clause because I was told it would filter out unwanted overhead from the get-go). Until now, I guess I've always just took it as "they know better" and I've never actually bothered to ask the question or reseach it...
Shouldn't you always want to filter out as much data as possible in multiple table JOINs before even hitting the WHERE clause? Does the JOIN impact which query plan will be chosen as much as the fields placed in the WHERE clause?
Thanks in advance for clarifying.
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
October 5, 2011 at 1:02 am
MyDoggieJessie (10/4/2011)
Shouldn't you always want to filter out as much data as possible in multiple table JOINs before even hitting the WHERE clause?
You're assuming the join clause runs before the where. That assumption is wrong.
Does the JOIN impact which query plan will be chosen as much as the fields placed in the WHERE clause?
The optimiser is smart. The following 2 queries will have exactly the same execution plan (exactly the same in every way)
SELECT <stuff>
FROM tbl1 inner join tbl2 on tbl1.ID = tbl2.SomeID
WHERE tbl2.AnotherColumn = 'SomeValue'
SELECT <stuff>
FROM tbl1 inner join tbl2 on tbl1.ID = tbl2.SomeID AND tbl2.AnotherColumn = 'SomeValue'
If I changed that join to LEFT, those 2 queries would have completely different meanings.
It's semantics and convention. The Join clause is for joins and the where clause (and to a lesser extent Having) are for filters. The only time a filter should be in the join is when it's an outer join and the outer join is to a subset of a table.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply