does choosing indexed table col for where clause helps and does order of table matters?

Question

does choosing indexed table col for where clause helps and does order of table matters?

Viewing 5 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply

Eric M Russell SSC Guru Points: 125624 More actions · Answer 1

below86 (6/25/2014)
Eric M Russell (6/25/2014)
Fortunately the construction of the execution plan (for example which table is used as the outer or inner input on a nested loop or hash join) isn't dependent on how the developer specified the join order within the SQL statement. The query optimizer uses a cost based algorithm and takes indexes and table statistics into consideration to make that decision for us.
It may not matter the order the joins are in for the optimizer. But from a 'Best Practice' standpoint, I always try, when possible, to have my inner joins listed first. I think part of the initial question was what way should it be done.

I see your point, but that would be a difficult and subjective coding standard to follow. It's not the same as following coding standards for object names, tabs, and line breaks. The problem is that as we're writing a SQL query, we don't know which table will be chosen by the optimizer as the inner or outer input or even what join type will be used.

It's hubris for even the best SQL Server expert to think that he or she can consistently guess what type of query plan will be used simply by looking at the SQL statement. The query plan can also change on us over time, as the number of rows in a table increase or a new index added.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

below86 SSChampion Points: 11582 More actions · Answer 2

Sean Lange (6/25/2014)
below86 (6/25/2014)
But from a 'Best Practice' standpoint, I always try, when possible, to have my inner joins listed first.
How or why is this "best practice"? I create joins in the order that the data is logically associated regardless of the type of join.

I quess I should say it is a personal preference. I like to see all of the joins that are going to limit the data returned, inner joins, in one place, if it's possible based on the data being pulled. All other joins I agree with you Sean. Sorry I shouldn't have used the term 'Best Pratice', especially after the 'pet peeve' posts. 🙂

-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.

below86 SSChampion Points: 11582 More actions · Answer 3

Eric M Russell (6/25/2014)
below86 (6/25/2014)
Eric M Russell (6/25/2014)
Fortunately the construction of the execution plan (for example which table is used as the outer or inner input on a nested loop or hash join) isn't dependent on how the developer specified the join order within the SQL statement. The query optimizer uses a cost based algorithm and takes indexes and table statistics into consideration to make that decision for us.
It may not matter the order the joins are in for the optimizer. But from a 'Best Practice' standpoint, I always try, when possible, to have my inner joins listed first. I think part of the initial question was what way should it be done.
I see your point, but that would be a difficult and subjective coding standard to follow. It's not the same as following coding standards for object names, tabs, and line breaks. The problem is that as we're writing a SQL query, we don't know which table will be chosen by the optimizer as the inner or outer input or even what join type will be used.
It's hubris for even the best SQL Server expert to think that he or she can consistently guess what type of query plan will be used simply by looking at the SQL statement. The query plan can also change on us over time, as the number of rows in a table increase or a new index added.

And even versions of SQL will change it. We recently moved from 2008 to 2012 and we had a query that was running as part of an SSRS report that went from running in less than a minute to 45 minutes, then it died. I only heard about this from other team members, so I don't have specifics. But what they did was remove an index from one of the tables. Then it was back to running as it did before 2012.

-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.

Eric M Russell SSC Guru Points: 125624 More actions · Answer 4

below86 (6/25/2014)
Eric M Russell (6/25/2014)
below86 (6/25/2014)
Eric M Russell (6/25/2014)
Fortunately the construction of the execution plan (for example which table is used as the outer or inner input on a nested loop or hash join) isn't dependent on how the developer specified the join order within the SQL statement. The query optimizer uses a cost based algorithm and takes indexes and table statistics into consideration to make that decision for us.
It may not matter the order the joins are in for the optimizer. But from a 'Best Practice' standpoint, I always try, when possible, to have my inner joins listed first. I think part of the initial question was what way should it be done.
I see your point, but that would be a difficult and subjective coding standard to follow. It's not the same as following coding standards for object names, tabs, and line breaks. The problem is that as we're writing a SQL query, we don't know which table will be chosen by the optimizer as the inner or outer input or even what join type will be used.
It's hubris for even the best SQL Server expert to think that he or she can consistently guess what type of query plan will be used simply by looking at the SQL statement. The query plan can also change on us over time, as the number of rows in a table increase or a new index added.
And even versions of SQL will change it. We recently moved from 2008 to 2012 and we had a query that was running as part of an SSRS report that went from running in less than a minute to 45 minutes, then it died. I only heard about this from other team members, so I don't have specifics. But what they did was remove an index from one of the tables. Then it was back to running as it did before 2012.

A while back, I encountered a report that would run for more than 18 hours and then consistently fail after filling a 1 TB tempdb disk. It didn't do this all the time, just for a particularly large client that recently went live. Using an index hint, I changed the plan from a hash join to a merge join and it completed in less than 2 hours with only marginal tempdb usage. The index it needed was there; I just had to tell SQL Server to make use of it. It's rare, but I sometimes resort to using explicit index and join hints.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

below86 SSChampion Points: 11582 More actions · Answer 5

Eric M Russell (6/25/2014)
below86 (6/25/2014)
Eric M Russell (6/25/2014)
below86 (6/25/2014)
Eric M Russell (6/25/2014)
Fortunately the construction of the execution plan (for example which table is used as the outer or inner input on a nested loop or hash join) isn't dependent on how the developer specified the join order within the SQL statement. The query optimizer uses a cost based algorithm and takes indexes and table statistics into consideration to make that decision for us.
It may not matter the order the joins are in for the optimizer. But from a 'Best Practice' standpoint, I always try, when possible, to have my inner joins listed first. I think part of the initial question was what way should it be done.
I see your point, but that would be a difficult and subjective coding standard to follow. It's not the same as following coding standards for object names, tabs, and line breaks. The problem is that as we're writing a SQL query, we don't know which table will be chosen by the optimizer as the inner or outer input or even what join type will be used.
It's hubris for even the best SQL Server expert to think that he or she can consistently guess what type of query plan will be used simply by looking at the SQL statement. The query plan can also change on us over time, as the number of rows in a table increase or a new index added.
And even versions of SQL will change it. We recently moved from 2008 to 2012 and we had a query that was running as part of an SSRS report that went from running in less than a minute to 45 minutes, then it died. I only heard about this from other team members, so I don't have specifics. But what they did was remove an index from one of the tables. Then it was back to running as it did before 2012.
A while back, I encountered a report that would run for more than 18 hours and then consistently fail after filling a 1 TB tempdb disk. It didn't do this all the time, just for a particularly large client that recently went live. Using an index hint, I changed the plan from a hash join to a merge join and it completed in less than 2 hours with only marginal tempdb usage. The index it needed was there; I just had to tell SQL Server to make use of it. It's rare, but I sometimes resort to using explicit index and join hints.

I do think that they tried that, again I wasn't part of the 'fix', only told after the fact. I had originally created this report back on 2008, and I think I had added the index so that it would run in about a minute. It's funny that in 2008 it needed the index to run dast, but in 2012 it slowed it down. And removing the index from the table actually made it run faster in 2012.

-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.