Column Order in an Index

Question

Post reply

Column Order in an Index

Viewing 15 posts - 16 through 30 (of 49 total)

You must be logged in to reply to this topic. Login to reply

Wilfred van Dijk SSCrazy Eights Points: 9003 More actions · Answer 1

aha, the never-ending question about multicolumn or single column indexes!

Again, check Gail's excellent articles about indexes:

http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/

Wilfred
The best things in life are the simple things

cengland0 SSCertifiable Points: 6102 More actions · Answer 2

Wilfred van Dijk (11/23/2010)
aha, the never-ending question about multicolumn or single column indexes!
Again, check Gail's excellent articles about indexes:
http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/

Thanks for the link but I get "404 - File or directory not found."

Michael Ebaya Mr or Mrs. 500 Points: 564 More actions · Answer 3

cengland0 (11/23/2010)
Please don't laugh at me for asking this question.
Why would you create a compound index with both last name and first name together? Why not two separate indexes where one is for last name and the other is for first name. That should handle any combination that you have in your where clause.

Because indexes don't work the way you think they do. Indexes sort off their key expression. If that expression contains two columns, the table is sorted by both. If it contains one column only, then its only sorted by that single column, period.

Without getting into too much technical detail, SQL contains an optimization called index intersection that allows two indexes to act a wee bit like one compound index. But "a wee bit" is the operative term here; the performance is not the same.

To use your firstname, lastname example. If you have a single "John Smith" in your table, a compound index will walk the tree right to that record-- all done! No scanning required. Two indexes, one on each column, will only allow the engine to find all Smiths, then find all Johns, then laboriously intersect the result result to find which ones match both criteria.

Performance takes a further hit when you want to order by multiple columns, as a single index will return the rows in proper order, whereas results from intersected indexes must be manually sorted each time.

Wilfred van Dijk SSCrazy Eights Points: 9003 More actions · Answer 4

seems SSC adds something in front of it, copy/paste: http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/

Wilfred
The best things in life are the simple things

David Walker-278941 Ten Centuries Points: 1064 More actions · Answer 5

"This does not mean that the index 'NonClust_Contact_Name' was not useful. It's still much better than the Clustered scan."

*WHAT* is still much better than the clustered scan? I'm confused by that last sentence.

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 6

cengland0 (11/23/2010)
Wilfred van Dijk (11/23/2010)
aha, the never-ending question about multicolumn or single column indexes!
Again, check Gail's excellent articles about indexes:
http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/
Thanks for the link but I get "404 - File or directory not found."

Wilfred messed up the forum coding (look at the url, there's sqlservercentral at the front probably due to the new line character.)

http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 7

Franky Leeuwerck (11/23/2010)
Previous link can be found here :
http://sqlserverpedia.com/wiki/Index_Selectivity_and_Column_Order

The wiki article was adapted (with my permission) from my two blog posts on index column order, the one on equality that Wilfred posted and one on inequalities

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

cengland0 SSCertifiable Points: 6102 More actions · Answer 8

Michael Ebaya (11/23/2010)
cengland0 (11/23/2010)
Please don't laugh at me for asking this question.
Why would you create a compound index with both last name and first name together? Why not two separate indexes where one is for last name and the other is for first name. That should handle any combination that you have in your where clause.
Because indexes don't work the way you think they do. Indexes sort off their key expression. If that expression contains two columns, the table is sorted by both. If it contains one column only, then its only sorted by that single column, period.
Without getting into too much technical detail, SQL contains an optimization called index intersection that allows two indexes to act a wee bit like one compound index. But "a wee bit" is the operative term here; the performance is not the same.
To use your firstname, lastname example. If you have a single "John Smith" in your table, a compound index will walk the tree right to that record-- all done! No scanning required. Two indexes, one on each column, will only allow the engine to find all Smiths, then find all Johns, then laboriously intersect the result result to find which ones match both criteria.
Performance takes a further hit when you want to order by multiple columns, as a single index will return the rows in proper order, whereas results from intersected indexes must be manually sorted each time.

Thanks, that was an excellent explanation.

I can already see some tables where I can improve the indexes with this strategy. I have a frequently used 12 million row table that has a 16 digit number separated into two columns with 8 digits each. The developers always put both columns in their where clause and would never select on just one of them. I'm going to implement this improvement right away.

Oh, please don't ask about why the two columns are separate. I would never have designed it this way but it comes from an application that pushes the data to us each day and their SSIS package fails if we modify any of the columns. I'm even embarrassed to say that these both are VARCHAR(500) columns even knowing that a maximum of 8 numeric characters will fit. They even drop leading zeros when pushing the data to us.

cengland0 SSCertifiable Points: 6102 More actions · Answer 9

GilaMonster (11/23/2010)
cengland0 (11/23/2010)
Wilfred van Dijk (11/23/2010)
aha, the never-ending question about multicolumn or single column indexes!
Again, check Gail's excellent articles about indexes:
http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/
Thanks for the link but I get "404 - File or directory not found."
Wilfred messed up the forum coding (look at the url, there's sqlservercentral at the front probably due to the new line character.)
http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/

Great Article. I bookmarked that one as a keeper.

cengland0 SSCertifiable Points: 6102 More actions · Answer 10

Doh, I just discovered the maximum key length is 900 bytes so combining two VARCHAR(500) columns into one index results in a warning:

The maximum key length is 900 bytes. The ***indexname*** has maximum length of 1000 bytes. For some combinations of large values, the insert/update operation will fail.

David Walker-278941 Ten Centuries Points: 1064 More actions · Answer 11

cengland0 (11/23/2010)
Doh, I just discovered the maximum key length is 900 bytes so combining two VARCHAR(500) columns into one index results in a warning:
The maximum key length is 900 bytes. The ***indexname*** has maximum length of 1000 bytes. For some combinations of large values, the insert/update operation will fail.

Surely they will allow you to redefine the columns as Char(8) or Varchar(8), won't they? You might get a warning that some data MIGHT be truncated, but if the data is all 8 characters or less, you can ignore this warning. THEN you can index the columns.

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 12

cengland0 (11/23/2010)
Doh, I just discovered the maximum key length is 900 bytes so combining two VARCHAR(500) columns into one index results in a warning:
The maximum key length is 900 bytes. The ***indexname*** has maximum length of 1000 bytes. For some combinations of large values, the insert/update operation will fail.

If you *know* there's only ever 8 characters in each column, you can ignore that warning. However if one day there are over 450 characters in each.....

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

David Walker-278941 Ten Centuries Points: 1064 More actions · Answer 13

GilaMonster (11/23/2010)
cengland0 (11/23/2010)
Doh, I just discovered the maximum key length is 900 bytes so combining two VARCHAR(500) columns into one index results in a warning:
The maximum key length is 900 bytes. The ***indexname*** has maximum length of 1000 bytes. For some combinations of large values, the insert/update operation will fail.
If you *know* there's only ever 8 characters in each column, you can ignore that warning. However if one day there are over 450 characters in each.....

If you change the columns to Char(8) now, there won't ever be 450 characters in the column!

If the vendor sending the data says that the data has 8 digits, you can go with that. You can always update it to [Var]char(12) or [Var]char(20) or whatever, in the future, if the vendor starts sending a larger field. Or set them to Varchar(20) now and then index the two columns, and say that you're planning for future growth of the field.

cengland0 SSCertifiable Points: 6102 More actions · Answer 14

David Walker-278941 (11/23/2010)
If you change the columns to Char(8) now, there won't ever be 450 characters in the column!
If the vendor sending the data says that the data has 8 digits, you can go with that. You can always update it to [Var]char(12) or [Var]char(20) or whatever, in the future, if the vendor starts sending a larger field. Or set them to Varchar(20) now and then index the two columns, and say that you're planning for future growth of the field.

We thought of that but we have no control over the SSIS package that the vendor uses. I think it is set to fail on any warnings. As you know, if you try to insert a VARCHAR(500) from the source table into a CHAR(8), you get a truncation possible warning. That causes the package to fail and we receive no data.

As GilaMonster stated, I'm certain there are a maximum of 8 digits in each of the two columns so I should be okay.

taylor_benjamin SSC Veteran Points: 286 More actions · Answer 15

Well done.

One way to get SQL Server to consider an index it would otherwise exclude in preference to a table scan is to provide criteria for all columns of the index.

In the case of your article, the optimizer was intelligent enough to know that the existing index was more optimized than a table scan. When this doesn't work, simply specify not restricting criteria in place of missing attributes.

With your example, included LastName > 'A' in your where clause will meet the qualifications for the index.

It's an old school trick, but it still works.

The neat part is that you can do this in any missing column of a composite index

Write more...

Cheers,

Ben