Index Usage Question

Question

Index Usage Question

AndrewJacksonZA

SSCarpal Tunnel

Points: 4143
More actions
June 25, 2009 at 2:05 am

#211263

Hi
Something interesting, to me anyway, that I've noticed about index usage that I would like to know more about. When I do an initial SELECT, it scans the clustered primary key. When I create both of the new indices and do either of the SELECTs, the db engine uses the index that I created last. Why?
Code to reproduce:
CREATE TABLE dbo.Tmp_bob
(
pkColA varchar(15) NOT NULL,
otherCol float(53) NULL,
pkColB varchar(3) NOT NULL,
otherCols datetime NULL
)
GO
ALTER TABLE dbo.Tmp_bob ADD CONSTRAINT
PK_Tmp_bob PRIMARY KEY CLUSTERED
(
pkColA,
pkColB
) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
SELECT pkColA FROM Tmp_bob
-- It scans the clustered primay key.
CREATE NONCLUSTERED INDEX IX_Tmp_bob_pkColA ON Tmp_bob(pkColA) WITH PAD_INDEX, FILLFACTOR = 80
CREATE NONCLUSTERED INDEX IX_Tmp_bob_pkColB ON Tmp_bob(pkColB) WITH PAD_INDEX, FILLFACTOR = 80
SELECT pkColA FROM Tmp_bob
-- It scans IX_Tmp_bob_pkColB
DROP INDEX Tmp_bob.IX_Tmp_bob_pkColA
DROP INDEX Tmp_bob.IX_Tmp_bob_pkColB
CREATE NONCLUSTERED INDEX IX_Tmp_bob_pkColB ON Tmp_bob(pkColB) WITH PAD_INDEX, FILLFACTOR = 80
CREATE NONCLUSTERED INDEX IX_Tmp_bob_pkColA ON Tmp_bob(pkColA) WITH PAD_INDEX, FILLFACTOR = 80
SELECT pkColA FROM Tmp_bob
-- It scans IX_Tmp_bob_pkColA
Thanks.

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Because the last nonclustered index that you added is covering (contains all the columns that the query needs) and is smaller than the cluster (because the cluster is the table). Hence scanning that NC index is faster and requires less IOs than scanning the cluster.

In fact, both of the nonclustered indexes are covering, since a nonclustered index includes the clustering key. The two indexes are, for that query, equivalent and hence SQL can use either.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

hi_abhay78 SSCertifiable Points: 6265 More actions · Answer 2

Yes you are right .A bit strange (to us that why the index that is created second is used)..but there might be some logic we dont know .

main thing is , will it perform the same way once some good amount of data is in inside the table .

I have not done any stuff to populate this table .

Might try , but very occupied as of now .

Regards

Abhay Chaudhary
Sr.DBA (MCITP/MCTS :SQL Server 2005/2008 ,OCP 9i)

Tim Walker. SSCertifiable Points: 5173 More actions · Answer 3

It's also worth noting that if there is a choice of covering indexes that can be used to return the results, the one occupying the smallest number of pages is chosen.

You can test this on any table with a clustered primary key and a few other indexes of varying sizes thus:

SELECT

ColumnAtTopOfPrimaryKey

FROM

TheTable

I wouldn't recommend testing behaviour on empty tables either, because I think it can produce misleading results.

Tim

.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 4

Tim Walker (6/25/2009)
I wouldn't recommend testing behaviour on empty tables either, because I think it can produce misleading results.

Indeed. Reading 0 pages is pretty quick no matter what index.

Load the table up with a million or so rows and, provided that the length of the strings in those varchars are close to the declared value, I suspect that the index with B as the key col will most likely be favoured. It's close and the difference is unlikely to appear on small-mid sized tables

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

AndrewJacksonZA SSCarpal Tunnel Points: 4143 More actions · Answer 5

I just tested with both ten thousand and a million rows. The results are different now - only IX_Tmp_bob_pkColB is used, independent of index creation order. If I DROP IX_Tmp_bob_pkColB then IX_Tmp_bob_pkColA is used.

Why would it use IX_Tmp_bob_pkColB if I haven't included any reference/data/information about pkColA in it? ("RTFM on covering indices!"?)

Tim Walker. SSCertifiable Points: 5173 More actions · Answer 6

You have included it - you just didn't realise!

Where you have a unique clustered index or primary key an index uses this to point to the data. Your example makes it a bit tricky to see this because you are using your columns for both the PK and the example indexes.

A clearer example:

Table contains ColA, ColB, ColC, ColD, ColE

Clustered PK (or unique clustered index) on ColA, ColB

Secondary Index on ColC

What the secondary index actually holds is the relationship between ColC and ColA, ColB.

So it is a covering index!

Tim

.

Tim Walker. SSCertifiable Points: 5173 More actions · Answer 7

Additionally I am guessing it uses the index based on the varchar(3) column because it occupies less pages than the varchar(15) one.

🙂

Tim

.

AndrewJacksonZA SSCarpal Tunnel Points: 4143 More actions · Answer 8

Thanks for the index relationship info Tim. I also thought that it would prefer to lookup on a smaller index.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

Tim Walker (6/25/2009)
Additionally I am guessing it uses the index based on the varchar(3) column because it occupies less pages than the varchar(15) one.

Depending, of course, on the size of the string stored. If all the strings in the varchar(3) are exactly 3 chars long and all the strings in the varchar(15) are either 2 or 3 characters long, then the index on the varchar(15) will be smaller.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

GDI Lord (6/25/2009)
IWhy would it use IX_Tmp_bob_pkColB if I haven't included any reference/data/information about pkColA in it? ("RTFM on covering indices!"?)

As I mentioned earlier, the clustering key is always part of all nonclustered indexes, whether you specify them of not. It has to be, it's the 'address' of the row. Hence both pk_ColA and pk_ColB will be part of every single nonclustered index.

That, incidentally, is why one of the recommendations for a clustered index key is 'narrow'.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Tim Walker. SSCertifiable Points: 5173 More actions · Answer 11

GilaMonster (6/25/2009)
Tim Walker (6/25/2009)
Additionally I am guessing it uses the index based on the varchar(3) column because it occupies less pages than the varchar(15) one.
Depending, of course, on the size of the string stored. If all the strings in the varchar(3) are exactly 3 chars long and all the strings in the varchar(15) are either 2 or 3 characters long, then the index on the varchar(15) will be smaller.

Absolutely right!

Hey Gail, do you know what would happen if one index was narrower but fragmented so actually occupied more pages than a wider one? Which would be chosen then?

Tim

.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 12

Tim Walker (6/25/2009)
Hey Gail, do you know what would happen if one index was narrower but fragmented so actually occupied more pages than a wider one? Which would be chosen then?

Well, fragmented doesn't necessarily mean more pages, I can have an index that's 99% fragmented but all the pages are close to 100% full.

Assuming you mean one of the indexes has pages only half full. Haven't tested, don't know for sure, but I suspect it wouldn't factor into the decision. I don't believe that the optimiser knows how fragmented an index is at the time that it compiles the query. I suspect it will base that solely on the statistics of the index which includes the average length of the columns.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Tim Walker. SSCertifiable Points: 5173 More actions · Answer 13

GilaMonster (6/25/2009)
Tim Walker (6/25/2009)
Hey Gail, do you know what would happen if one index was narrower but fragmented so actually occupied more pages than a wider one? Which would be chosen then?
Well, fragmented doesn't necessarily mean more pages, I can have an index that's 99% fragmented but all the pages are close to 100% full.
Assuming you mean one of the indexes has pages only half full. Haven't tested, don't know for sure, but I suspect it wouldn't factor into the decision. I don't believe that the optimizer knows how fragmented an index is at the time that it compiles the query. I suspect it will base that solely on the statistics of the index which includes the average length of the columns.

Sorry, very sloppy terminology from me. Yes I did mean one that was sparse, e.g. though the original fill factor, page splits, deletions, or whatever!

I thought about it a bit after I asked the question and I think you must be right. I might test it sometime to find out. Specifying the fill factor to make an index bigger should work I think. I'm sure the optimizer won't consider the original fill factor as it isn't maintained over time.

Thanks for the reply.

Tim

.

AndrewJacksonZA SSCarpal Tunnel Points: 4143 More actions · Answer 14

AndrewJacksonZA

SSCarpal Tunnel

Points: 4143

June 26, 2009 at 8:11 am

#1016050

Thanks Gila!