Duplicate Index Performance

Question

Duplicate Index Performance

upstart

Ten Centuries

Points: 1255
More actions
October 27, 2010 at 9:48 am

#230469

Today, out of curiosity, I decided to check our databases for duplicate Indexes. I found several tables on each dB that had duplicates and was wondering how much of a performance hit is caused by each. I know I should investigate further and remove them, as they provide no advantages and only help bring down performance, but I had a few questions before I did that.
-- What are the performance hits incurred by having duplicate Non-Clustered indexes on a table? Is there any easy way to measure this?
-- Would you ever want to have a Non-Clustered and Clustered Index that are the same? For some reason, I inherited several tables that have NC and C indexes that only reference one column.
As of right now, our performance isn't bad by any means, but I am starting to look for ways to fine-tune everything to get it all running even smoother. As of right now, I see no reason to keep these indexes and have the following reasons why they should be deleted"
-- Duplicate Indexes take up physical space
-- Performance hits on simple table commands (Update, Insert, etc..)
-- Maintaining dB takes longer (backup, re-index, etc.)
-- T-Log fills up faster
Just wanted to know if I was thinking along the right-track of if I should just leave well enough alone.
Thanks in advance,

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

upstart (10/27/2010)
-- What are the performance hits incurred by having duplicate Non-Clustered indexes on a table? Is there any easy way to measure this?

Ur... It depends.

Set up a test environment, take stats with the dups, take stats without

-- Would you ever want to have a Non-Clustered and Clustered Index that are the same? For some reason, I inherited several tables that have NC and C indexes that only reference one column.

There are cases where you might. They aren't common. Point is that the cluster is the largest index on the table (because it includes all columns). It can, maybe, in certain cases be advantageous to also have narrow index on that column. Not common, probably more an edge case with very specific performance requirements

As of right now, I see no reason to keep these indexes and have the following reasons why they should be deleted"
-- Duplicate Indexes take up physical space
-- Performance hits on simple table commands (Update, Insert, etc..)
-- Maintaining dB takes longer (backup, re-index, etc.)
-- T-Log fills up faster

Also backups are bigger, more space required for them.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 2

GilaMonster (10/27/2010)
Point is that the cluster is the largest index on the table (because it includes all columns).

Always ? What if i include only one column like empid in EMPLOYEE table ?

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Bhuvnesh (10/28/2010)
GilaMonster (10/27/2010)
Point is that the cluster is the largest index on the table (because it includes all columns).
Always ? What if i include only one column like empid in EMPLOYEE table ?

Please go and revise what a clustered index is. Also note that you cannot manually include a column in a clustered index. It should be obvious why.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 4

@gail, Here i didnt mean "inlcude column" i meant to say , if i have clustered index on one column "Empid" . then how it would consider other non-participated columns.?

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

Hint: What makes a clustered index different from a nonclustered index?

Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Wildcat SSCarpal Tunnel Points: 4774 More actions · Answer 6

GilaMonster (10/28/2010)
Hint: What makes a clustered index different from a nonclustered index?
Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.

Curiously, Gail, how to make the nonclustered index larger than the clustered index? :unsure:

GSquared SSC Guru Points: 260824 More actions · Answer 7

Wildcat (11/11/2010)
GilaMonster (10/28/2010)
Hint: What makes a clustered index different from a nonclustered index?
Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.
Curiously, Gail, how to make the nonclustered index larger than the clustered index? :unsure:

Index non-persisted calculated columns, where the calculation is smaller than the result.

For example:

IF OBJECT_ID(N'dbo.IndexTest', 'u') IS NOT NULL

DROP TABLE dbo.IndexTest;

GO

CREATE TABLE dbo.IndexTest (

ID INT IDENTITY PRIMARY KEY CLUSTERED,

ColA TINYINT,

ColB INT,

ColC AS ColA + ColB);

GO

CREATE INDEX IDX_CBA ON dbo.IndexTest (ColC, ColB, ColA, ID);

GO

INSERT INTO dbo.IndexTest (ColA, ColB)

SELECT N1.Number, N2.Number

FROM DBA.dbo.Numbers AS N1

CROSS JOIN DBA.dbo.Numbers AS N2

WHERE N1.Number <= 255

AND N2.Number <= 100;

When I ran that, the clustered index ended up taking 57 pages, while the non-clustered took 60. Bigger examples should be easy enough to come by in bigger tables.

Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

GSquared (11/11/2010)
Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.

SQL's smarter than that. If you put the clustered index key into an index, SQL won't put it there a second time. I think I need a blog post on this. Common misconception.

The way I was thinking was to have a small clustering key then a nonclustered index on a really large set of columns (touching the 900 bytes limit) and include every single other column in the table.

Leaf level should be about the same size as the cluster (because both have all columns), but the non-leaf levels will be larger (wider key) resulting in a nonclustered index larger than the cluster. Does take work though

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

Proof:

CREATE TABLE TestingIndexes (

ID int identity not null,

SomeCol Char(10),

RandDate datetime,

Filler char(200)

)

create unique clustered index idx_Testing_ID on TestingIndexes (ID, SomeCol)

create nonclustered index idx_Testing_Wide on TestingIndexes (RandDate, SomeCol, ID) -- entire cluster in key

insert into TestingIndexes (SomeCol, RandDate, Filler)

select left(a.name,10),b.create_date, ''

from sys.columns a cross join sys.objects b

go

-- investigation

select db_id(), object_id('TestingIndexes') -- use results for following

dbcc ind(11,103007448,1)

dbcc ind(11,103007448,2)

-- want a leaf page from both (so index level 0, page type 1 in the cluster, page type 2 in the nonclustered)

dbcc traceon (3604)

dbcc page (11,1,135252,3) -- clustered index

dbcc page (11,1,138050,3) -- nonclustered index

dbcc traceoff (3604)

Portion of output of the dbcc page of the cluster:

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

ID = 176

Slot 0 Column 2 Offset 0x8 Length 10 Length (physical) 10

SomeCol = fromsvc

Slot 0 Column 3 Offset 0x12 Length 8 Length (physical) 8

RandDate = 2008-08-27 10:13:36.723

Slot 0 Column 4 Offset 0x1a Length 200 Length (physical) 200

Filler =

And the noncluster:

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

homebrew01 SSC Guru Points: 55548 More actions · Answer 10

It's possible the indexes you want to delete are hard-coded as index hints. I use something like this to look for character strings in stored procedures

exec sp_MSforeachdb_db 'use [?] select ''?'' as dbName, name as ''SP_Name'',

substring(definition,CHARINDEX(''IndexName'', definition)-25,150) as ''Code_Snippet''

from sys.sql_modules com

join sysobjects obj on com.object_id = obj.id

where (definition like ''IndexName'')

order by dbName, name'

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 11

Now let's see about that large index...

Setup:

CREATE TABLE TestingIndexes (

ID int identity not null primary key,

SomeCol Char(8) not null,

RandDate datetime not null,

Filler1 char(880) not null,

Filler2 char(500)

)

create nonclustered index idx_Testing_Wide

on TestingIndexes (RandDate, SomeCol, ID, Filler1) -- and that makes 900 bytes exactly

include (Filler2)

insert into TestingIndexes (SomeCol, RandDate, Filler1, Filler2)

select left(a.name,8),b.create_date, ' ',''

from sys.columns a cross join sys.objects b

go

And to test index size:

select db_id(), object_id('TestingIndexes') -- use for following

select * from sys.dm_db_index_physical_stats(11,199007790,null, null, 'detailed')

The cluster had 19071 index pages - 19001 leaf, 70 at level 1 and 1 root page. A 3 level deep index

The noncluster had 25330 index pages - 19001 pages (same as the cluster, as would be expected), 4750 at level 1, 1187 at level 2, 296 at level 3, ... and finally the root page at level 7. An 8 level deep index!

As I said, this takes work. Having a noncluster with a key wider than the cluster is fairly normal. Having one that also includes every single other column in the table is generally just concerning.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

GSquared SSC Guru Points: 260824 More actions · Answer 12

GilaMonster (11/11/2010)
GSquared (11/11/2010)
Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.
SQL's smarter than that. If you put the clustered index key into an index, SQL won't put it there a second time. I think I need a blog post on this. Common misconception.
The way I was thinking was to have a small clustering key then a nonclustered index on a really large set of columns (touching the 900 bytes limit) and include every single other column in the table.
Leaf level should be about the same size as the cluster (because both have all columns), but the non-leaf levels will be larger (wider key) resulting in a nonclustered index larger than the cluster. Does take work though

SQL's smarter than that if the second index has the clustered index columns in the same order as they are in the clustered index, and adjacent to each other. Have you tried a clustered index on A, B, C, and a non-clustered on D, A, E, C, F, B?

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 13

GSquared (11/11/2010)
SQL's smarter than that if the second index has the clustered index columns in the same order as they are in the clustered index, and adjacent to each other.

Not sure I follow you...

Have you tried a clustered index on A, B, C, and a non-clustered on D, A, E, C, F, B?

Create Table TestingIndex2 (

A int,

B int,

C int,

D int,

E int,

F int

)

CREATE UNIQUE CLUSTERED INDEX idx_1 on TestingIndex2 (A, B, C)

CREATE NONCLUSTERED INDEX idx_2 on TestingIndex2 (D, A, E, C, F, B)

INSERT INTO TestingIndex2

SELECT Row_Number() Over (Order by a.name) a,

Row_Number() Over (Order by b.name) b,

Row_Number() Over (Order by a.object_id) c,

Row_Number() Over (Order by a.type) d,

Row_Number() Over (Order by a.name) e,

Row_Number() Over (Order by b.name) f

from sys.objects a cross join sys.objects b

-- (24649 row(s) affected)

GO

select object_id('TestingIndex2')

dbcc ind (11,295008132,1)

dbcc ind (11,295008132,2)

dbcc traceon (3604)

dbcc page (11,1,135279,3)

dbcc traceoff (3604)

102 index pages in the clustered index (plus 1 IAM)

93 index pages in the nonclustered index (plus 1 IAM)

Nonclustered leaf page has the 6 index columns in the specified order - D, A, E, C, F, B

A second nonclustered index with the clustering key adjacent and in order specified in clustering key shows same behaviour

CREATE NONCLUSTERED INDEX idx_3 on TestingIndex2 (D, E, A, B, C, F)

dbcc ind (11,295008132,3)

dbcc traceon (3604)

dbcc page (11,1,249185,3)

dbcc traceoff (3604)

Again 93 index pages + 1 IAM and all 6 columns listed in leaf page, in specified order. (D, E, A, B, C, F)

What am I supposed to be seeing here?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

GSquared SSC Guru Points: 260824 More actions · Answer 14

Wasn't presupposing that you'd see anything there. Was asking if you'd tried that. You just did, and it worked as you said it would. I hadn't had a chance to test that and was wondering if you had.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon