Duplicate Index Performance

  • Today, out of curiosity, I decided to check our databases for duplicate Indexes. I found several tables on each dB that had duplicates and was wondering how much of a performance hit is caused by each. I know I should investigate further and remove them, as they provide no advantages and only help bring down performance, but I had a few questions before I did that.

    -- What are the performance hits incurred by having duplicate Non-Clustered indexes on a table? Is there any easy way to measure this?

    -- Would you ever want to have a Non-Clustered and Clustered Index that are the same? For some reason, I inherited several tables that have NC and C indexes that only reference one column.

    As of right now, our performance isn't bad by any means, but I am starting to look for ways to fine-tune everything to get it all running even smoother. As of right now, I see no reason to keep these indexes and have the following reasons why they should be deleted"

    -- Duplicate Indexes take up physical space

    -- Performance hits on simple table commands (Update, Insert, etc..)

    -- Maintaining dB takes longer (backup, re-index, etc.)

    -- T-Log fills up faster

    Just wanted to know if I was thinking along the right-track of if I should just leave well enough alone.

    Thanks in advance,

  • upstart (10/27/2010)


    -- What are the performance hits incurred by having duplicate Non-Clustered indexes on a table? Is there any easy way to measure this?

    Ur... It depends.

    Set up a test environment, take stats with the dups, take stats without

    -- Would you ever want to have a Non-Clustered and Clustered Index that are the same? For some reason, I inherited several tables that have NC and C indexes that only reference one column.

    There are cases where you might. They aren't common. Point is that the cluster is the largest index on the table (because it includes all columns). It can, maybe, in certain cases be advantageous to also have narrow index on that column. Not common, probably more an edge case with very specific performance requirements

    As of right now, I see no reason to keep these indexes and have the following reasons why they should be deleted"

    -- Duplicate Indexes take up physical space

    -- Performance hits on simple table commands (Update, Insert, etc..)

    -- Maintaining dB takes longer (backup, re-index, etc.)

    -- T-Log fills up faster

    Also backups are bigger, more space required for them.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/27/2010)


    Point is that the cluster is the largest index on the table (because it includes all columns).

    Always ? What if i include only one column like empid in EMPLOYEE table ?

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Bhuvnesh (10/28/2010)


    GilaMonster (10/27/2010)


    Point is that the cluster is the largest index on the table (because it includes all columns).

    Always ? What if i include only one column like empid in EMPLOYEE table ?

    Please go and revise what a clustered index is. Also note that you cannot manually include a column in a clustered index. It should be obvious why.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • @gail, Here i didnt mean "inlcude column" i meant to say , if i have clustered index on one column "Empid" . then how it would consider other non-participated columns.?

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Hint: What makes a clustered index different from a nonclustered index?

    Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/28/2010)


    Hint: What makes a clustered index different from a nonclustered index?

    Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.

    Curiously, Gail, how to make the nonclustered index larger than the clustered index? :unsure:

  • Wildcat (11/11/2010)


    GilaMonster (10/28/2010)


    Hint: What makes a clustered index different from a nonclustered index?

    Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.

    Curiously, Gail, how to make the nonclustered index larger than the clustered index? :unsure:

    Index non-persisted calculated columns, where the calculation is smaller than the result.

    For example:

    IF OBJECT_ID(N'dbo.IndexTest', 'u') IS NOT NULL

    DROP TABLE dbo.IndexTest;

    GO

    CREATE TABLE dbo.IndexTest (

    ID INT IDENTITY PRIMARY KEY CLUSTERED,

    ColA TINYINT,

    ColB INT,

    ColC AS ColA + ColB);

    GO

    CREATE INDEX IDX_CBA ON dbo.IndexTest (ColC, ColB, ColA, ID);

    GO

    INSERT INTO dbo.IndexTest (ColA, ColB)

    SELECT N1.Number, N2.Number

    FROM DBA.dbo.Numbers AS N1

    CROSS JOIN DBA.dbo.Numbers AS N2

    WHERE N1.Number <= 255

    AND N2.Number <= 100;

    When I ran that, the clustered index ended up taking 57 pages, while the non-clustered took 60. Bigger examples should be easy enough to come by in bigger tables.

    Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (11/11/2010)


    Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.

    SQL's smarter than that. If you put the clustered index key into an index, SQL won't put it there a second time. I think I need a blog post on this. Common misconception.

    The way I was thinking was to have a small clustering key then a nonclustered index on a really large set of columns (touching the 900 bytes limit) and include every single other column in the table.

    Leaf level should be about the same size as the cluster (because both have all columns), but the non-leaf levels will be larger (wider key) resulting in a nonclustered index larger than the cluster. Does take work though

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Proof:

    CREATE TABLE TestingIndexes (

    ID int identity not null,

    SomeCol Char(10),

    RandDate datetime,

    Filler char(200)

    )

    create unique clustered index idx_Testing_ID on TestingIndexes (ID, SomeCol)

    create nonclustered index idx_Testing_Wide on TestingIndexes (RandDate, SomeCol, ID) -- entire cluster in key

    insert into TestingIndexes (SomeCol, RandDate, Filler)

    select left(a.name,10),b.create_date, ''

    from sys.columns a cross join sys.objects b

    go

    -- investigation

    select db_id(), object_id('TestingIndexes') -- use results for following

    dbcc ind(11,103007448,1)

    dbcc ind(11,103007448,2)

    -- want a leaf page from both (so index level 0, page type 1 in the cluster, page type 2 in the nonclustered)

    dbcc traceon (3604)

    dbcc page (11,1,135252,3) -- clustered index

    dbcc page (11,1,138050,3) -- nonclustered index

    dbcc traceoff (3604)

    Portion of output of the dbcc page of the cluster:

    Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

    ID = 176

    Slot 0 Column 2 Offset 0x8 Length 10 Length (physical) 10

    SomeCol = fromsvc

    Slot 0 Column 3 Offset 0x12 Length 8 Length (physical) 8

    RandDate = 2008-08-27 10:13:36.723

    Slot 0 Column 4 Offset 0x1a Length 200 Length (physical) 200

    Filler =

    And the noncluster:

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It's possible the indexes you want to delete are hard-coded as index hints. I use something like this to look for character strings in stored procedures

    exec sp_MSforeachdb_db 'use [?] select ''?'' as dbName, name as ''SP_Name'',

    substring(definition,CHARINDEX(''IndexName'', definition)-25,150) as ''Code_Snippet''

    from sys.sql_modules com

    join sysobjects obj on com.object_id = obj.id

    where (definition like ''IndexName'')

    order by dbName, name'

  • Now let's see about that large index...

    Setup:

    CREATE TABLE TestingIndexes (

    ID int identity not null primary key,

    SomeCol Char(8) not null,

    RandDate datetime not null,

    Filler1 char(880) not null,

    Filler2 char(500)

    )

    create nonclustered index idx_Testing_Wide

    on TestingIndexes (RandDate, SomeCol, ID, Filler1) -- and that makes 900 bytes exactly

    include (Filler2)

    insert into TestingIndexes (SomeCol, RandDate, Filler1, Filler2)

    select left(a.name,8),b.create_date, ' ',''

    from sys.columns a cross join sys.objects b

    go

    And to test index size:

    select db_id(), object_id('TestingIndexes') -- use for following

    select * from sys.dm_db_index_physical_stats(11,199007790,null, null, 'detailed')

    The cluster had 19071 index pages - 19001 leaf, 70 at level 1 and 1 root page. A 3 level deep index

    The noncluster had 25330 index pages - 19001 pages (same as the cluster, as would be expected), 4750 at level 1, 1187 at level 2, 296 at level 3, ... and finally the root page at level 7. An 8 level deep index!

    As I said, this takes work. Having a noncluster with a key wider than the cluster is fairly normal. Having one that also includes every single other column in the table is generally just concerning.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (11/11/2010)


    GSquared (11/11/2010)


    Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.

    SQL's smarter than that. If you put the clustered index key into an index, SQL won't put it there a second time. I think I need a blog post on this. Common misconception.

    The way I was thinking was to have a small clustering key then a nonclustered index on a really large set of columns (touching the 900 bytes limit) and include every single other column in the table.

    Leaf level should be about the same size as the cluster (because both have all columns), but the non-leaf levels will be larger (wider key) resulting in a nonclustered index larger than the cluster. Does take work though

    SQL's smarter than that if the second index has the clustered index columns in the same order as they are in the clustered index, and adjacent to each other. Have you tried a clustered index on A, B, C, and a non-clustered on D, A, E, C, F, B?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (11/11/2010)


    SQL's smarter than that if the second index has the clustered index columns in the same order as they are in the clustered index, and adjacent to each other.

    Not sure I follow you...

    Have you tried a clustered index on A, B, C, and a non-clustered on D, A, E, C, F, B?

    Create Table TestingIndex2 (

    A int,

    B int,

    C int,

    D int,

    E int,

    F int

    )

    CREATE UNIQUE CLUSTERED INDEX idx_1 on TestingIndex2 (A, B, C)

    CREATE NONCLUSTERED INDEX idx_2 on TestingIndex2 (D, A, E, C, F, B)

    INSERT INTO TestingIndex2

    SELECT Row_Number() Over (Order by a.name) a,

    Row_Number() Over (Order by b.name) b,

    Row_Number() Over (Order by a.object_id) c,

    Row_Number() Over (Order by a.type) d,

    Row_Number() Over (Order by a.name) e,

    Row_Number() Over (Order by b.name) f

    from sys.objects a cross join sys.objects b

    -- (24649 row(s) affected)

    GO

    select object_id('TestingIndex2')

    dbcc ind (11,295008132,1)

    dbcc ind (11,295008132,2)

    dbcc traceon (3604)

    dbcc page (11,1,135279,3)

    dbcc traceoff (3604)

    102 index pages in the clustered index (plus 1 IAM)

    93 index pages in the nonclustered index (plus 1 IAM)

    Nonclustered leaf page has the 6 index columns in the specified order - D, A, E, C, F, B

    A second nonclustered index with the clustering key adjacent and in order specified in clustering key shows same behaviour

    CREATE NONCLUSTERED INDEX idx_3 on TestingIndex2 (D, E, A, B, C, F)

    dbcc ind (11,295008132,3)

    dbcc traceon (3604)

    dbcc page (11,1,249185,3)

    dbcc traceoff (3604)

    Again 93 index pages + 1 IAM and all 6 columns listed in leaf page, in specified order. (D, E, A, B, C, F)

    What am I supposed to be seeing here?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Wasn't presupposing that you'd see anything there. Was asking if you'd tried that. You just did, and it worked as you said it would. I hadn't had a chance to test that and was wondering if you had.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply