October 27, 2010 at 9:48 am
Today, out of curiosity, I decided to check our databases for duplicate Indexes. I found several tables on each dB that had duplicates and was wondering how much of a performance hit is caused by each. I know I should investigate further and remove them, as they provide no advantages and only help bring down performance, but I had a few questions before I did that.
-- What are the performance hits incurred by having duplicate Non-Clustered indexes on a table? Is there any easy way to measure this?
-- Would you ever want to have a Non-Clustered and Clustered Index that are the same? For some reason, I inherited several tables that have NC and C indexes that only reference one column.
As of right now, our performance isn't bad by any means, but I am starting to look for ways to fine-tune everything to get it all running even smoother. As of right now, I see no reason to keep these indexes and have the following reasons why they should be deleted"
-- Duplicate Indexes take up physical space
-- Performance hits on simple table commands (Update, Insert, etc..)
-- Maintaining dB takes longer (backup, re-index, etc.)
-- T-Log fills up faster
Just wanted to know if I was thinking along the right-track of if I should just leave well enough alone.
Thanks in advance,
October 27, 2010 at 12:53 pm
upstart (10/27/2010)
-- What are the performance hits incurred by having duplicate Non-Clustered indexes on a table? Is there any easy way to measure this?
Ur... It depends.
Set up a test environment, take stats with the dups, take stats without
-- Would you ever want to have a Non-Clustered and Clustered Index that are the same? For some reason, I inherited several tables that have NC and C indexes that only reference one column.
There are cases where you might. They aren't common. Point is that the cluster is the largest index on the table (because it includes all columns). It can, maybe, in certain cases be advantageous to also have narrow index on that column. Not common, probably more an edge case with very specific performance requirements
As of right now, I see no reason to keep these indexes and have the following reasons why they should be deleted"
-- Duplicate Indexes take up physical space
-- Performance hits on simple table commands (Update, Insert, etc..)
-- Maintaining dB takes longer (backup, re-index, etc.)
-- T-Log fills up faster
Also backups are bigger, more space required for them.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 28, 2010 at 6:21 am
GilaMonster (10/27/2010)
Point is that the cluster is the largest index on the table (because it includes all columns).
Always ? What if i include only one column like empid in EMPLOYEE table ?
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
October 28, 2010 at 8:25 am
Bhuvnesh (10/28/2010)
GilaMonster (10/27/2010)
Point is that the cluster is the largest index on the table (because it includes all columns).Always ? What if i include only one column like empid in EMPLOYEE table ?
Please go and revise what a clustered index is. Also note that you cannot manually include a column in a clustered index. It should be obvious why.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 28, 2010 at 9:38 pm
@gail, Here i didnt mean "inlcude column" i meant to say , if i have clustered index on one column "Empid" . then how it would consider other non-participated columns.?
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
October 28, 2010 at 11:17 pm
Hint: What makes a clustered index different from a nonclustered index?
Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2010 at 12:18 pm
GilaMonster (10/28/2010)
Hint: What makes a clustered index different from a nonclustered index?Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.
Curiously, Gail, how to make the nonclustered index larger than the clustered index? :unsure:
November 11, 2010 at 12:37 pm
Wildcat (11/11/2010)
GilaMonster (10/28/2010)
Hint: What makes a clustered index different from a nonclustered index?Yes, it may be possible with work to make a nonclustered index larger than the cluster, it's not something that you'll find in the vast majority of cases, it'll take work.
Curiously, Gail, how to make the nonclustered index larger than the clustered index? :unsure:
Index non-persisted calculated columns, where the calculation is smaller than the result.
For example:
IF OBJECT_ID(N'dbo.IndexTest', 'u') IS NOT NULL
DROP TABLE dbo.IndexTest;
GO
CREATE TABLE dbo.IndexTest (
ID INT IDENTITY PRIMARY KEY CLUSTERED,
ColA TINYINT,
ColB INT,
ColC AS ColA + ColB);
GO
CREATE INDEX IDX_CBA ON dbo.IndexTest (ColC, ColB, ColA, ID);
GO
INSERT INTO dbo.IndexTest (ColA, ColB)
SELECT N1.Number, N2.Number
FROM DBA.dbo.Numbers AS N1
CROSS JOIN DBA.dbo.Numbers AS N2
WHERE N1.Number <= 255
AND N2.Number <= 100;
When I ran that, the clustered index ended up taking 57 pages, while the non-clustered took 60. Bigger examples should be easy enough to come by in bigger tables.
Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 11, 2010 at 1:01 pm
GSquared (11/11/2010)
Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.
SQL's smarter than that. If you put the clustered index key into an index, SQL won't put it there a second time. I think I need a blog post on this. Common misconception.
The way I was thinking was to have a small clustering key then a nonclustered index on a really large set of columns (touching the 900 bytes limit) and include every single other column in the table.
Leaf level should be about the same size as the cluster (because both have all columns), but the non-leaf levels will be larger (wider key) resulting in a nonclustered index larger than the cluster. Does take work though
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2010 at 1:22 pm
Proof:
CREATE TABLE TestingIndexes (
ID int identity not null,
SomeCol Char(10),
RandDate datetime,
Filler char(200)
)
create unique clustered index idx_Testing_ID on TestingIndexes (ID, SomeCol)
create nonclustered index idx_Testing_Wide on TestingIndexes (RandDate, SomeCol, ID) -- entire cluster in key
insert into TestingIndexes (SomeCol, RandDate, Filler)
select left(a.name,10),b.create_date, ''
from sys.columns a cross join sys.objects b
go
-- investigation
select db_id(), object_id('TestingIndexes') -- use results for following
dbcc ind(11,103007448,1)
dbcc ind(11,103007448,2)
-- want a leaf page from both (so index level 0, page type 1 in the cluster, page type 2 in the nonclustered)
dbcc traceon (3604)
dbcc page (11,1,135252,3) -- clustered index
dbcc page (11,1,138050,3) -- nonclustered index
dbcc traceoff (3604)
Portion of output of the dbcc page of the cluster:
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4
ID = 176
Slot 0 Column 2 Offset 0x8 Length 10 Length (physical) 10
SomeCol = fromsvc
Slot 0 Column 3 Offset 0x12 Length 8 Length (physical) 8
RandDate = 2008-08-27 10:13:36.723
Slot 0 Column 4 Offset 0x1a Length 200 Length (physical) 200
Filler =
And the noncluster:
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2010 at 1:35 pm
It's possible the indexes you want to delete are hard-coded as index hints. I use something like this to look for character strings in stored procedures
exec sp_MSforeachdb_db 'use [?] select ''?'' as dbName, name as ''SP_Name'',
substring(definition,CHARINDEX(''IndexName'', definition)-25,150) as ''Code_Snippet''
from sys.sql_modules com
join sysobjects obj on com.object_id = obj.id
where (definition like ''IndexName'')
order by dbName, name'
November 11, 2010 at 1:50 pm
Now let's see about that large index...
Setup:
CREATE TABLE TestingIndexes (
ID int identity not null primary key,
SomeCol Char(8) not null,
RandDate datetime not null,
Filler1 char(880) not null,
Filler2 char(500)
)
create nonclustered index idx_Testing_Wide
on TestingIndexes (RandDate, SomeCol, ID, Filler1) -- and that makes 900 bytes exactly
include (Filler2)
insert into TestingIndexes (SomeCol, RandDate, Filler1, Filler2)
select left(a.name,8),b.create_date, ' ',''
from sys.columns a cross join sys.objects b
go
And to test index size:
select db_id(), object_id('TestingIndexes') -- use for following
select * from sys.dm_db_index_physical_stats(11,199007790,null, null, 'detailed')
The cluster had 19071 index pages - 19001 leaf, 70 at level 1 and 1 root page. A 3 level deep index
The noncluster had 25330 index pages - 19001 pages (same as the cluster, as would be expected), 4750 at level 1, 1187 at level 2, 296 at level 3, ... and finally the root page at level 7. An 8 level deep index!
As I said, this takes work. Having a noncluster with a key wider than the cluster is fairly normal. Having one that also includes every single other column in the table is generally just concerning.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2010 at 2:43 pm
GilaMonster (11/11/2010)
GSquared (11/11/2010)
Another way is to have a multi-column clustered index, and to include those same columns in a non-clustered index. Since the clustered index is automatically included, duplicating the columns can multiply the storage space.SQL's smarter than that. If you put the clustered index key into an index, SQL won't put it there a second time. I think I need a blog post on this. Common misconception.
The way I was thinking was to have a small clustering key then a nonclustered index on a really large set of columns (touching the 900 bytes limit) and include every single other column in the table.
Leaf level should be about the same size as the cluster (because both have all columns), but the non-leaf levels will be larger (wider key) resulting in a nonclustered index larger than the cluster. Does take work though
SQL's smarter than that if the second index has the clustered index columns in the same order as they are in the clustered index, and adjacent to each other. Have you tried a clustered index on A, B, C, and a non-clustered on D, A, E, C, F, B?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 11, 2010 at 3:37 pm
GSquared (11/11/2010)
SQL's smarter than that if the second index has the clustered index columns in the same order as they are in the clustered index, and adjacent to each other.
Not sure I follow you...
Have you tried a clustered index on A, B, C, and a non-clustered on D, A, E, C, F, B?
Create Table TestingIndex2 (
A int,
B int,
C int,
D int,
E int,
F int
)
CREATE UNIQUE CLUSTERED INDEX idx_1 on TestingIndex2 (A, B, C)
CREATE NONCLUSTERED INDEX idx_2 on TestingIndex2 (D, A, E, C, F, B)
INSERT INTO TestingIndex2
SELECT Row_Number() Over (Order by a.name) a,
Row_Number() Over (Order by b.name) b,
Row_Number() Over (Order by a.object_id) c,
Row_Number() Over (Order by a.type) d,
Row_Number() Over (Order by a.name) e,
Row_Number() Over (Order by b.name) f
from sys.objects a cross join sys.objects b
-- (24649 row(s) affected)
GO
select object_id('TestingIndex2')
dbcc ind (11,295008132,1)
dbcc ind (11,295008132,2)
dbcc traceon (3604)
dbcc page (11,1,135279,3)
dbcc traceoff (3604)
102 index pages in the clustered index (plus 1 IAM)
93 index pages in the nonclustered index (plus 1 IAM)
Nonclustered leaf page has the 6 index columns in the specified order - D, A, E, C, F, B
A second nonclustered index with the clustering key adjacent and in order specified in clustering key shows same behaviour
CREATE NONCLUSTERED INDEX idx_3 on TestingIndex2 (D, E, A, B, C, F)
dbcc ind (11,295008132,3)
dbcc traceon (3604)
dbcc page (11,1,249185,3)
dbcc traceoff (3604)
Again 93 index pages + 1 IAM and all 6 columns listed in leaf page, in specified order. (D, E, A, B, C, F)
What am I supposed to be seeing here?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 12, 2010 at 9:42 am
Wasn't presupposing that you'd see anything there. Was asking if you'd tried that. You just did, and it worked as you said it would. I hadn't had a chance to test that and was wondering if you had.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply