September 14, 2012 at 4:34 pm
how to find duplicate indexes in a table ,example a nonclustered index names FX_LOCATION_ID is created on columns a,b and again there is a nonclustered index named FX_LOCATION_ID1 on the same columns a,b in a table.Now i want to find the duplicate indexes on all the tables in a database,does anybody has any sql script in handy that i can use.
September 16, 2012 at 7:34 pm
I think that there are probably faster and better ways, but I think this is one way:
;WITH IndexColumns AS (
SELECT DISTINCT a.object_id, a.name, a.type_desc, b.column_id, TABLE_NAME=c.name, [COL NAME]=d.name
FROM sys.indexes a
INNER JOIN sys.index_columns b ON a.object_id = b.object_id AND a.index_id = b.index_id
INNER JOIN sys.tables c ON b.object_id = c.object_id
INNER JOIN sys.columns d ON c.object_id = d.object_id AND b.column_id = d.column_id
WHERE is_hypothetical = 0
),
CombineCols AS (
SELECT object_id, name, type_desc, table_name
,columns=STUFF((
SELECT ',' + [COL NAME]
FROM IndexColumns b
WHERE a.object_id = b.object_id AND
a.name = b.name AND
a.type_desc = b.type_desc AND
a.TABLE_NAME = b.TABLE_NAME
ORDER BY [COL NAME]
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 1, '')
FROM IndexColumns a
GROUP BY object_id, name, type_desc, table_name)
SELECT b.type_desc, a.table_name, a.columns, b.name
FROM (
SELECT table_name, columns
FROM CombineCols
GROUP BY table_name, columns
HAVING COUNT(name) > 1) a
INNER JOIN CombineCols b
ON a.table_name = b.table_name AND
a.columns = b.columns
ORDER BY a.table_name, a.columns
It is not particularly fast but give it a chance and see what it gets you.
Edit: Improved the query speed, added remarks below and removed "hypothetical" indexes.
Remarks:
1) If a CLUSTERED index exists on the same fields as a non-CLUSTERED index, both are listed.
2) If two (or more) indexes exist involving the same fields but where the fields are ordered differently, these are also listed.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
September 17, 2012 at 7:20 am
thanks dwain that was very helpful and the query you provided is perfect for my scenario that i mentioned.
September 17, 2012 at 10:42 am
Hi,
The query doesn't differentiate a "normal" index column from an "include" index column...
In following statements idx_dupIndexes_03 and idx_dupIndexes_06 are considered duplicate but they're not.
CREATE TABLE dupIndexes (id1 INT, id2 INT, id3 INT, id4 INT)
CREATE INDEX idx_dupIndexes_01 ON dupIndexes (id1, id2)
CREATE INDEX idx_dupIndexes_02 ON dupIndexes (id2, id1)
CREATE INDEX idx_dupIndexes_03 ON dupIndexes (id2, id1) INCLUDE (id3)
CREATE INDEX idx_dupIndexes_04 ON dupIndexes (id2, id1) INCLUDE (id3, id4)
CREATE INDEX idx_dupIndexes_05 ON dupIndexes (id2, id1) INCLUDE (id4, id3)
CREATE INDEX idx_dupIndexes_06 ON dupIndexes (id2, id1, id3)
One more question, and in this one I might be saying nonsense...
In the following statements:
CREATE TABLE dupIndexes2 (id1 INT, id2 INT, id3 INT, id4 INT)
CREATE INDEX idx_dupIndexes2_01 ON dupIndexes2 (id1, id2)
CREATE INDEX idx_dupIndexes2_02 ON dupIndexes2 (id1, id2, id3)
should both indexes be considered equal since 02 has all the columns 01 has?
When I run "SELECT id1 FROM dupIndexes2 WHERE id1 = 1 AND id2 = 2" the sql uses idx_dupIndexes2_02 (according to the execution plan).
Thks,
Pedro
September 17, 2012 at 6:25 pm
PiMané (9/17/2012)
Hi,The query doesn't differentiate a "normal" index column from an "include" index column...
In following statements idx_dupIndexes_03 and idx_dupIndexes_06 are considered duplicate but they're not.
CREATE TABLE dupIndexes (id1 INT, id2 INT, id3 INT, id4 INT)
CREATE INDEX idx_dupIndexes_01 ON dupIndexes (id1, id2)
CREATE INDEX idx_dupIndexes_02 ON dupIndexes (id2, id1)
CREATE INDEX idx_dupIndexes_03 ON dupIndexes (id2, id1) INCLUDE (id3)
CREATE INDEX idx_dupIndexes_04 ON dupIndexes (id2, id1) INCLUDE (id3, id4)
CREATE INDEX idx_dupIndexes_05 ON dupIndexes (id2, id1) INCLUDE (id4, id3)
CREATE INDEX idx_dupIndexes_06 ON dupIndexes (id2, id1, id3)
One more question, and in this one I might be saying nonsense...
In the following statements:
CREATE TABLE dupIndexes2 (id1 INT, id2 INT, id3 INT, id4 INT)
CREATE INDEX idx_dupIndexes2_01 ON dupIndexes2 (id1, id2)
CREATE INDEX idx_dupIndexes2_02 ON dupIndexes2 (id1, id2, id3)
should both indexes be considered equal since 02 has all the columns 01 has?
When I run "SELECT id1 FROM dupIndexes2 WHERE id1 = 1 AND id2 = 2" the sql uses idx_dupIndexes2_02 (according to the execution plan).
Thks,
Pedro
Perhaps I should have noted that the query I provided was just something I threw together.
The two points you mention are probably valid for consideration. I wouldn't want to use the results of the query to simply DROP indexes that the script says are duplicates. A little analysis should be performed on the results to decide what is appropriate to DROP and what not.
Having said that though, the INCLUDEs can probably be resolved. An interesting point that I may look into, as I happen to need this for something else anyway.
I believe in the second case, the two indexes would not be considered duplicates (by my script). I believe I would call that a case of "overlapping indexes" rather than identical ones. A case that could probably be handled if needed.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
September 17, 2012 at 8:16 pm
achtro (9/17/2012)
thanks dwain that was very helpful and the query you provided is perfect for my scenario that i mentioned.
You are most welcome. As I said above, it turns out I needed something like this anyway. So the effort has served both of our purposes. 🙂
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
September 18, 2012 at 2:28 am
dwain.c (9/17/2012)
Perhaps I should have noted that the query I provided was just something I threw together.The two points you mention are probably valid for consideration. I wouldn't want to use the results of the query to simply DROP indexes that the script says are duplicates. A little analysis should be performed on the results to decide what is appropriate to DROP and what not.
Having said that though, the INCLUDEs can probably be resolved. An interesting point that I may look into, as I happen to need this for something else anyway.
Hi,
By no means I meant to question your query.. just wanted to alert to the fact that the include columns aren't considered as include but as normal columns.
I took the liberty to change the query so it does that.
WITH IndexColumns AS (
SELECT DISTINCT a.object_id, a.name, a.type_desc, b.column_id, TABLE_NAME=c.name, [COL NAME]=d.name, b.is_included_column
FROM sys.indexes a
INNER JOIN sys.index_columns b ON a.object_id = b.object_id AND a.index_id = b.index_id
INNER JOIN sys.tables c ON b.object_id = c.object_id
INNER JOIN sys.columns d ON c.object_id = d.object_id AND b.column_id = d.column_id
WHERE is_hypothetical = 0
),
CombineCols AS (
SELECT object_id, name, type_desc, table_name
,columns=STUFF((
SELECT ',' + [COL NAME]
FROM IndexColumns b
WHERE a.object_id = b.object_id AND
a.name = b.name AND
a.type_desc = b.type_desc AND
a.TABLE_NAME = b.TABLE_NAME AND b.is_included_column = 0
ORDER BY [COL NAME]
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 1, '')
,include_columns=ISNULL(STUFF((
SELECT ',' + [COL NAME]
FROM IndexColumns b
WHERE a.object_id = b.object_id AND
a.name = b.name AND
a.type_desc = b.type_desc AND
a.TABLE_NAME = b.TABLE_NAME AND b.is_included_column = 1
ORDER BY [COL NAME]
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 1, ''), '')
FROM IndexColumns a
GROUP BY object_id, name, type_desc, table_name)
SELECT b.type_desc, a.table_name, a.columns, a.include_columns, b.name
FROM (
SELECT table_name, columns, include_columns
FROM CombineCols
GROUP BY table_name, columns, include_columns
HAVING COUNT(name) > 1) a
INNER JOIN CombineCols b
ON a.table_name = b.table_name AND
a.columns = b.columns AND
a.include_columns = b.include_columns
ORDER BY a.table_name, a.columns
dwain.c (9/17/2012)
I believe in the second case, the two indexes would not be considered duplicates (by my script). I believe I would call that a case of "overlapping indexes" rather than identical ones. A case that could probably be handled if needed.
They are not considered duplicates, my question is completely out of this "subject"... A table with an index with (col1, col2) and another index with (col1, col2, col3) according to the execution plan always uses the 2nd index... Shouldn't "sub-indexes" be considered also duplicates?!
Thanks,
Pedro
September 18, 2012 at 6:25 am
Nice work on the query mod. Like I said, I figured that info must be somewhere in those tables I SELECTed against.
On your second point, I guess it depends on what your requirements are. If the query plan always chooses one over the other, it would make sense to at least mark these some way so they can be identified and analyzed later. I can't confirm whether that is always the case.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply