February 27, 2015 at 12:01 pm
Hello,
After reading some comments here I decided to look at tables to see if any had a clustered index that was a unique identifier. Yep. So if I have a table with a unique identifier as the primary key/clustered index and an identity column that is indexed, I would like to make the identity a clustered index (maybe even the primary key) and make the unique identifier a unique non-clustered index (not the primary key).
Does this sound reasonable?
If I do this will I need to drop and recreate the other indexes? Or maybe just rebuild the other indexes?
Currently:
CREATE TABLE Payments (
IDX INT IDENTITY(1,1) NOT NULL,
GUID UNIQUEIDENTIFIER NOT NULL DEFAULT(NEWID()),
..... -- many other columns
);
GO
ALTER TABLE [dbo].[PAYMENTS] ADD CONSTRAINT [PK_PAYMENTS_GID] PRIMARY KEY CLUSTERED ([GUID] ASC);
GO
CREATE NONCLUSTERED INDEX [IX_Payments_ID] ON [dbo].[PAYMENTS] ([IDX] ASC);
GO
Would like:
ALTER TABLE [dbo].[PAYMENTS] ADD CONSTRAINT [PK_PAYMENTS_IDX] PRIMARY KEY CLUSTERED (IDX ASC);
GO
CREATE UNIQUE NONCLUSTERED INDEX [IX_Payments_GUID] ON [dbo].[PAYMENTS] (GUID ASC);
GO
Thank you
February 27, 2015 at 12:45 pm
February 27, 2015 at 12:50 pm
Before making such a critical choice, review SQL's missing index and index usage stats. Odds are, you have a better choice than an identity column for the clustering index key. That is, more than 50% of the time, overall performance is better clustering on something other than just a meaningless identity value.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 27, 2015 at 1:00 pm
I'm not a fan of GUIDs, but they are not inherently evil. Before you replace it, I'd find out why you have it. If there's no good reason, then by all means, rip it out. But be sure there's not a good purpose there.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 27, 2015 at 1:09 pm
The only unique things in the table are IDX and GUID. Using a combination of columns it is still not possible to get a unique combination. Since this table has been around for about ten years, it is not normalized, but is used everywhere. 🙁
February 27, 2015 at 1:12 pm
Grant Fritchey (2/27/2015)
I'm not a fan of GUIDs, but they are not inherently evil. Before you replace it, I'd find out why you have it. If there's no good reason, then by all means, rip it out. But be sure there's not a good purpose there.
I will keep the GUID column but just change from primary/clustered to unique non-clustered index.
The other option is to change the clustered index and keep the GUID the primary key.
February 27, 2015 at 1:20 pm
djj (2/27/2015)
The only unique things in the table are IDX and GUID. Using a combination of columns it is still not possible to get a unique combination. Since this table has been around for about ten years, it is not normalized, but is used everywhere. 🙁
Your clustered index key(s) doesn't(don't) have to be unique by itself(themselves); SQL will take care of that if it needs to.
And that's a secondary concern. The most critical performance factor is that you get the best clustered index for that table.
This code will show you the unused index stats and the index usage stats for the table(s).
--USE [<your_db_name_here>] --naturally make sure you are in the right db
SET DEADLOCK_PRIORITY LOW --probably irrelevant, but just in case
DECLARE @list_missing_indexes bit
DECLARE @table_name_pattern sysname
--NOTE: showing missing indexes can take some time; set to 0 if you don't want to wait.
SET @list_missing_indexes = 1
SET @table_name_pattern = '%' --<<-- !your table name/pattern goes here!
--SET @table_name_pattern = '%'
PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)
IF @list_missing_indexes = 1
BEGIN
SELECT
GETDATE() AS capture_date,
DB_NAME(mid.database_id) AS Db_Name,
dps.row_count,
OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
mid.equality_columns, mid.inequality_columns, mid.included_columns,
user_seeks, user_scans, ca1.max_days_active, unique_compiles,
last_user_seek, last_user_scan, avg_total_user_cost, avg_user_impact,
system_seeks, system_scans, last_system_seek, last_system_scan, avg_total_system_cost, avg_system_impact,
mid.statement, mid.object_id, mid.index_handle
FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)
CROSS APPLY (
SELECT DATEDIFF(DAY, create_date, GETDATE()) AS max_days_active FROM sys.databases WHERE name = 'tempdb'
) AS ca1
LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON
mig.index_handle = mid.index_handle
LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON
migs.group_handle = mig.index_group_handle
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = mid.object_id AND
dps.index_id IN (0, 1)
--order by
--DB_NAME, Table_Name, equality_columns
WHERE
1 = 1
AND mid.database_id = DB_ID() --only current db
AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern
--AND mid.object_id IN (OBJECT_ID('<table_name_1>'), OBJECT_ID('<table_name_2>'))
ORDER BY
--avg_total_user_cost * (user_seeks + user_scans) DESC,
Db_Name, Table_Name, equality_columns, inequality_columns
END --IF
PRINT 'Midpoint @ ' + CONVERT(varchar(30), GETDATE(), 120)
-- list index usage stats (seeks, scans, etc.)
SELECT
ius2.row_num, DB_NAME() AS db_name,
CASE WHEN i.name LIKE ca2.table_name + '%'
THEN '~' + SUBSTRING(i.name, LEN(ca2.table_name) + 1, 200)
ELSE i.name END AS index_name,
CASE WHEN i.is_unique = 0 THEN 'N' ELSE 'Y' END + '.' +
CASE WHEN i.is_primary_key = 0 AND i.is_unique_constraint = 0 THEN 'N' ELSE 'Y' END AS [uniq?],
ca2.table_name,
i.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
dps.row_count,
SUBSTRING(key_cols, 3, 8000) AS key_cols, SUBSTRING(nonkey_cols, 3, 8000) AS nonkey_cols,
ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
fk.Reference_Count AS fk_ref_count,
DATEDIFF(DAY, CASE WHEN o.create_date > ca1.sql_startup_date THEN o.create_date
ELSE ca1.sql_startup_date END, GETDATE()) AS max_days_active,
FILEGROUP_NAME(i.data_space_id) AS filegroup_name,
ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
o.object_id = i.object_id
CROSS JOIN (
SELECT create_date AS sql_startup_date FROM sys.databases WHERE name = 'tempdb'
) AS ca1
CROSS APPLY (
SELECT OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name
) AS ca2
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal > 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
ic.key_ordinal
FOR XML PATH('')
) AS key_cols (key_cols)
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal = 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
COL_NAME(object_id, ic.column_id)
FOR XML PATH('')
) AS nonkey_cols (nonkey_cols)
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = i.object_id AND
dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON
ius.database_id = DB_ID() AND
ius.object_id = i.object_id AND
ius.index_id = i.index_id
LEFT OUTER JOIN (
SELECT
database_id, object_id, MAX(user_scans) AS user_scans,
ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans
FROM sys.dm_db_index_usage_stats WITH (NOLOCK)
WHERE
database_id = DB_ID()
--AND index_id > 0
GROUP BY
database_id, object_id
) AS ius2 ON
ius2.database_id = DB_ID() AND
ius2.object_id = i.object_id
LEFT OUTER JOIN (
SELECT
referenced_object_id, COUNT(*) AS Reference_Count
FROM sys.foreign_keys
WHERE
is_disabled = 0
GROUP BY
referenced_object_id
) AS fk ON
fk.referenced_object_id = i.object_id
WHERE
i.object_id > 100 AND
i.is_hypothetical = 0 AND
i.type IN (0, 1, 2) AND
o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND
(
o.name LIKE @table_name_pattern AND
o.name NOT LIKE 'dtprop%' AND
o.name NOT LIKE 'filestream[_]' AND
o.name NOT LIKE 'MSpeer%' AND
o.name NOT LIKE 'MSpub%' AND
--o.name NOT LIKE 'queue[_]%' AND
o.name NOT LIKE 'sys%'
)
--AND OBJECT_NAME(i.object_id /*, DB_ID()*/) IN ('tbl1', 'tbl2', 'tbl3')
ORDER BY
--row_count DESC,
--ius.user_scans DESC,
--ius2.row_num, --user_scans&|user_seeks
db_name, table_name,
-- list clustered index first, if any, then other index(es)
CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END,
key_cols
PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)
SET DEADLOCK_PRIORITY NORMAL
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 27, 2015 at 1:27 pm
djj (2/27/2015)
Grant Fritchey (2/27/2015)
I'm not a fan of GUIDs, but they are not inherently evil. Before you replace it, I'd find out why you have it. If there's no good reason, then by all means, rip it out. But be sure there's not a good purpose there.I will keep the GUID column but just change from primary/clustered to unique non-clustered index.
The other option is to change the clustered index and keep the GUID the primary key.
My initial comment was about that GUID being a Cluster Index, not dropping the GUID, if your app really uses it. A GUID as a Cluster Index is bad! The fragmentation will kill you.
If that GUID should be a unique NCI or PK, it's up to you. They internally act pretty much the same, except that primary key column cannot be nullable, unique index column can, I think. But for some third party apps and if you are trying to adhere to the relational model, a PK should be deployed in SQL as an actual PK.
February 27, 2015 at 1:41 pm
Thanks everyone.
Scott, I will try your code later as I am getting ready to leave. 😀
February 27, 2015 at 2:51 pm
ScottPletcher (2/27/2015)
Your clustered index key(s) doesn't(don't) have to be unique by itself(themselves); SQL will take care of that if it needs to.
... at possibly quite some cost if the clustered index isn't unique thanks to the "uniquefier" column that will appear for non-unique rows.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 27, 2015 at 3:02 pm
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Your clustered index key(s) doesn't(don't) have to be unique by itself(themselves); SQL will take care of that if it needs to.... at possibly quite some cost if the clustered index isn't unique thanks to the "uniquefier" column that will appear for non-unique rows.
Or at virtually no cost, depending on what % of rows have dups, how many dups there are, and how many rows fit on a page. But, again, all those are secondary concerns compared to the primary concern of getting the best clustering index key(s).
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 27, 2015 at 3:47 pm
ScottPletcher (2/27/2015)
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Your clustered index key(s) doesn't(don't) have to be unique by itself(themselves); SQL will take care of that if it needs to.... at possibly quite some cost if the clustered index isn't unique thanks to the "uniquefier" column that will appear for non-unique rows.
Or at virtually no cost, depending on what % of rows have dups, how many dups there are, and how many rows fit on a page. But, again, all those are secondary concerns compared to the primary concern of getting the best clustering index key(s).
I disagree, Scott. Those should all be primary concerns for selecting what the best clustered index is. As you say, there may be no cost to using a non-unique clustered index... or there may be a huge cost that could actually be repaired fairly easily (using methods you hate, oddly enough) but, if you don't know, then any change is a shot in the dark especially on large tables.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 27, 2015 at 3:50 pm
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Your clustered index key(s) doesn't(don't) have to be unique by itself(themselves); SQL will take care of that if it needs to.... at possibly quite some cost if the clustered index isn't unique thanks to the "uniquefier" column that will appear for non-unique rows.
Or at virtually no cost, depending on what % of rows have dups, how many dups there are, and how many rows fit on a page. But, again, all those are secondary concerns compared to the primary concern of getting the best clustering index key(s).
I disagree, Scott. Those should all be primary concerns for selecting what the best clustered index is. As you say, there may be no cost to using a non-unique clustered index... or there may be a huge cost that could actually be repaired fairly easily (using methods you hate, oddly enough) but, if you don't know, then any change is a shot in the dark especially on large tables.
Don't be so obsessed with the INSERT process. Each row is Selected 100x or more for its 1 INSERT. Thus, I believe getting the best clustered index keys is first; you can deal with duplicates, if necessary, later, except perhaps on very large tables. Often avoiding gazillions of covering indexes will gain you much more than you lose by duplicate key values.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 27, 2015 at 3:57 pm
ScottPletcher (2/27/2015)
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Your clustered index key(s) doesn't(don't) have to be unique by itself(themselves); SQL will take care of that if it needs to.... at possibly quite some cost if the clustered index isn't unique thanks to the "uniquefier" column that will appear for non-unique rows.
Or at virtually no cost, depending on what % of rows have dups, how many dups there are, and how many rows fit on a page. But, again, all those are secondary concerns compared to the primary concern of getting the best clustering index key(s).
I disagree, Scott. Those should all be primary concerns for selecting what the best clustered index is. As you say, there may be no cost to using a non-unique clustered index... or there may be a huge cost that could actually be repaired fairly easily (using methods you hate, oddly enough) but, if you don't know, then any change is a shot in the dark especially on large tables.
Don't be so obsessed with the INSERT process. Each row is Selected 100x or more for its 1 INSERT. Thus, I believe getting the best clustered index keys is first; you can deal with duplicates, if necessary, later, except perhaps on very large tables. Often avoiding gazillions of covering indexes will gain you much more than you lose by duplicate key values.
You say that but I've seen systems that have been instantly and absolutely paralyzed just by someone adding the wrong index. My personal example is that I single handedly paralyzed a part of Expedia.com for a short period just by adding a the wrong index that was more concerned with SELECT performance than INSERT performance.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 27, 2015 at 4:00 pm
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Jeff Moden (2/27/2015)
ScottPletcher (2/27/2015)
Your clustered index key(s) doesn't(don't) have to be unique by itself(themselves); SQL will take care of that if it needs to.... at possibly quite some cost if the clustered index isn't unique thanks to the "uniquefier" column that will appear for non-unique rows.
Or at virtually no cost, depending on what % of rows have dups, how many dups there are, and how many rows fit on a page. But, again, all those are secondary concerns compared to the primary concern of getting the best clustering index key(s).
I disagree, Scott. Those should all be primary concerns for selecting what the best clustered index is. As you say, there may be no cost to using a non-unique clustered index... or there may be a huge cost that could actually be repaired fairly easily (using methods you hate, oddly enough) but, if you don't know, then any change is a shot in the dark especially on large tables.
Don't be so obsessed with the INSERT process. Each row is Selected 100x or more for its 1 INSERT. Thus, I believe getting the best clustered index keys is first; you can deal with duplicates, if necessary, later, except perhaps on very large tables. Often avoiding gazillions of covering indexes will gain you much more than you lose by duplicate key values.
You say that but I've seen systems that have been instantly and absolutely paralyzed just by someone adding the wrong index. My personal example is that I single handedly paralyzed a part of Expedia.com for a short period just by adding a the wrong index that was more concerned with SELECT performance than INSERT performance.
Can't believe an index on a single key value that is explicitly specified in every user request could be "bad" or "wrong". Seems bizarre to me to consider clustering on anything else given that info going on, barring extremely unusual insert patterns, except based on a pre-determined obsession with using only "narrow, ever-increasing, ..." clustering key values.
I've also not had an index based on careful analysis of missing index and index usage stats, and a few other stats I review, cause me such an issue.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 15 posts - 1 through 15 (of 26 total)
You must be logged in to reply to this topic. Login to reply