Primary Key fields as an Included Columns in a Composite Index

Question

Primary Key fields as an Included Columns in a Composite Index

clausm73

Mr or Mrs. 500

Points: 516
More actions
March 8, 2011 at 7:29 am

#233687

Hi there,
I am seeking a general advise on whether it makes sense (or not) to include the fields of a primary key on a table to be in the INCLUDED COLUMNS on a composite index. To the best of my knowledge this seems like "overhead" as the composite index implicitly is "build" upon the clustered index - in this case the primary key.
The reason I ask is that we have tables with duplicate address info (and tons of other fields) and we want to perform a selection returning all the person id's where the address occurs more than once across the table.
USE [tempdb]
GO
/****** Object: Table [dbo].[Leads] Script Date: 03/08/2011 15:15:43 ******/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Leads]') AND type in (N'U'))
DROP TABLE [dbo].[Leads]
GO
USE [tempdb]
GO
/****** Object: Table [dbo].[Leads] Script Date: 03/08/2011 15:15:43 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Leads](
[LeadId] [bigint] IDENTITY(1,1) NOT NULL,
[Email] [nvarchar](50) NULL,
[Name] [nvarchar](60) NULL,
[Address] [nvarchar](100) NULL,
[ZipCode] [nvarchar](20) NULL,
CONSTRAINT [PK_Leads] PRIMARY KEY CLUSTERED
(
[LeadId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
USE [tempdb]
/****** Object: Index [IX_Leads_Address] Script Date: 03/08/2011 15:15:43 ******/
CREATE NONCLUSTERED INDEX [IX_Leads_Address] ON [dbo].[Leads]
(
[Address] ASC,
[ZipCode] ASC
)
INCLUDE ( [LeadId]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
GO
INSERT INTO [tempdb].[dbo].[Leads] (
[Name]
,[Address]
,[ZipCode]
)
SELECT 'Peter', 'Cumbersome Road 101', 'DK-1111'
UNION
SELECT 'Liz', 'Cumbersome Road 101', 'DK-1111'
UNION
SELECT 'Mike', 'East Street 508', 'DK-1234'
GO
WITH dedup
AS
(
SELECT
[DedupKey] = ROW_NUMBER() OVER (
PARTITION BY
[Address]
,[ZipCode]
ORDER BY
[LeadId]
)
, *
FROM [dbo].[Leads]
)
SELECT
*
FROM dedup
WHERE
[DedupKey] > 1
The reason I ask is that according to general recommendations for composite indexes it is beneficial to have the non-key columns in the "SELECT..." clause added to the index as INCLUDED COLUMN to avoid having to scan the tables data pages. But does this also apply when returning the primary key values?
Regards,
Claus Thorning Madsen

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Jim Murphy SSCertifiable Points: 6263 More actions · Answer 1

Right, the PK is already included in every NCI in a table with a clustered index, so the include is not needed/desired.

See the est exec plan with the include and without and notice they are the same (no clustered index scan/seek, no bookmark, etc).

Jim

Jim Murphy
http://www.sqlwatchmen.com
@SQLMurph

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 2

Sure. Why would it not?

SQL's not stupid enough to add them twice if you explicitly specify them. Also, what happens if the clustered index is moved?

btw, it's the clustered index columns that are relevant, not the primary key columns. They are not the same.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 3

[Jim].[dba].[Murphy] (3/8/2011)
Right, the PK is already included in every NCI in a table with a clustered index, so the include is not needed/desired.

The clustered index key columns are included in every nonclustered index. Not the primary key columns. Primary key != clustered index.

Why is it not desired?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

clausm73 Mr or Mrs. 500 Points: 516 More actions · Answer 4

Hi again,

I know that a Primary Key is not the same as a clustered index. But in my current scenario they are the same. That is: [LeadId] is a primary key with a clustered index upon it.

I know that if [LeadId] had not been the clustered index, then it would make a lot of sense to add it to my INCLUDED COLUMNS list for the composite index to avoid going for the data pages for the table to get the values for the field.

But when the field IS the clustered index I will now conclude that having the field in the INCLUDED COLUMNS list for the composite index only makes the index "wider" without adding any beneficial performance. It only makes my index maintenance "heavier". Correct?

Regards,

Claus

Jim Murphy SSCertifiable Points: 6263 More actions · Answer 5

Gail,

Boy, I'm battin a million today. I misspoke and meant the CI value, not the PK value. I definitely know the difference.

I need to measure twice and cut once today. Er. Everyday. 🙂

Jim Murphy
http://www.sqlwatchmen.com
@SQLMurph

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 6

clausm73 (3/8/2011)
But when the field IS the clustered index I will now conclude that having the field in the INCLUDED COLUMNS list for the composite index only makes the index "wider" without adding any beneficial performance. It only makes my index maintenance "heavier". Correct?

Incorrect.

The column is already there implicitly. Explicitly specifying it will NOT result in it being in the index twice. SQL is not that stupid. It knows that the column it's implicitly adding is the same as the one you've specified and hence it will not add it a second time.

I recommend explicitly adding the clustered key column if it is needed. One for clarity, so that you don't have to look at two indexes to see if the NC index covers the query or not. The second for 'future-proofing' so that if someone in the future moves the clustered index the nonclustered does not suddenly become not covering for that important query

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

clausm73 Mr or Mrs. 500 Points: 516 More actions · Answer 7

Thank you both.

Now I finally understood it - and also learned that a more appropriate subject title would have been "Clustered Index fields as an Included Columns in a Composite Index".

Gail, I like your point ajour having the field explicitly added to the composite index. This will be my "Best Practise" recommendation to my colleagues going forward.

Thanks again for the clarifications.

Claus Thorning Madsen

Jim Murphy SSCertifiable Points: 6263 More actions · Answer 8

clausm73 (3/8/2011)
Gail, I like your point ajour having the field explicitly added to the composite index. This will be my "Best Practise" recommendation to my colleagues going forward.

I agree. I took note of this great tip as well. It makes two handfuls of sense. Thanks Gail!

Jim

Jim Murphy
http://www.sqlwatchmen.com
@SQLMurph

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 9

GilaMonster (3/8/2011)
I recommend explicitly adding the clustered key column if it is needed. One for clarity, so that you don't have to look at two indexes to see if the NC index covers the query or not. The second for 'future-proofing' so that if someone in the future moves the clustered index the nonclustered does not suddenly become not covering for that important query

Thanks Gail, as usual. As much as I've batted around the NC/CI inclusion routine, I think I just realized why about 6/7 years ago I broke a client's system when I rebuilt a CI in a different way on a massive Fact table due to your comment. The system was too critical to allow me to test why at the time.

I believe I'll be following this practice going forward for NC indexes that require the CI involved... for the next poor soul.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA