Primary Key fields as an Included Columns in a Composite Index

  • Hi there,

    I am seeking a general advise on whether it makes sense (or not) to include the fields of a primary key on a table to be in the INCLUDED COLUMNS on a composite index. To the best of my knowledge this seems like "overhead" as the composite index implicitly is "build" upon the clustered index - in this case the primary key.

    The reason I ask is that we have tables with duplicate address info (and tons of other fields) and we want to perform a selection returning all the person id's where the address occurs more than once across the table.

    USE [tempdb]

    GO

    /****** Object: Table [dbo].[Leads] Script Date: 03/08/2011 15:15:43 ******/

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Leads]') AND type in (N'U'))

    DROP TABLE [dbo].[Leads]

    GO

    USE [tempdb]

    GO

    /****** Object: Table [dbo].[Leads] Script Date: 03/08/2011 15:15:43 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[Leads](

    [LeadId] [bigint] IDENTITY(1,1) NOT NULL,

    [Email] [nvarchar](50) NULL,

    [Name] [nvarchar](60) NULL,

    [Address] [nvarchar](100) NULL,

    [ZipCode] [nvarchar](20) NULL,

    CONSTRAINT [PK_Leads] PRIMARY KEY CLUSTERED

    (

    [LeadId] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    USE [tempdb]

    /****** Object: Index [IX_Leads_Address] Script Date: 03/08/2011 15:15:43 ******/

    CREATE NONCLUSTERED INDEX [IX_Leads_Address] ON [dbo].[Leads]

    (

    [Address] ASC,

    [ZipCode] ASC

    )

    INCLUDE ( [LeadId]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]

    GO

    INSERT INTO [tempdb].[dbo].[Leads] (

    [Name]

    ,[Address]

    ,[ZipCode]

    )

    SELECT 'Peter', 'Cumbersome Road 101', 'DK-1111'

    UNION

    SELECT 'Liz', 'Cumbersome Road 101', 'DK-1111'

    UNION

    SELECT 'Mike', 'East Street 508', 'DK-1234'

    GO

    WITH dedup

    AS

    (

    SELECT

    [DedupKey] = ROW_NUMBER() OVER (

    PARTITION BY

    [Address]

    ,[ZipCode]

    ORDER BY

    [LeadId]

    )

    , *

    FROM [dbo].[Leads]

    )

    SELECT

    *

    FROM dedup

    WHERE

    [DedupKey] > 1

    The reason I ask is that according to general recommendations for composite indexes it is beneficial to have the non-key columns in the "SELECT..." clause added to the index as INCLUDED COLUMN to avoid having to scan the tables data pages. But does this also apply when returning the primary key values?

    Regards,

    Claus Thorning Madsen

  • Right, the PK is already included in every NCI in a table with a clustered index, so the include is not needed/desired.

    See the est exec plan with the include and without and notice they are the same (no clustered index scan/seek, no bookmark, etc).

    Jim

    Jim Murphy
    http://www.sqlwatchmen.com
    @SQLMurph

  • Sure. Why would it not?

    SQL's not stupid enough to add them twice if you explicitly specify them. Also, what happens if the clustered index is moved?

    btw, it's the clustered index columns that are relevant, not the primary key columns. They are not the same.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • [Jim].[dba].[Murphy] (3/8/2011)


    Right, the PK is already included in every NCI in a table with a clustered index, so the include is not needed/desired.

    The clustered index key columns are included in every nonclustered index. Not the primary key columns. Primary key != clustered index.

    Why is it not desired?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi again,

    I know that a Primary Key is not the same as a clustered index. But in my current scenario they are the same. That is: [LeadId] is a primary key with a clustered index upon it.

    I know that if [LeadId] had not been the clustered index, then it would make a lot of sense to add it to my INCLUDED COLUMNS list for the composite index to avoid going for the data pages for the table to get the values for the field.

    But when the field IS the clustered index I will now conclude that having the field in the INCLUDED COLUMNS list for the composite index only makes the index "wider" without adding any beneficial performance. It only makes my index maintenance "heavier". Correct?

    Regards,

    Claus

  • Gail,

    Boy, I'm battin a million today. I misspoke and meant the CI value, not the PK value. I definitely know the difference.

    I need to measure twice and cut once today. Er. Everyday. 🙂

    Jim Murphy
    http://www.sqlwatchmen.com
    @SQLMurph

  • clausm73 (3/8/2011)


    But when the field IS the clustered index I will now conclude that having the field in the INCLUDED COLUMNS list for the composite index only makes the index "wider" without adding any beneficial performance. It only makes my index maintenance "heavier". Correct?

    Incorrect.

    The column is already there implicitly. Explicitly specifying it will NOT result in it being in the index twice. SQL is not that stupid. It knows that the column it's implicitly adding is the same as the one you've specified and hence it will not add it a second time.

    I recommend explicitly adding the clustered key column if it is needed. One for clarity, so that you don't have to look at two indexes to see if the NC index covers the query or not. The second for 'future-proofing' so that if someone in the future moves the clustered index the nonclustered does not suddenly become not covering for that important query

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thank you both.

    Now I finally understood it - and also learned that a more appropriate subject title would have been "Clustered Index fields as an Included Columns in a Composite Index".

    Gail, I like your point ajour having the field explicitly added to the composite index. This will be my "Best Practise" recommendation to my colleagues going forward.

    Thanks again for the clarifications.

    Claus Thorning Madsen

  • clausm73 (3/8/2011)


    Gail, I like your point ajour having the field explicitly added to the composite index. This will be my "Best Practise" recommendation to my colleagues going forward.

    I agree. I took note of this great tip as well. It makes two handfuls of sense. Thanks Gail!

    Jim

    Jim Murphy
    http://www.sqlwatchmen.com
    @SQLMurph

  • GilaMonster (3/8/2011)


    I recommend explicitly adding the clustered key column if it is needed. One for clarity, so that you don't have to look at two indexes to see if the NC index covers the query or not. The second for 'future-proofing' so that if someone in the future moves the clustered index the nonclustered does not suddenly become not covering for that important query

    Thanks Gail, as usual. As much as I've batted around the NC/CI inclusion routine, I think I just realized why about 6/7 years ago I broke a client's system when I rebuilt a CI in a different way on a massive Fact table due to your comment. The system was too critical to allow me to test why at the time.

    I believe I'll be following this practice going forward for NC indexes that require the CI involved... for the next poor soul.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply