Indexing and Primary Keys

Question

Indexing and Primary Keys

Sioban Krzywicki

One Orange Chip

Points: 27770
More actions
September 14, 2010 at 1:51 pm

#229291

If I have one table with a Primary Key on columns B and C
CREATE TABLE Main
(B uniqueidentifier NOT NULL,
C int NOT NULL,
D varchar(24) NULL,
E bit NOT NULL,
CONSTRAINT MainKey PRIMARY KEY CLUSTERED (B, C))
and a second table with a Primary Key on columns A, B and C
CREATE TABLE Sub
(A int NOT NULL,
B uniqueidentifier NOT NULL,
C int NOT NULL,
F varchar(42) NULL,
G smallmoney NOT NULL,
H bit NULL SPARSE,
CONSTRAINT SubKey PRIMARY KEY CLUSTERED(A, B, C),
CONSTRAINT FK_Sub_Main FOREIGN KEY (B, C) REFERENCES Main(B, C))
The second table has a Foreign Key to the first on B and C
I then query both tables
SELECT M.B, M.C, M.E, S.F, S.G
FROM Main M
INNER JOIN Sub S
ON M.B = S.B
AND M.C = S.C
Should I have an index on B and C on the second table?
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

WayneS SSC Guru Points: 95460 More actions · Answer 1

The way it's written, yes. However, if you can make the PK be on (B, C, A), then you wouldn't need to.

Edit: also, the index on B,C isn't for just the query. When you delete from the first table, it has to verify that there aren't any child records in the second table. Having an index on those columns will make that faster also.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Sioban Krzywicki One Orange Chip Points: 27770 More actions · Answer 2

WayneS (9/14/2010)
The way it's written, yes. However, if you can make the PK be on (B, C, A), then you wouldn't need to.
Edit: also, the index on B,C isn't for just the query. When you delete from the first table, it has to verify that there aren't any child records in the second table. Having an index on those columns will make that faster also.

Just to make sure I have it right, you're saying if I make the second table

CREATE TABLE Sub

(A int NOT NULL,

B uniqueidentifier NOT NULL,

C int NOT NULL,

F varchar(42) NULL,

G smallmoney NOT NULL,

H bit NULL SPARSE,

CONSTRAINT SubKey PRIMARY KEY CLUSTERED(B, C, A),

CONSTRAINT FK_Sub_Main FOREIGN KEY (B, C) REFERENCES Main(B, C))

then I don't need an index on B, C on the second table for the stated query, but I should have one there anyway for checks on the table for deletes from the main table.

If I add an index on the Sub table on columns B and C, should I still reconfigure hte Primary Key to B, C, A?

--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Stefan Krzywicki (9/14/2010)
then I don't need an index on B, C on the second table for the stated query, but I should have one there anyway for checks on the table for deletes from the main table.

No, because an index on B, C will be redundant with the PK, being a left-based subset of the index key

If I add an index on the Sub table on columns B and C, should I still reconfigure hte Primary Key to B, C, A?

Do one or the other. Unless there are queries that filter only on A, it's probably more efficient to re-order the PK than to add a second index.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

HowardW One Orange Chip Points: 29109 More actions · Answer 4

Just a general point, you should be careful using a uniqueidentifier as part of a Clustered Index. Due to the random nature of Unique Identifiers, you can end up with a lot of fragmentation as new rows can be inserted at any location within the clustered index, rather than at the end.

Sioban Krzywicki One Orange Chip Points: 27770 More actions · Answer 5

Thanks Gail, appreciate the advice. I'll reorder the PK today.

--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams

Sioban Krzywicki One Orange Chip Points: 27770 More actions · Answer 6

HowardW (9/15/2010)
Just a general point, you should be careful using a uniqueidentifier as part of a Clustered Index. Due to the random nature of Unique Identifiers, you can end up with a lot of fragmentation as new rows can be inserted at any location within the clustered index, rather than at the end.

Yeah, this one I know, but thanks! Always appreciate the help. Unfortunately, I'm stuck with uniqueidentifiers for the time being. I'm re-indexing the main table every day after the once-a-day bulk insert. I have to! Fragmentation in that table goes up to over 40% every day.

--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams

WayneS SSC Guru Points: 95460 More actions · Answer 7

One other point to make: the keys that make up the clustered index will be used in all other indexes. You might want to consider having an identity column that is the clustered PK, and then using an UNIQUE index on B,C,A. A uniqueidentifier in all indexes sounds to me like an awful lot of IO on updates, and more IO when querying due to wider pages, etc.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

WayneS (9/15/2010)
One other point to make: the keys that make up the clustered index will be used in all other indexes. You might want to consider having an identity column that is the clustered PK, and then using an UNIQUE index on B,C,A.

It is a little wider than I would typically recommend for a clustered index. That plus the uniqueidentifier... Failing on 2 of the 4 recommendations for a cluster at least.

That said, you can keep the 3 columns as the pk, just make it a nonclustered pk and put the cluster elsewhere.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Sioban Krzywicki One Orange Chip Points: 27770 More actions · Answer 9

GilaMonster (9/15/2010)
WayneS (9/15/2010)
One other point to make: the keys that make up the clustered index will be used in all other indexes. You might want to consider having an identity column that is the clustered PK, and then using an UNIQUE index on B,C,A.
It is a little wider than I would typically recommend for a clustered index. That plus the uniqueidentifier... Failing on 2 of the 4 recommendations for a cluster at least.
That said, you can keep the 3 columns as the pk, just make it a nonclustered pk and put the cluster elsewhere.

I normally stay away from multi-column keys. I don't like them at all. However, these tables are partitioned and you have to include a partitioned column in the primary key. At best, I could use an IDENTITY plus the partitioned column (B) as the PK and then a unique index on the original B, A. I'm also getting this data from a 3rd party that I have no control over, so I can't get rid of the uniqueidentifier or change the subtable key from the combination uniqueidentifier+int that it currently is. Perhaps in time I'll be able to shift that to a foreign key with a bigint replacing it, but not for the forseeable future.

Once we've finished the current upgrade, I'll bring up that possibility, but my manager won't want to change that yet because of all the underlying changes that would have to go along with it.

The good thing is that there are no tables that reference this table.

Do you have a link for the recommendations for a cluster that you mentioned? I'd love to read more on it.

--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

Stefan Krzywicki (9/15/2010)
Do you have a link for the recommendations for a cluster that you mentioned? I'd love to read more on it.

Is it a valid reference if I wrote it?

http://www.sqlservercentral.com/articles/Indexing/68563/

If you want my sources, try Kimberly Tripp's blog (http://www.sqlskills.com/blogs/kimberly) I'm sure it's there somewhere.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Sioban Krzywicki One Orange Chip Points: 27770 More actions · Answer 11

GilaMonster (9/15/2010)
Stefan Krzywicki (9/15/2010)
Do you have a link for the recommendations for a cluster that you mentioned? I'd love to read more on it.
Is it a valid reference if I wrote it?
http://www.sqlservercentral.com/articles/Indexing/68563/
If you want my sources, try Kimberly Tripp's blog (http://www.sqlskills.com/blogs/kimberly) I'm sure it's there somewhere.

I'd say it likely is. Thanks!

--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams