clustered constraint vs clustered index

  • OK - here is some code in case I did anything wrong.

    CREATE TABLE taLocation1 (
        LocID BIGINT NOT NULL,          --Identify the Location
        LocCode VARCHAR(10) NOT NULL,   --Human friendly code
        LocDesc VARCHAR(100) NULL,      --Description of the Location
        CONSTRAINT PK_taLocation1 PRIMARY KEY ( LocID )   --Clustered Index on Primary Key
    )

    GO

    CREATE UNIQUE INDEX IX_taLocByCode ON taLocation1 ( LocCode )

    GO

    CREATE TABLE taLocation2 (
        LocID BIGINT NOT NULL,          --Identify the Location
        LocCode VARCHAR(10) NOT NULL,   --Human friendly code
        LocDesc VARCHAR(100) NULL,      --Description of the Location
        CONSTRAINT PK_taLocation2 PRIMARY KEY ( LocID ),   --Clustered Index on Primary Key
        CONSTRAINT taLocCode_Unique UNIQUE NONCLUSTERED ( LocCode )
    )

    GO

    --Routine to fill the two tables with values of 1 through 10000
    DECLARE @loopVar INT, @aCode VARCHAR(10), @aDesc VARCHAR(50)
    SET @loopVar = 0
    WHILE @loopVar < 10000
    BEGIN
        --Code is the last 4 digits - zeros prepended for fun
        SELECT @loopVar = @loopVar + 1,
               @aCode = RIGHT( CONVERT( VARCHAR(5), 10000 + @loopVar ), 4 ),
               @aDesc = 'Description for# ' + CONVERT( VARCHAR(5), @loopVar )
        INSERT INTO taLocation1 ( LocID, LocCode, LocDesc )
        VALUES ( @loopVar, @aCode, @aDesc )
        INSERT INTO taLocation2 ( LocID, LocCode, LocDesc )
        VALUES ( @loopVar, @aCode, @aDesc )
    END

    GO

    -- NOW the testing can begin

    --Routine to check for repeatedly failing insertions
    DECLARE @stTime DATETIME, @loopVar INT, @aCode VARCHAR(10), @aDesc VARCHAR(50),
            @newID INT, @errCnt INT
    SET NOCOUNT ON
    SELECT @loopVar = 550, @errCnt = 0, @stTime = GETDATE()
    WHILE @loopVar < 1550
    BEGIN
        SELECT @loopVar = @loopVar + 1,
               @newID = @loopVar + 10000,   --to avoid duplicate ID values
               @aCode = RIGHT( CONVERT( VARCHAR(5), 10000 + @loopVar ), 4 ),   --Duplicate Code
               @aDesc = 'Description for# ' + CONVERT( VARCHAR(5), @loopVar )

        --NOTE - change this to taLocation2 to test difference between INDEX and CONSTRAINT
        --Stop and restart SQL Service between each test

        --Code Set A
        INSERT INTO taLocation1 ( LocID, LocCode, LocDesc )
        VALUES ( @newID, @aCode, @aDesc )
        IF @@ERROR <> 0
            SET @errCnt = @errCnt + 1

        --Code Set B
        --IF ( SELECT COUNT(*) FROM taLocation1 WHERE LocCode = @aCode ) = 0
        --    INSERT INTO taLocation1 ( LocID, LocCode, LocDesc )
        --    VALUES ( @newID, @aCode, @aDesc )
        --ELSE
        --    SET @errCnt = @errCnt + 1
    END
    SELECT DATEDIFF( MS, @stTime, GETDATE()) 'Milliseconds', @errCnt 'Errors'

    Let me know if I have overlooked anything that makes the test invalid.

    Guarddata-

  • In my opinion, this is not really a question of performance. Yes, performance does differ, but the real difference is that a constraint is a logical construction, whilst an index is a physical implementation. Therefore I think a UNIQUE CONSTRAINT is always preferable to a UNIQUE INDEX, since it defines an integrity constraint in the logical model. If you're looking only at the logical model of a database (which is really all you should need to do if you're a programmer who uses the database), you would have no idea that there is a unique index on a table, and you could very well write code that wouldn't work. A database user should never need to be concerned with the physical implementation.

    A while back I showed in a discussion http://www.sqlservercentral.com/forum/topic.asp?TOPIC_ID=8907&FORUM_ID=8&CAT_ID=1&Topic_Title=UNIQUE%20constraint%20with%20multiple%20NULLs&Forum_Title=T-SQL that SQL Server actually handles unique constraints incorrectly, probably because it uses a unique index to enforce them, when it really should use a unique predicate, which however doesn't exist in SQL Server.
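    For anyone who hasn't read that thread, here is a minimal sketch of the behaviour being discussed (the table and constraint names below are invented just for the demo). SQL Server treats NULLs as equal when enforcing a unique constraint, so a second NULL is rejected, whereas the standard's unique predicate would allow any number of NULLs:

    CREATE TABLE dbo.NullDemo (
        Val INT NULL,
        CONSTRAINT UQ_NullDemo_Val UNIQUE ( Val )
    )
    GO
    INSERT INTO dbo.NullDemo ( Val ) VALUES ( 1 )      -- succeeds
    INSERT INTO dbo.NullDemo ( Val ) VALUES ( NULL )   -- succeeds (first NULL)
    INSERT INTO dbo.NullDemo ( Val ) VALUES ( NULL )   -- fails with a duplicate key error,
                                                       -- even though the standard would allow a second NULL
    GO
    DROP TABLE dbo.NullDemo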

    --

    Chris Hedgate @ Apptus Technologies (http://www.apptus.se)

    http://www.sql.nu

  • Now, I don't know which one to believe. But Antares has 4000+ posts, so I'm going to stick with unique constraint. That's my logic.

    And I can't believe I missed that part in BOL. I printed 10 pages from SQL BOL 7. But obviously I didn't print the overview page!

  • Hey Tomi, 4000+ posts doesn't mean I am right; I do make mistakes.

    -----------------------------------------

    Hey Chris, realized I had lost your thread along the way. I checked around and came across this http://support.microsoft.com/default.aspx?scid=kb;en-us;322002

    However, I think it was an MS choice, not really compliance with SQL-99, that causes "Unique Constraints" in SQL Server to behave this way.

    -----------------------------------------

    Guarddata, as for your test, it is valid but not necessarily accurate. The big thing I see as a problem with it is the data cached in memory. Most times you would not be doing a loop to INSERT data into a main table, so that data may have time to be purged from memory and hard-disk reads will occur; this is especially true for very large tables and servers under heavy load. I think if you freed the cache and buffers between each loop (see the sketch below) you might see a difference. I do, however, agree that prechecking is a good way to deal with potential errors so you can handle them your own way. But that means an extra read per process that in many cases is not needed. I prefer letting the error be thrown back unless I need to do something special with the data that caused the error. If all you are going to do is tell the client an error occurred, then the extra read is actually going to hurt overall performance.
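    For reference, clearing the caches between runs (rather than restarting the SQL Server service each time) might look roughly like this; all three statements exist in SQL Server 2000 and later, and the DBCC commands need elevated permissions:

    CHECKPOINT              -- write dirty pages to disk so the data cache can be fully emptied
    DBCC DROPCLEANBUFFERS   -- remove clean pages from the buffer cache (forces physical reads on the next run)
    DBCC FREEPROCCACHE      -- clear cached execution plans
    GO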

    I know others will disagree, but consider this:

    50000 records that would be fine each have to do an initial SELECT to validate, then the INSERT or UPDATE. Then along comes one that will fail; you see this in the check and return an error. And this is your traffic roughly every hour.

    Performance-wise, not doing the check saves 50000 extra reads; the one that fails due to not being unique will throw an error, which the client will handle. Again, error logic for non-specific needs is far better handled in the client, after the fact, than with a check-then-perform.

    Even on a system that makes 10 changes a day you are still hurting performance, even though it is on a small scale.

  • Having 4000+ posts doesn't mean I'm always right either? Bummer!

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • I'm saving these two pages in my personal collection. I think I've saved hundreds of internet pages over a couple of years now just because I was afraid that they would disappear. Hm, if I made a search engine it would probably be bigger than Google.

    /Tomi

  • Don't worry Tomi - I have the greatest respect for Antares, Andy and a host of others.

    Antares - you are right about the extra reads and that is an important consideration. However, even if I change the test to fail only once in 1000, checking first is still faster. Once in 50000 would probably be lower as you said.

    This discussion is probably more applicable to Foreign Key Constraints than Unique keys. We found that removing the FK and adding the code to our stored procedures improved our throughput tremendously. Probably also has a little to do with the data.

    Anyway - this little horse is too beat for me. My apologies for muddying the waters.

    Guarddata-

  • Thank you GuardData for the kind words.

    And I don't think you muddied the water at all. You gave a valid point and I just countered. Your solution is common among many developers, and I have often questioned whether checking first this way has all upsides, or whether there are downsides and what they might be. Also, again, many choices are purely personal in style and processing method, but nonetheless valid, with potential for both good and bad DB management.

    Always contribute because as a whole we learn that way.

  • quote:


    Hey Chris, realized I had lost your thread along the way. I checked around and came across this http://support.microsoft.com/default.aspx?scid=kb;en-us;322002

    However, I think it was an MS choice, not really compliance with SQL-99, that causes "Unique Constraints" in SQL Server to behave this way.


    Antares, thanks for that KB; strange that I hadn't found it the last time we were discussing this. Although it doesn't state why MS chose to go against the standard, it does show some nice ways to 'fake' a (standard) unique constraint. I especially liked the second solution (computed column), since it allows me to actually define a unique constraint at the logical design level.
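    Very roughly, the computed-column idea looks like the sketch below (all names here are invented for the illustration, and the exact solution in the KB may differ in detail): when the nullable column is NULL, the computed column substitutes a value derived from the primary key, so multiple NULLs no longer collide under the unique constraint. The substitute value has to be chosen so it cannot clash with real data, and on versions that refuse a constraint on a computed column the same effect can be had with a unique index on it instead.

    CREATE TABLE dbo.Customers (
        CustID INT NOT NULL CONSTRAINT PK_Customers PRIMARY KEY,
        SSN VARCHAR(11) NULL,
        -- when SSN is NULL, fall back to a key-derived value so more than one NULL is allowed
        SSN_unique AS ( CASE WHEN SSN IS NULL
                             THEN '#' + CONVERT( VARCHAR(10), CustID )
                             ELSE SSN END ),
        CONSTRAINT UQ_Customers_SSN UNIQUE ( SSN_unique )
    )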

    Regarding this discussion, does anyone have any comments on my thoughts that this is not a performance issue, but rather an issue of correct logical design?

    --

    Chris Hedgate @ Apptus Technologies (http://www.apptus.se)

    http://www.sql.nu

  • quote:


    Regarding this discussion, does anyone have any comments on my thoughts that this is not a performance issue, but rather an issue of correct logical design?


    I would say correct and incorrect.

    Correct in that it does pertain to the logical design of a DB, but incorrect in that it is not performance related.

    The reason is that all indexing choices are performance related. However, does one of these actually and truly have more performance impact than the other? If so, specifically what?

    When I suggested that a constraint works like any other constraint, evaluated before the insert takes place, the point was that you get results faster with less server overhead.

    Proper index choices are always considered more of a performance impact than a design issue.

  • quote:


    Proper index choices are always considered more of a performance impact than a design issue.


    Ah, OK, that I agree with. What I meant was that I don't see the performance differences (whichever is better) as the issue; I would always choose a unique constraint because it is a logical construct, thereby showing users that there is a constraint in place. With only a unique index on the column(s), users would have no idea from looking at the logical schema of a table that there is a restriction on what rows can be inserted. If, as I said in the above referenced discussion, SQL Server had implemented unique constraints using a unique predicate, without any index used (at least none seen by users), the distinction would have been clear. These are entirely different 'things'. I really don't see any reason to use unique indexes except for the automatic ones used for unique constraints. Someone could of course come up with some exception to this, but generally I wouldn't use 'manually created' unique indexes.

    --

    Chris Hedgate @ Apptus Technologies (http://www.apptus.se)

    http://www.sql.nu

  • Guarddata wrote:

    quote:


    This discussion is probably more applicable to Foreign Key Constraints than Unique keys. We found that removing the FK and adding the code to our stored procedures improved our throughput tremendously. Probably also has a little to do with the data.


    You actually removed the FK constraints? Didn't that mess up your logical model?

    The db that I'm working with at the moment has very few FK constraints (they are implemented in the sprocs) and it's very hard to write reports due to the relationships not being immediately obvious.
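    For readers who haven't seen that pattern, "FK in the sprocs" usually means something like the following sketch (table, column and procedure names are all made up here): the insert procedure checks for the parent row itself instead of relying on a declarative FOREIGN KEY. Note that without a transaction and appropriate locking this check is not as airtight as the real constraint, and the relationship is invisible to anyone reading only the schema.

    CREATE PROCEDURE dbo.usp_InsertOrder
        @OrderID INT,
        @CustID INT
    AS
    BEGIN
        -- the relationship check a FOREIGN KEY would normally do
        IF NOT EXISTS ( SELECT 1 FROM dbo.Customers WHERE CustID = @CustID )
        BEGIN
            RAISERROR( 'Customer %d does not exist.', 16, 1, @CustID )
            RETURN 1
        END
        INSERT INTO dbo.Orders ( OrderID, CustID )
        VALUES ( @OrderID, @CustID )
        RETURN 0
    END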

  • Just some words of clarification:

    quote:


    mromm wrote:

    A constraint is added to the column definition like this:

    CREATE TABLE table_name (
        abc int NOT NULL
        CONSTRAINT constraint_name UNIQUE CLUSTERED
    )

    An index is created separately from the table definition like this:

    CREATE UNIQUE CLUSTERED INDEX index_name ON table_name ( abc )


    Of course you can define the constraint separately from the table definition as well, with the syntax: alter table xxx add constraint...
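    Using the names from the quoted example above, that would look something like this (a sketch only):

    ALTER TABLE table_name
        ADD CONSTRAINT constraint_name UNIQUE CLUSTERED ( abc )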



    Bye
    Gabor

  • Just to clarify for SQL Server 2000: you can also specify the filegroup for the primary key constraint at the time you create it. An earlier post indicated you might only be able to do this when creating an index.

    ALTER TABLE <TABLE_NAME>
        [ WITH CHECK | WITH NOCHECK ]
        ADD CONSTRAINT <CONSTRAINT_NAME>
        PRIMARY KEY [ CLUSTERED | NONCLUSTERED ] ( <COLUMN_LIST> )
        ON <FILEGROUP>
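    A concrete version of that might look like the following; the table, column and filegroup names are invented for the example, and the filegroup is assumed to already exist in the database:

    ALTER TABLE dbo.SomeTable
        ADD CONSTRAINT PK_SomeTable
        PRIMARY KEY CLUSTERED ( SomeID )
        ON [SECONDARY]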

    Also, be aware that if you create a clustered constraint or index:

    1) If you omit the filegroup, it will be placed in the same filegroup as the table.

    2) If you specify a filegroup for the PK that is different from the table's filegroup, the table will be moved to the filegroup you specify for the PK.

  • I don't think there's much difference between the two, because SQL Server actually creates a clustered index to enforce a clustered unique constraint. Create a clustered unique constraint, then query sysindexes by the constraint name. You'll see that an index was created with indid=1, i.e. a clustered index.
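    Using the taLocation2 table from the script earlier in this thread, that check looks roughly like this (sysindexes is the SQL Server 2000-era catalog table; newer versions expose the same information through sys.indexes):

    SELECT name, indid
    FROM sysindexes
    WHERE id = OBJECT_ID( 'taLocation2' )
    -- PK_taLocation2 shows up with indid = 1 (the clustered index);
    -- taLocCode_Unique appears as a nonclustered index with indid >= 2.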
