clustered constraint vs clustered index

  • OK - here is some code in case I did anything wrong.

    CREATE TABLE taLocation1 (
        LocID BIGINT NOT NULL,          --Identify the Location
        LocCode VARCHAR(10) NOT NULL,   --Human friendly code
        LocDesc VARCHAR(100) NULL,      --Description of the Location
        CONSTRAINT PK_taLocation1 PRIMARY KEY ( LocID )   --Clustered Index on Primary Key
    )

    GO

    CREATE UNIQUE INDEX IX_taLocByCode ON taLocation1 ( LocCode )

    GO

    CREATE TABLE taLocation2 (
        LocID BIGINT NOT NULL,          --Identify the Location
        LocCode VARCHAR(10) NOT NULL,   --Human friendly code
        LocDesc VARCHAR(100) NULL,      --Description of the Location
        CONSTRAINT PK_taLocation2 PRIMARY KEY ( LocID ),   --Clustered Index on Primary Key
        CONSTRAINT taLocCode_Unique UNIQUE NONCLUSTERED ( LocCode )
    )

    GO

    --Routine to fill the two tables with values of 1 through 10000
    DECLARE @loopVar INT, @aCode VARCHAR(10), @aDesc VARCHAR(50)
    SET @loopVar = 0
    WHILE @loopVar < 10000
    BEGIN
        --Code is the last 4 digits - zeros prepended for fun
        SELECT @loopVar = @loopVar + 1,
               @aCode = RIGHT( CONVERT( VARCHAR(5), 10000 + @loopVar ), 4 ),
               @aDesc = 'Description for# ' + CONVERT( VARCHAR(5), @loopVar )
        INSERT INTO taLocation1 ( LocID, LocCode, LocDesc )
        VALUES ( @loopVar, @aCode, @aDesc )
        INSERT INTO taLocation2 ( LocID, LocCode, LocDesc )
        VALUES ( @loopVar, @aCode, @aDesc )
    END

    GO

    -- NOW the testing can begin

    --Routine to check for repeatedly failing insertions
    DECLARE @stTime DATETIME, @loopVar INT, @aCode VARCHAR(10), @aDesc VARCHAR(50),
            @newID INT, @errCnt INT
    SET NOCOUNT ON
    SELECT @loopVar = 550, @errCnt = 0, @stTime = GETDATE()
    WHILE @loopVar < 1550
    BEGIN
        SELECT @loopVar = @loopVar + 1,
               @newID = @loopVar + 10000,   --to avoid duplicate ID values
               @aCode = RIGHT( CONVERT( VARCHAR(5), 10000 + @loopVar ), 4 ),   --Duplicate Code
               @aDesc = 'Description for# ' + CONVERT( VARCHAR(5), @loopVar )

        --NOTE - change this to taLocation2 to test difference between INDEX and CONSTRAINT
        --Stop and restart SQL Service between each test

        --Code Set A
        INSERT INTO taLocation1 ( LocID, LocCode, LocDesc )
        VALUES ( @newID, @aCode, @aDesc )
        IF @@ERROR <> 0
            SET @errCnt = @errCnt + 1

        --Code Set B
        --IF ( SELECT COUNT(*) FROM taLocation1 WHERE LocCode = @aCode ) = 0
        --    INSERT INTO taLocation1 ( LocID, LocCode, LocDesc )
        --    VALUES ( @newID, @aCode, @aDesc )
        --ELSE
        --    SET @errCnt = @errCnt + 1
    END
    SELECT DATEDIFF( MS, @stTime, GETDATE()) 'Milliseconds', @errCnt 'Errors'

    Let me know if I have overlooked anything that makes the test invalid.

    Guarddata-

  • In my opinion, this is not really a question of performance. Yes, performance does differ, but the real difference is that a constraint is a logical construction, whilst an index is a physical implementation. Therefore I think a UNIQUE CONSTRAINT is always preferable to a UNIQUE INDEX, since it defines an integrity constraint in the logical model. If you're looking only at the logical model of a database (which is really all you should need to do if you're a programmer who uses the database), you would have no idea that there is a unique index on a table, and you could very well write code that wouldn't work. A database user should never need to be concerned with the physical implementation.

    A while back I showed in a discussion http://www.sqlservercentral.com/forum/topic.asp?TOPIC_ID=8907&FORUM_ID=8&CAT_ID=1&Topic_Title=UNIQUE%20constraint%20with%20multiple%20NULLs&Forum_Title=T-SQL that SQL Server actually handles unique constraints incorrectly, probably because it uses a unique index to enforce them, when it really should use a unique predicate, which however doesn't exist in SQL Server.
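    For anyone who hasn't read that thread, here is a minimal sketch of the behaviour being discussed (the table and constraint names below are invented just for the demo). SQL Server treats NULLs as equal when enforcing a unique constraint, so a second NULL is rejected, whereas the standard's unique predicate would allow any number of NULLs:

    CREATE TABLE dbo.NullDemo (
        Val INT NULL,
        CONSTRAINT UQ_NullDemo_Val UNIQUE ( Val )
    )
    GO
    INSERT INTO dbo.NullDemo ( Val ) VALUES ( 1 )      -- succeeds
    INSERT INTO dbo.NullDemo ( Val ) VALUES ( NULL )   -- succeeds (first NULL)
    INSERT INTO dbo.NullDemo ( Val ) VALUES ( NULL )   -- fails with a duplicate key error,
                                                       -- even though the standard would allow a second NULL
    GO
    DROP TABLE dbo.NullDemo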

    --

    Chris Hedgate @ Apptus Technologies (http://www.apptus.se)

    http://www.sql.nu

  • Now, I don't know which one to believe. But Antares has 4000+ posts, so I'm going to stick with unique constraint. That's my logic.

    And I can't believe I missed that part in BOL. I printed 10 pages from SQL BOL 7. But obviously I didn't print the overview page!

  • Hey Tomi, 4000+ posts doesn't mean I am right; I do make mistakes.

    -----------------------------------------

    Hey Chris, realized I had lost your thread along the way. I checked around and came across this http://support.microsoft.com/default.aspx?scid=kb;en-us;322002

    However, I think it was an MS choice, not really compliance with SQL-99, that causes "Unique Constraints" in SQL Server to behave this way.

    -----------------------------------------

    Guarddata, as for your test, it is valid but not necessarily accurate. The big thing I see as a problem with it is the data cached in memory. Most times you would not be doing a loop to INSERT data into a main table, so that data may have time to be purged from memory and hard-disk reads will occur; this is especially true for very large tables and servers under heavy load. I think if you freed the cache and buffers between each loop (see the sketch below) you might see a difference. I do, however, agree that prechecking is a good way to deal with potential errors so you can handle them your own way. But that means an extra read per process that in many cases is not needed. I prefer letting the error be thrown back unless I need to do something special with the data that caused the error. If all you are going to do is tell the client an error occurred, then the extra read is actually going to hurt overall performance.
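    For reference, clearing the caches between runs (rather than restarting the SQL Server service each time) might look roughly like this; all three statements exist in SQL Server 2000 and later, and the DBCC commands need elevated permissions:

    CHECKPOINT              -- write dirty pages to disk so the data cache can be fully emptied
    DBCC DROPCLEANBUFFERS   -- remove clean pages from the buffer cache (forces physical reads on the next run)
    DBCC FREEPROCCACHE      -- clear cached execution plans
    GO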

    I know others will disagree, but consider this:

    50000 records that would be fine each have to do an initial SELECT to validate, then the INSERT or UPDATE. Then along comes one that will fail; you see this in the check and return an error. And this is your traffic roughly every hour.

    Performance-wise, not doing the check saves 50000 extra reads; the one that fails due to not being unique will throw an error, which the client will handle. Again, error logic for non-specific needs is far better handled in the client, after the fact, than with a check-then-perform.

    Even on a system that makes 10 changes a day you are still hurting performance, even though it is on a small scale.

  • Having 4000+ posts doesn't mean I'm always right either? Bummer!

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • I'm saving these two pages in my personal collection. I think I've saved hundreds of internet pages over a couple of years now just because I was afraid that they would disappear. Hm, if I made a search engine it would probably be bigger than Google.

    /Tomi

  • Don't worry Tomi - I have the greatest respect for Antares, Andy and a host of others.

    Antares - you are right about the extra reads and that is an important consideration. However, even if I change the test to fail only once in 1000, checking first is still faster. Once in 50000 would probably be lower as you said.

    This discussion is probably more applicable to Foreign Key Constraints than Unique keys. We found that removing the FK and adding the code to our stored procedures improved our throughput tremendously. Probably also has a little to do with the data.

    Anyway - this little horse is too beat for me. My apologies for muddying the waters.

    Guarddata-

  • Thank you GuardData for the kind words.

    And I don't think you muddied the water at all. You gave a valid point and I just countered. Your solution is common among many developers, and I have often questioned whether checking first this way has all upsides, or whether there are downsides and what they might be. Also, again, many choices are purely personal in style and processing method, but nonetheless valid, with potential for both good and bad DB management.

    Always contribute because as a whole we learn that way.

  • quote:


    Hey Chris, realized I had lost your thread along the way. I checked around and came across this http://support.microsoft.com/default.aspx?scid=kb;en-us;322002

    However, I think it was an MS choice, not really compliance with SQL-99, that causes "Unique Constraints" in SQL Server to behave this way.


    Antares, thanks for that KB; strange that I hadn't found it the last time we were discussing this. Although it doesn't state why MS chose to go against the standard, it does show some nice ways to 'fake' a (standard) unique constraint. I especially liked the second solution (computed column), since it allows me to actually define a unique constraint at the logical design level.
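    Very roughly, the computed-column idea looks like the sketch below (all names here are invented for the illustration, and the exact solution in the KB may differ in detail): when the nullable column is NULL, the computed column substitutes a value derived from the primary key, so multiple NULLs no longer collide under the unique constraint. The substitute value has to be chosen so it cannot clash with real data, and on versions that refuse a constraint on a computed column the same effect can be had with a unique index on it instead.

    CREATE TABLE dbo.Customers (
        CustID INT NOT NULL CONSTRAINT PK_Customers PRIMARY KEY,
        SSN VARCHAR(11) NULL,
        -- when SSN is NULL, fall back to a key-derived value so more than one NULL is allowed
        SSN_unique AS ( CASE WHEN SSN IS NULL
                             THEN '#' + CONVERT( VARCHAR(10), CustID )
                             ELSE SSN END ),
        CONSTRAINT UQ_Customers_SSN UNIQUE ( SSN_unique )
    )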

    Regarding this discussion, does anyone have any comments on my thoughts that this is not a performance issue, but rather an issue of correct logical design?

    --

    Chris Hedgate @ Apptus Technologies (http://www.apptus.se)

    http://www.sql.nu

  • quote:


    Regarding this discussion, does anyone have any comments on my thoughts that this is not a performance issue, but rather an issue of correct logical design?


    I would say correct and incorrect.

    Correct in that it does pertain to the logical design of a DB, but incorrect in that it is not performance related.

    The reason is that all indexing choices are performance related. However, does one of these actually and truly have more performance impact than the other? If so, specifically what?

    When I suggested that a constraint works like any other constraint, evaluated before the insert takes place, the point was that you get results faster with less server overhead.

    Proper index choices are always considered more of a performance impact than a design issue.

  • quote:


    Proper index choices are always considered more of a performance impact than a design issue.


    Ah, OK, that I agree with. What I meant was that I don't see the performance differences (whichever is better) as the issue; I would always choose a unique constraint because it is a logical construct, thereby showing users that there is a constraint in place. With only a unique index on the column(s), users would have no idea from looking at the logical schema of a table that there is a restriction on what rows can be inserted. If, as I said in the above referenced discussion, SQL Server had implemented unique constraints using a unique predicate, without any index used (at least none seen by users), the distinction would have been clear. These are entirely different 'things'. I really don't see any reason to use unique indexes except for the automatic ones used for unique constraints. Someone could of course come up with some exception to this, but generally I wouldn't use 'manually created' unique indexes.

    --

    Chris Hedgate @ Apptus Technologies (http://www.apptus.se)

    http://www.sql.nu

  • Guarddata wrote:

    quote:


    This discussion is probably more applicable to Foreign Key Constraints than Unique keys. We found that removing the FK and adding the code to our stored procedures improved our throughput tremendously. Probably also has a little to do with the data.


    You actually removed the FK constraints? Didn't that mess up your logical model?

    The db that I'm working with at the moment has very few FK constraints (they are implemented in the sprocs) and it's very hard to write reports due to the relationships not being immediately obvious.
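    For readers who haven't seen that pattern, "FK in the sprocs" usually means something like the following sketch (table, column and procedure names are all made up here): the insert procedure checks for the parent row itself instead of relying on a declarative FOREIGN KEY. Note that without a transaction and appropriate locking this check is not as airtight as the real constraint, and the relationship is invisible to anyone reading only the schema.

    CREATE PROCEDURE dbo.usp_InsertOrder
        @OrderID INT,
        @CustID INT
    AS
    BEGIN
        -- the relationship check a FOREIGN KEY would normally do
        IF NOT EXISTS ( SELECT 1 FROM dbo.Customers WHERE CustID = @CustID )
        BEGIN
            RAISERROR( 'Customer %d does not exist.', 16, 1, @CustID )
            RETURN 1
        END
        INSERT INTO dbo.Orders ( OrderID, CustID )
        VALUES ( @OrderID, @CustID )
        RETURN 0
    END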

  • Just some words of clarification:

    quote:


    mromm wrote:

    A constraint is added to the column definition like this:

    CREATE TABLE table_name (
        abc int NOT NULL
        CONSTRAINT constraint_name UNIQUE CLUSTERED
    )

    An index is created separately from the table definition like this:

    CREATE UNIQUE CLUSTERED INDEX index_name ON table_name ( abc )


    Of course you can define the constraint separately from the table definition as well, with the syntax: alter table xxx add constraint...
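    Using the names from the quoted example above, that would look something like this (a sketch only):

    ALTER TABLE table_name
        ADD CONSTRAINT constraint_name UNIQUE CLUSTERED ( abc )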



    Bye
    Gabor

  • Just to clarify for SQL Server 2000: you can also specify the filegroup for the primary key constraint at the time you create it. An earlier post indicated you might only be able to do this when creating an index.

    ALTER TABLE <TABLE_NAME>
        [ WITH CHECK | WITH NOCHECK ]
        ADD CONSTRAINT <CONSTRAINT_NAME>
        PRIMARY KEY [ CLUSTERED | NONCLUSTERED ] ( <COLUMN_LIST> )
        ON <FILEGROUP>
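    A concrete version of that might look like the following; the table, column and filegroup names are invented for the example, and the filegroup is assumed to already exist in the database:

    ALTER TABLE dbo.SomeTable
        ADD CONSTRAINT PK_SomeTable
        PRIMARY KEY CLUSTERED ( SomeID )
        ON [SECONDARY]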

    Also, be aware that if you create a clustered constraint or index:

    1) If you omit the filegroup, it will be placed in the same filegroup as the table.

    2) If you specify a filegroup for the PK that is different from the table's filegroup, the table will be moved to the filegroup you specify for the PK.

  • I don't think there's much difference between the two, because SQL Server actually creates a clustered index to enforce a clustered unique constraint. Create a clustered unique constraint, then query sysindexes by the constraint name. You'll see that an index was created with indid=1, i.e. a clustered index.
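    Using the taLocation2 table from the script earlier in this thread, that check looks roughly like this (sysindexes is the SQL Server 2000-era catalog table; newer versions expose the same information through sys.indexes):

    SELECT name, indid
    FROM sysindexes
    WHERE id = OBJECT_ID( 'taLocation2' )
    -- PK_taLocation2 shows up with indid = 1 (the clustered index);
    -- taLocCode_Unique appears as a nonclustered index with indid >= 2.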
