what's the performance difference between unique clustered index and primary key

  • CreateIndexNonclustered wrote:

    Let us work on your comprehension and failures in logic and reasoning then. 

    How about let's work on your ability to state things without inferring meaning?

    You stated "A clustered Primary key is a unique clustered index." That's pretty clear.  If it was a mathematical formula, it would be Unique Index = Primary Key. And that would be incorrect.

    Using your stated logic, "All squares are rectangles, but all rectangles are not squares" means the same as "All clustered primary keys are unique clustered indexes, but all unique clustered indexes are not clustered primary keys".  Nope.  All clustered primary keys are NEVER unique clustered indexes.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • CreateIndexNonclustered wrote:

    Michael L John wrote:

    CreateIndexNonclustered wrote:

    I mean you literally quoted all the constraints I mentioned on it I cannot possibly imagine how you failed to understand what I wrote.

    I read everything you wrote.  A unique index is not the same as a primary key.

    You also stated 'I don't think I would recommend clustering something other than the primary key unless you know exactly what results you are going to get out of it".  That reads as a blanket recommendation that the primary key should be the clustered index.

    You also stated "I dont think there is a justifiable argument to forgo placing a primary key on a table.".  I can argue any number of cases where a primary key should be left off.  They are certainly specific examples, but again, that reads like a blanket recommendation.

    It is a blanket. Excluding a primary key provides no benefit, while excluding it can create myriad problems in the future when one is needed. a four byte integer key uses only 16 Gb of disk space when all values are exhausted. It isn't space or i/o that it is saving. If the table will truly never be related to anything else, then why is it in a relational database engine, the single most costly form of database available, and not in a cheap document database?

    Really.

    Tell me what benefit a primary key provides on a staging table in an ETL process?  You may or may not want to have one.  Again, IT DEPENDS.

    There is only one absolute when it comes to computer systems.  It uses electricity.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Michael L John wrote:

    CreateIndexNonclustered wrote:

    Let us work on your comprehension and failures in logic and reasoning then. 

    How about let's work on your ability to state things without inferring meaning?

    You stated "A clustered Primary key is a unique clustered index." That's pretty clear.  If it was a mathematical formula, it would be Unique Index = Primary Key. And that would be incorrect.

    Using your stated logic, "All squares are rectangles, but all rectangles are not squares" means the same as "All clustered primary keys are unique clustered indexes, but all unique clustered indexes are not clustered primary keys".  Nope.  All clustered primary keys are NEVER unique clustered indexes.

    I didn't infer anything, we aren't discussing math. I would also love to see an example of a non unique clustered primary key.

     

    Michael L John wrote:

    CreateIndexNonclustered wrote:

    Michael L John wrote:

    CreateIndexNonclustered wrote:

    I mean you literally quoted all the constraints I mentioned on it I cannot possibly imagine how you failed to understand what I wrote.

    I read everything you wrote.  A unique index is not the same as a primary key.

    You also stated 'I don't think I would recommend clustering something other than the primary key unless you know exactly what results you are going to get out of it".  That reads as a blanket recommendation that the primary key should be the clustered index.

    You also stated "I dont think there is a justifiable argument to forgo placing a primary key on a table.".  I can argue any number of cases where a primary key should be left off.  They are certainly specific examples, but again, that reads like a blanket recommendation.

    It is a blanket. Excluding a primary key provides no benefit, while excluding it can create myriad problems in the future when one is needed. a four byte integer key uses only 16 Gb of disk space when all values are exhausted. It isn't space or i/o that it is saving. If the table will truly never be related to anything else, then why is it in a relational database engine, the single most costly form of database available, and not in a cheap document database?

    Really.

    Tell me what benefit a primary key provides on a staging table in an ETL process?  You may or may not want to have one.  Again, IT DEPENDS.

    There is only one absolute when it comes to computer systems.  It uses electricity.

    At face value, none. When the ETL requires refactoring down the road requiring implementation of a primary key, and the entire pipeline from immediately upstream and everything downstream needs to be refactored, its value is extreme, while excluding it saves 4-8 bytes per record. 1/64th to 1/32nd the size of the smallest possible i/o size SQL can support.

    I'm not saying a pk is always useful, I am saying that primary keys prevent a lot of potential future challenges at irrelevant cost to any sql server built in the last 10-15 years

  • A PK is not necessarily an irrelevant cost.  If I have 2B rows, for example, the "free" PK costs me at least 16GB of row and storage space.

    Some tables simply don't require a PK.  If you reasonably think one could be needed later, you could go ahead and add it.  But otherwise there's no need to have a PK.

    A PK cannot be altered/rebuilt on the fly, unlike a unique clustered index.  That can cause extra work if you ever have to change the PK.

    As for "always" using an identity column as the PK, that's the worst db practice in the world that somehow is considered "standard" by many people.  Particularly since by default a PK will be clustered if it is built first and you don't specify otherwise.  The vast majority of tables in nearly all dbs should not be clustered by identity.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • It's 16 Gb. My laptop from 10-11 years ago had double that amount of ram. My phone from 2010ish with its 480x800 resolution screen had that amount of storage.

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. Referential integrity would have to be discarded with no protections ensuring the data remains intact while it is rebuilt.

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

     

  • CreateIndexNonclustered wrote:

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. 

    Things change. Requirements change. Business rules change.

    CreateIndexNonclustered wrote:

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

    Please.  Keep designing your databases like this, with hard and fast rules that are always true!!! I've made a boatload of money over the years fixing short-sighted database designs like this.  Thank you for your contribution to my retirement.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • CreateIndexNonclustered wrote:

    It's 16 Gb. My laptop from 10-11 years ago had double that amount of ram. My phone from 2010ish with its 480x800 resolution screen had that amount of storage.

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. Referential integrity would have to be discarded with no protections ensuring the data remains intact while it is rebuilt.

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

    Way too rigid thinking.

    For example, for a typical U.S. state abbreviation/lookup table, I use the standard 2-char state abbrev as the key.  Zero reason to introduce a surrogate here.  Absolutely zero.  It just needlessly complicates the table.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    CreateIndexNonclustered wrote:

    It's 16 Gb. My laptop from 10-11 years ago had double that amount of ram. My phone from 2010ish with its 480x800 resolution screen had that amount of storage.

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. Referential integrity would have to be discarded with no protections ensuring the data remains intact while it is rebuilt.

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

    Way too rigid thinking.

    For example, for a typical U.S. state abbreviation/lookup table, I use the standard 2-char state abbrev as the key.  Zero reason to introduce a surrogate here.  Absolutely zero.  It just needlessly complicates the table.

    Agree. Another example. FIPS codes. ISO standard codes for many things.  The list is large. Why reinvent the wheel.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • ScottPletcher wrote:

    CreateIndexNonclustered wrote:

    It's 16 Gb. My laptop from 10-11 years ago had double that amount of ram. My phone from 2010ish with its 480x800 resolution screen had that amount of storage.

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. Referential integrity would have to be discarded with no protections ensuring the data remains intact while it is rebuilt.

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

    Way too rigid thinking.

    For example, for a typical U.S. state abbreviation/lookup table, I use the standard 2-char state abbrev as the key.  Zero reason to introduce a surrogate here.  Absolutely zero.  It just needlessly complicates the table.

     

    And when someone invariably types a state wrong? Or when a state code changes? And whose state codes are you going to choose? Do you use the post office state codes that were formalized in 1963? Do you use the department of commerce codes that were in use until the Reagan administration? Do you use NB for Nebraska which was the post office Nebraska state code from 1963-1969, or do you use NE which has been the post office state code since 1969? When the next system of state codes is released that may not have an obvious connection to the existing state codes, do you use whatever set of codes you are using now to confuse everyone who doesn't understand the connection long after that decision has been made? Or do you take the application down to change it to the new codes?

    I have seen this exact folly in ported legacy applications that used to have a binder for code lookups and how the interpretation of the codes changed - except no one has had that binder since 1999 and there is only one person in the department who remembers what all of them mean.

    Nevermind the inconsistency between using surrogates and natural keys throughout the model.

     

    Michael L John wrote:

    CreateIndexNonclustered wrote:

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. 

    Things change. Requirements change. Business rules change.

    CreateIndexNonclustered wrote:

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

    Please.  Keep designing your databases like this, with hard and fast rules that are always true!!! I've made a boatload of money over the years fixing short-sighted database designs like this.  Thank you for your contribution to my retirement.

    There is no business rule that can redefine the relationship between entities. Two entities don't just change their mind one day and decide they are going to relate to each other differently. A requirement change may change the entities in an application, but creating new entities and new relations, vs changing the relations between existing entities is not the same thing.

    And yes I will keep designing databases with surrogate keys. My databases will always maintain their referential integrity no matter what requirement changes happen.

     

    re: fairly standard codes - what do you do when the specification for rjindael becomes AES? What about when something like a time zone changing names and a framework decides it is a new name for the same time zone and not a new time zone?

  • CreateIndexNonclustered wrote:

    ScottPletcher wrote:

    CreateIndexNonclustered wrote:

    It's 16 Gb. My laptop from 10-11 years ago had double that amount of ram. My phone from 2010ish with its 480x800 resolution screen had that amount of storage.

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. Referential integrity would have to be discarded with no protections ensuring the data remains intact while it is rebuilt.

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

    Way too rigid thinking.

    For example, for a typical U.S. state abbreviation/lookup table, I use the standard 2-char state abbrev as the key.  Zero reason to introduce a surrogate here.  Absolutely zero.  It just needlessly complicates the table.

    And when someone invariably types a state wrong? Or when a state code changes? And whose state codes are you going to choose? Do you use the post office state codes that were formalized in 1963? Do you use the department of commerce codes that were in use until the Reagan administration? Do you use NB for Nebraska which was the post office Nebraska state code from 1963-1969, or do you use NE which has been the post office state code since 1969? When the next system of state codes is released that may not have an obvious connection to the existing state codes, do you use whatever set of codes you are using now to confuse everyone who doesn't understand the connection long after that decision has been made? Or do you take the application down to change it to the new codes?

    I have seen this exact folly in ported legacy applications that used to have a binder for code lookups and how the interpretation of the codes changed - except no one has had that binder since 1999 and there is only one person in the department who remembers what all of them mean.

    Nevermind the inconsistency between using surrogates and natural keys throughout the model.

    Michael L John wrote:

    CreateIndexNonclustered wrote:

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. 

    Things change. Requirements change. Business rules change.

    CreateIndexNonclustered wrote:

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

    Please.  Keep designing your databases like this, with hard and fast rules that are always true!!! I've made a boatload of money over the years fixing short-sighted database designs like this.  Thank you for your contribution to my retirement.

    There is no business rule that can redefine the relationship between entities. Two entities don't just change their mind one day and decide they are going to relate to each other differently. A requirement change may change the entities in an application, but creating new entities and new relations, vs changing the relations between existing entities is not the same thing.

    And yes I will keep designing databases with surrogate keys. My databases will always maintain their referential integrity no matter what requirement changes happen.

    re: fairly standard codes - what do you do when the specification for rjindael becomes AES? What about when something like a time zone changing names and a framework decides it is a new name for the same time zone and not a new time zone?

    I guess if you fantasize that UPSP standards aren't standards, you would have to use a number.  Everyone in the country should know that state TX represents.  What is state 24 or 33?  No one knows, only you.  That's just a horrible design idea for a table like that.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Michael L John wrote:

    ScottPletcher wrote:

    CreateIndexNonclustered wrote:

    It's 16 Gb. My laptop from 10-11 years ago had double that amount of ram. My phone from 2010ish with its 480x800 resolution screen had that amount of storage.

    Maybe you have an example, but I cannot think of a reason why you would need to change a primary key after a table has been created, while leaving the existing table in place. Referential integrity would have to be discarded with no protections ensuring the data remains intact while it is rebuilt.

    If by using an identity as the primary key you mean using a natural key instead - will never agree with any data design that involves natural keys. Use of natural keys only serves to disregard referential integrity. If you mean using a different surrogate like a guid - ok - as long as it is purely oltp and has few range scans. I'd love to have an app like that one day...

    Way too rigid thinking.

    For example, for a typical U.S. state abbreviation/lookup table, I use the standard 2-char state abbrev as the key.  Zero reason to introduce a surrogate here.  Absolutely zero.  It just needlessly complicates the table.

    Agree. Another example. FIPS codes. ISO standard codes for many things.  The list is large. Why reinvent the wheel.

    And SCAC (shipper) codes.  Yes, sometimes they do change, but you have to update them anyway.  It'd be idiotic to assign an id to a SCAC code.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • CreateIndexNonclustered wrote:

    There is no business rule that can redefine the relationship between entities. Two entities don't just change their mind one day and decide they are going to relate to each other differently. A requirement change may change the entities in an application, but creating new entities and new relations, vs changing the relations between existing entities is not the same thing.

    Again, never.  Your world must be very small.

    Here is an example.  We designed a system for transporting ADA eligible persons.  The initial rules were that the "clients reserved trip on a single vehicle type".  However, the rules on moving a person from point A to B at a given time changed.  The need was to allow the people to move from A, to B, C, D, and back to A, on multiple transportation modes.  A single point A to B entity (Trips) no longer worked. The relationships to the other entities needed to change.  The PK on this table was no longer a single  column "trip number".

    That's one example.  Yes, relationships do not change often.  But it does happen.  I have designed systems for many different industries.  The most common things that have forced changes were new regulations, reporting requirements, and BI.  We are working on a re-write of a medical system now.  Many of the entities no longer meet the various needs.

    The entire reason that both Scott and myself have criticized you is your insistence that there is one way, and one way only.  Yes, the systems you have worked on may fit the things you are adamant about perfectly.  But most of the systems I have worked on in nearly 30 years do not.

     

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Sorry, guys. Let me tell u what brought this up. We have several load processes where we are truncating and reloading large tables. These table are used for one purpose, so we know what it should be clustered on and this key will always be unique and not contain nulls.. The reason I asked about the primary key is because we typically create a clustered unique index on the relevant columns. Not a primary key. So I was concerned if we should be creating the index as a primary key as opposed to simply using a unique index. It was my understanding that they were the same. I realize the primary key doesn't allow nulls. I just didn't know if there was another purpose for the primary key under the hood of sql server. Sounds as though there isn't. I think we are going to change our processes to create these as primary keys as opposed to simple adding a unique key going forward just as good practice.

    To be honest,  I believe we were creating the unique key simply because we are scripting the keys creation after the load to the particular table and the below seemed simpler than the create unique key script was simpler that the alter table script for the primary key creation. I'm going to change our practice to use the primary key just to be safe.  These load tables typically only need one index. One clustered unique index and the key data is not going to be null.

    drop table if exists #temp

    select 1 id

    into #temp

    this is what we are currently doing

    create unique clustered index index1 on #temp(id)

    as opposed to

    ALTER TABLE #temp ADD CONSTRAINT [PK_Individual_CustomerID] PRIMARY KEY CLUSTERED

    (

    id ASC

    )

  • Snargables wrote:

    Sorry, guys. Let me tell u what brought this up. We have several load processes where we are truncating and reloading large tables. These table are used for one purpose, so we know what it should be clustered on and this key will always be unique and not contain nulls.. The reason I asked about the primary key is because we typically create a clustered unique index on the relevant columns. Not a primary key. So I was concerned if we should be creating the index as a primary key as opposed to simply using a unique index. It was my understanding that they were the same. I realize the primary key doesn't allow nulls. I just didn't know if there was another purpose for the primary key under the hood of sql server. Sounds as though there isn't. I think we are going to change our processes to create these as primary keys as opposed to simple adding a unique key going forward just as good practice.

    To be honest,  I believe we were creating the unique key simply because we are scripting the keys creation after the load to the particular table and the below seemed simpler than the create unique key script was simpler that the alter table script for the primary key creation. I'm going to change our practice to use the primary key just to be safe.  These load tables typically only need one index. One clustered unique index and the key data is not going to be null.

    drop table if exists #temp select 1 id into #temp

    this is what we are currently doing

    create unique clustered index index1 on #temp(id)

    as opposed to

    ALTER TABLE #temp ADD CONSTRAINT [PK_Individual_CustomerID] PRIMARY KEY CLUSTERED ( id ASC )

    For a load table, I think you are safe with either approach. The difference is when there are relationships created between tables.  You would prefer, although not required, to create them using the primary keys.

    I'm curious.  One of the supposed "best practices" is to have a table with no keys or indexes, load the table, and then add the keys and indexes.  The thinking is that the inserts are faster on a "bare" table.  They usually are.  But, I have changed a lot of ETL's to simply truncate the table, leaving the keys and indexes intact, and insert the data.  I've found that the benefit of the faster loads is wasted because the key/index creation takes far longer.  The last one we changed was a table that was 120 GB with abou14 million records. The actual load took 20-30 minutes.  Creating the indexes and keys took 4 hours.  When we did the load with the constraints in place, the load increased to just over an hour.

    Have you every done any testing of these?  You may find one table works best one way, and another works best in another way.

     

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Typically you create only the clustered index prior to load, and any non-clustered indexes after the load.  That's generally the best-performing approach, although theoretically it could be different for a specific table(s).

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply