A Design Question - To Use an Identity column or not

Question

A Design Question - To Use an Identity column or not

meichner

SSCertifiable

Points: 5563
More actions
June 3, 2010 at 1:53 pm

#225342

I was wondering whether or not I should use an identity column or not in the following senerio:
I have two tables, Sales Order and SalesOrderDisposition. The Sales Order table has a primary key called SOID. The SOID is an identity column. There is a 1 to 1 relationship between a Sales Order and a SalesOrderDisposition.
I need to create the SalesOrderDisposition table. Would I be better off creating an identity column on this table in addition to a SOID that refers to the SalesOrder? I had been told that you should alway use an identity column as a primary key, but in this case it feels like overkill.
Thanks

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 1

If this is truly a one to one relationship, why are you creating a separate table? You should just be adding the disposition column to the sales order table.

With that said, if you have to create a separate table for this - I would not create an identity column on that table. There is no reason for it - and you have a perfectly valid column that is unique to identify the row.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Steve Jones - SSC Editor SSC Guru Points: 741471 More actions · Answer 2

I have done separation like this, vertical partitioning for some data. I don't see an issue with it, and like Jeffrey, I wouldn't add an identity to this. I'd just move over the SOID and make it a primary key.

Primary keys should uniquely identify the row. It has nothing to do with an identity value, and does not have to be a numeric value at all.

meichner SSCertifiable Points: 5563 More actions · Answer 3

Steve Jones - Editor (6/3/2010)
I have done separation like this, vertical partitioning for some data. I don't see an issue with it, and like Jeffrey, I wouldn't add an identity to this. I'd just move over the SOID and make it a primary key.
Primary keys should uniquely identify the row. It has nothing to do with an identity value, and does not have to be a numeric value at all.

I tend to agree with both you and Jeff. I just wanted to be sure before I created the table.

Thanks

PaulB-TheOneAndOnly SSC Guru Points: 66772 More actions · Answer 4

meichner (6/3/2010)
I was wondering whether or not I should use an identity column or not in the following senerio:
I have two tables, Sales Order and SalesOrderDisposition. The Sales Order table has a primary key called SOID. The SOID is an identity column. There is a 1 to 1 relationship between a Sales Order and a SalesOrderDisposition.
I need to create the SalesOrderDisposition table. Would I be better off creating an identity column on this table in addition to a SOID that refers to the SalesOrder? I had been told that you should alway use an identity column as a primary key, but in this case it feels like overkill.

I agree with Jeffrey on this one. A 1-to-1 relationship usually means both tables can be merged into a single one.

In regards to PKs, let me go a little deeper. In general I personally like to rely in natural keys as opposed to surrogate keys. I assume there is a SalesOrderNumber column in the SalesOrder table so, why don't use SalesOrderNumber as the PK?

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

meichner SSCertifiable Points: 5563 More actions · Answer 5

PaulB-TheOneAndOnly (6/4/2010)
meichner (6/3/2010)
I was wondering whether or not I should use an identity column or not in the following senerio:
I have two tables, Sales Order and SalesOrderDisposition. The Sales Order table has a primary key called SOID. The SOID is an identity column. There is a 1 to 1 relationship between a Sales Order and a SalesOrderDisposition.
I need to create the SalesOrderDisposition table. Would I be better off creating an identity column on this table in addition to a SOID that refers to the SalesOrder? I had been told that you should alway use an identity column as a primary key, but in this case it feels like overkill.
I agree with Jeffrey on this one. A 1-to-1 relationship usually means both tables can be merged into a single one.
In regards to PKs, let me go a little deeper. In general I personally like to rely in natural keys as opposed to surrogate keys. I assume there is a SalesOrderNumber column in the SalesOrder table so, why don't use SalesOrderNumber as the PK?

I had been told that all tables should use an identity key. I feel that when someone tells you that you should always do something, that they are usually wrong. No one solution fits all problems. That was the reason for the post. Therefore I am using the Sales Order ID as the primary key. I will not be using an identity field in this table.

I decided to create a separate table because the idea of adding these fields to the sales order didn't occur to me until I posted my question. While I could correct that mistake, I am still not sure if I would be better off creating a separate table. This is because the processes that load this extra data are not required. Therefore if I added in these new fields, they would all contain nulls in the majority of the records.

Due to this would I be better of creating the new table or better off adding the new fields to the Sales Order table?

Thanks

PaulB-TheOneAndOnly SSC Guru Points: 66772 More actions · Answer 6

meichner (6/5/2010)
I had been told that all tables should use an identity key. I feel that when someone tells you that you should always do something, that they are usually wrong. No one solution fits all problems. That was the reason for the post. Therefore I am using the Sales Order ID as the primary key. I will not be using an identity field in this table.

You have my vote on this, when possible I tend to use a natural key. 🙂

meichner (6/5/2010)
I decided to create a separate table because the idea of adding these fields to the sales order didn't occur to me until I posted my question. While I could correct that mistake, I am still not sure if I would be better off creating a separate table. This is because the processes that load this extra data are not required. Therefore if I added in these new fields, they would all contain nulls in the majority of the records.
Due to this would I be better of creating the new table or better off adding the new fields to the Sales Order table?

My personal opinion - when I find a 1-to-1 relationship - is to merge both tables into one meaning, add the new columns to the already existing table.

For existing queries - when retrieving the new data - only needed change is to add new columns to the select list, the alternative is to add a table to FROM clause and appropriate conditions to the predicate... on all queries that are supposed to retrieve the new data.

On top of it... I really don't like to see a 1-to-1 relationship in my model; it feels weird to me 😉

Just my 2 cents.

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

meichner SSCertifiable Points: 5563 More actions · Answer 7

PaulB-TheOneAndOnly (6/5/2010)
meichner (6/5/2010)
I had been told that all tables should use an identity key. I feel that when someone tells you that you should always do something, that they are usually wrong. No one solution fits all problems. That was the reason for the post. Therefore I am using the Sales Order ID as the primary key. I will not be using an identity field in this table.
You have my vote on this, when possible I tend to use a natural key. 🙂
meichner (6/5/2010)
I decided to create a separate table because the idea of adding these fields to the sales order didn't occur to me until I posted my question. While I could correct that mistake, I am still not sure if I would be better off creating a separate table. This is because the processes that load this extra data are not required. Therefore if I added in these new fields, they would all contain nulls in the majority of the records.
Due to this would I be better of creating the new table or better off adding the new fields to the Sales Order table?
My personal opinion - when I find a 1-to-1 relationship - is to merge both tables into one meaning, add the new columns to the already existing table.
For existing queries - when retrieving the new data - only needed change is to add new columns to the select list, the alternative is to add a table to FROM clause and appropriate conditions to the predicate... on all queries that are supposed to retrieve the new data.
On top of it... I really don't like to see a 1-to-1 relationship in my model; it feels weird to me 😉
Just my 2 cents.

Thanks for the response. I will consider combining the two tables into one as you made some very good points.

Before I go about changing anything I just want to re state something so I can be sure of my design going forward. We are saying that its is better to combine these 1:1 relationships into one table even though in the vast majority of rows, these extra fields will be set to null values.

Is that correct?

Thanks

PaulB-TheOneAndOnly SSC Guru Points: 66772 More actions · Answer 8

meichner (6/7/2010)We are saying that its is better to combine these 1:1 relationships into one table even though in the vast majority of rows, these extra fields will be set to null values.
Is that correct?

In my humble opinion: Yes! - with capital "Y".

If you are concerned about space remember that null values on columns sitting on the right side or the row really do not add that much overhead.

If you are concerned about the effect of having Null values on the relational model, null values are a fact of life - we have to deal with them.

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 9

IMHO, it's not a 1:1 relationship if there are a lot of nulls.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

meichner SSCertifiable Points: 5563 More actions · Answer 10

Jeff Moden (6/7/2010)
IMHO, it's not a 1:1 relationship if there are a lot of nulls.

Thanks Jeff. I am not sure where to draw the line between creating one or two tables. I appreciate your input.

meichner SSCertifiable Points: 5563 More actions · Answer 11

PaulB-TheOneAndOnly (6/7/2010)
meichner (6/7/2010)We are saying that its is better to combine these 1:1 relationships into one table even though in the vast majority of rows, these extra fields will be set to null values.
Is that correct?
In my humble opinion: Yes! - with capital "Y".
If you are concerned about space remember that null values on columns sitting on the right side or the row really do not add that much overhead.
If you are concerned about the effect of having Null values on the relational model, null values are a fact of life - we have to deal with them.

Thanks for your input. For some reason creating the one table instead of two just felt wrong. I appreciate your help.

Sioban Krzywicki One Orange Chip Points: 27770 More actions · Answer 12

meichner (6/7/2010)[hr
Before I go about changing anything I just want to re state something so I can be sure of my design going forward. We are saying that its is better to combine these 1:1 relationships into one table even though in the vast majority of rows, these extra fields will be set to null values.

If you are using SQL Server 2008, you can combine the tables into one table and for the fields where the vast majority of values will be null, add the SPARSE setting.

CREATE TABLE Test

(TestID int NOT NULL IDENTITY(1,1),

AlwaysThere varchar(20) NOT NULL,

UsuallyNotThere varchar(25) SPARSE NULL)

If you want to know what % of the column should be NULL before using the SPARSE indicator, this page gives the percentages by data type and an overview of SPARSE columns[/url]

Setting a column to SPARSE means it doesn't use any space for the row when it contains a NULL value, but uses double the space when it does. If it really is the vast majority that'll be NULL, you'll want to use a SPARSE column.

--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams

j-1064772 SSCertifiable Points: 5316 More actions · Answer 13

Steve Jones - Editor (6/3/2010)
I have done separation like this, vertical partitioning for some data. I don't see an issue with it, and like Jeffrey, I wouldn't add an identity to this. I'd just move over the SOID and make it a primary key.
Primary keys should uniquely identify the row. It has nothing to do with an identity value, and does not have to be a numeric value at all.

Why would you want to do this ? It makes the SQL statement moe complicated and a second index has to be maintained. What possible offsetting advantages are there to this ?

With regards to the primary key based on identity: convenient IF this remains hidden from the user (i.e. used only to link two tables). BUT this primary key cannot be used to derive a human-readable sales order number (i.e. SOID = 123, SalesOrderNo = S000123) - unless you do not care for SEQUENTIAL, WITHOUT GAPS, numbering sheme. This would cause a problem in the case of invoices, where the auditors view such gaps with suspicion -- like, what are you trying to hide from us ? You cannot just simply delete an invoice for whatever reason. If an invoice has been issued, the onlyu correction possible is to issue a credit note.

In this case you can still use an internal identity column as the primary key but you cannot use that column to create consecutive invoice numbers. Maybe maintain a separate table holding the last issued invoice number.

meichner SSCertifiable Points: 5563 More actions · Answer 14

Stefan Krzywicki (6/8/2010)
meichner (6/7/2010)[hr
Before I go about changing anything I just want to re state something so I can be sure of my design going forward. We are saying that its is better to combine these 1:1 relationships into one table even though in the vast majority of rows, these extra fields will be set to null values.
If you are using SQL Server 2008, you can combine the tables into one table and for the fields where the vast majority of values will be null, add the SPARSE setting.
CREATE TABLE Test
(TestID int NOT NULL IDENTITY(1,1),
AlwaysThere varchar(20) NOT NULL,
UsuallyNotThere varchar(25) SPARSE NULL)
If you want to know what % of the column should be NULL before using the SPARSE indicator, this page gives the percentages by data type and an overview of SPARSE columns[/url]
Setting a column to SPARSE means it doesn't use any space for the row when it contains a NULL value, but uses double the space when it does. If it really is the vast majority that'll be NULL, you'll want to use a SPARSE column.

I am using SQL Server 2005. Your idea however was interesting. I didn't know about the sparse setting.

Thanks