T-SQL Help Needed

Question

T-SQL Help Needed

Viewing 15 posts - 16 through 30 (of 126 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 1

Greg Snidow (9/12/2012)
Jeff Moden (9/12/2012)
Greg Snidow (9/11/2012)
The 'id' column in every table was a bit annoying to work with.
It's even more annoying when you need it and it's not there. 😉
Don't overlook the serious performance advantages for having an "ID" column in virtually every table can have. The biggest advantage is that make the perfect choice for clustered indexes especially for large tables that take a lot of inputs on a regular basis. "ID" columns meet all of the criteria for good clustered indexes... Unique, Narrow, and "ever increasinng". You also have to remember that the clustered index, whatever it is, becomes an integral part of every non-clustered index you end up with whether you want it to be or not.
Oh, I could not agree with you more, Jeff. All I meant was that it is confusing when every table has a column named 'id', as opposed to 'table_name_id', or something to that effect. I have one in all of my tables. I guess it's just what one is used to seeing.

Ah... understood and agreed. Again, that's why I had "ID" in quotes. There should always be a descriptive prfix on such "ID" columns (and I also usually avoid underscores). See my previous post.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

aaron.reese SSChampion Points: 13498 More actions · Answer 2

Celko is not necessarily wrong in his answers, bu this attitude stinks. we all had to start somewhere and make the Newbie errors.

As for IDs in tables, I (nearly) always use them if I have control over the data. It is much easier to build FKs on the ID number than include multi-part keys and at least I always know it will be unique.

e.g. If you use UserName as a PK and as the FK in another table, what happens when the user wants to change his name, or the user de-registers and a new user wants the same name. MUCH better to use ID and the PK/FK.

The only downsides: If you need to move, restore or merge databases, you MUST retain the original ID numbers. Also the PK has no intrinsic meaning, and you will have to continually refer back to the reference table for the data (e.g. hundreds of joins to the User table to get the user name associated with the ID)

aaron.reese SSChampion Points: 13498 More actions · Answer 3

consitency is the important thing here.

I prefer the PK to be called ID.

TABLE.ID refers to this tables PK

TABLE.SOMETABLE_ID refers to the related tables PK and infers a single field FK

I find TABLE.TABLE_ID and SECONDTABLE.SECONDTABLE_ID annoying. I know it is the ID, I have the table name. putting the tablename into the fieldname eliminates any advantage of aliasing tables, because you still have to change the field list if you change the table being referenced

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 4

aaron.reese (9/13/2012)
consitency is the important thing here.
I prefer the PK to be called ID.
TABLE.ID refers to this tables PK
TABLE.SOMETABLE_ID refers to the related tables PK and infers a single field FK
I find TABLE.TABLE_ID and SECONDTABLE.SECONDTABLE_ID annoying. I know it is the ID, I have the table name. putting the tablename into the fieldname eliminates any advantage of aliasing tables, because you still have to change the field list if you change the table being referenced

Kind of personal... But I think still, most of people here + Microsoft guys who designed sys tables, think that plain ID for PK column name is more annoying :hehe:

And - yes, consistency is important, whatever way you choose - use it across the system/database. It will look much worse if some tables would have ID columns prefixed and some not.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 5

EDIT : Duplicate post.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 6

aaron.reese (9/13/2012)
consitency is the important thing here.
I prefer the PK to be called ID.
TABLE.ID refers to this tables PK
TABLE.SOMETABLE_ID refers to the related tables PK and infers a single field FK
I find TABLE.TABLE_ID and SECONDTABLE.SECONDTABLE_ID annoying. I know it is the ID, I have the table name. putting the tablename into the fieldname eliminates any advantage of aliasing tables, because you still have to change the field list if you change the table being referenced

The logic behind prefixing Id's is perfetly logical, each table is an object, and so each member of that object as a unique Identity so why wouldnt you name it <Object>_Id, Eg Customer_Id on the Customer table or Invoice_Id on the Invoice Table.

In your example you've named your FK column based on the PK Table name so why not name the PK column based on the PK table itself?

_________________________________________________________________________
SSC Guide to Posting and Best Practices

ScottPletcher SSC Guru Points: 101236 More actions · Answer 7

Eugene Elutin (9/12/2012)
Greg Snidow (9/12/2012)
Jeff Moden (9/12/2012)
Greg Snidow (9/11/2012)
The 'id' column in every table was a bit annoying to work with.
It's even more annoying when you need it and it's not there. 😉
Don't overlook the serious performance advantages for having an "ID" column in virtually every table can have. The biggest advantage is that make the perfect choice for clustered indexes especially for large tables that take a lot of inputs on a regular basis. "ID" columns meet all of the criteria for good clustered indexes... Unique, Narrow, and "ever increasinng". You also have to remember that the clustered index, whatever it is, becomes an integral part of every non-clustered index you end up with whether you want it to be or not.
Oh, I could not agree with you more, Jeff. All I meant was that it is confusing when every table has a column named 'id', as opposed to 'table_name_id', or something to that effect. I have one in all of my tables. I guess it's just what one is used to seeing.
I guess it's a pure misunderstanding between peers. Jeff, same as myself and many others, defends using IDENTITY "ID" column for PK/Clustered Indexes. I guess most of people here do understand that using just a pure "ID" word as column name is not good practice as it easily became very annoying when you see it in every table. Even MS is following common sense here, it does name columns as object_id, column_id, principal_id, assembly_id etc. in its "sys" tables.
So, there is no arguing for the above. I believe that what OP have posted here was a simplified example of the real structures he has.
However, there is a number of people (at least I know one here), who tells now and then to everyone, that IDENTITY is the evil and never used by SQL developers for defining table keys. The "one I know", doesn't clarify that it is just his own opinion, he says that as a "fact", when it's clearly not. He says that natural keys should always be used, but once I've asked him to tell me what is the natural key for the people? If I have international database what can I use as a Natural Key for People or let say for Companies from different countries? I still don't have the answer. And I don't think I will ever have one as there is no such thing. Even DNA is not 100% unique (that is why forensic DNA-result doesn't confirm "match" or "no-match", it does only gives the probability of them).

You can't be referring to me, because I never said identity should never be used. Identity values have their place, although it's actually very rare that they are genuinely needed as a clustering key (I don't care if they are the PK, as long as they aren't the clustering key).

I said identity should never be the automatic default for all tables. That is a fact. It's not nearly as simplistic, I know, but it's still a fact.

If I create a table to look up state codes ('AL', 'AK', etc.), I use state code as the key, not a meaningless identity value.

If I'm looking up names, I want to look up by name, not by the order the names happened to enter the table.

Otherwise you end up having to create so many covering indexes you're quintupled, at least, your disk space to no real performance gain. And when you add one non-covered column to the query, you get a full table scan instead!!

~70% of the tuning I have to do is getting of the stupid, meaningless clustered key on identity and give the table a clustering key that actually works well.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Greg Snidow SSCoach Points: 16108 More actions · Answer 8

Well, I guess it's safe to say the OP is long gone. Given that he/she posted the problem twice, it must have been pressing. I wonder how it turned out.

Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.

aaron.reese SSChampion Points: 13498 More actions · Answer 9

[
If I create a table to look up state codes ('AL', 'AK', etc.), I use state code as the key, not a meaningless identity value.

fine, but if you are creating a table of product groups (<'BK','Books'>,<'SH','Shoes'>,<'BG','Bags'>) etc.. I would put an ID field on that and use that instead, otherwise if someone decides they need to re-use or change the product group code you're stuffed.

A real world example.

A company I worked for produces clothing catalogues. Each catalogue has a 2 character reference. They produce about 130 catalogues/adverts a year, but not all letters get used (S,Q,I etc) This means they cycle through all the possible combinations within just under 3 years. They did not use an ID on the table and so could not use it as a FK. They now have multi-part keys into almost all their other tables (Brand, Year and catalogue code) which slows down development, testing and reporting as all table joins will require at least three join criteria. The same principle of no ID field is in most of the application and so the same issue arises. As a result, they have some data that it is impossible to correct (e.g. the same product code represents a different colour in two consecutive seasons because the colour table did not use IDs, but a multi-part key based on department and colour and the product changed departements and the same colour had a different colour code in the new departement)

ScottPletcher SSC Guru Points: 101236 More actions · Answer 10

A company I worked for produces clothing catalogues. Each catalogue has a 2 character reference. They produce about 130 catalogues/adverts a year, but not all letters get used (S,Q,I etc) This means they cycle through all the possible combinations within just under 3 years.

That's a poor design from the start. What does that have to do with anything?

Just slapping an identity column on all your tables is not going to solve all your db design problems.

I don't consider it an extreme position stating that:

Serious thought should go into selecting the clustered index key(s)

rather than just following a highly simplistic, database 101 rule about a "unique, short key".

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

ScottPletcher SSC Guru Points: 101236 More actions · Answer 11

but if you are creating a table of product groups (<'BK','Books'>,<'SH','Shoes'>,<'BG','Bags'>) etc.. I would put an ID field on that and use that instead, otherwise if someone decides they need to re-use or change the product group code you're stuffed.

When I go to Amazon to look something up, I prefer seeing categories of "Books", "DVDs", "Shoes" rather than 1283, 17865, 194324 :-).

Again, to me, it depends on the data and how it will be queried.

There is no good one-size-fits-all rules for indexing, esp. clustered indexing.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 12

ScottPletcher (9/13/2012)
but if you are creating a table of product groups (<'BK','Books'>,<'SH','Shoes'>,<'BG','Bags'>) etc.. I would put an ID field on that and use that instead, otherwise if someone decides they need to re-use or change the product group code you're stuffed.
When I go to Amazon to look something up, I prefer seeing categories of "Books", "DVDs", "Shoes" rather than 1283, 17865, 194324 :-).
Again, to me, it depends on the data and how it will be queried.
There is no good one-size-fits-all rules for indexing, esp. clustered indexing.

I agree with this statement:

I prefer seeing categories of "Books", "DVDs", "Shoes" rather than 1283, 17865, 194324

But that doesn't mean I am going to use "Books" or "DVDs" or "Shoes" to link to the products. There I'd rather have the 1283, 17865, or 194324 value for that.

ScottPletcher SSC Guru Points: 101236 More actions · Answer 13

Lynn Pettis (9/13/2012)
ScottPletcher (9/13/2012)
but if you are creating a table of product groups (<'BK','Books'>,<'SH','Shoes'>,<'BG','Bags'>) etc.. I would put an ID field on that and use that instead, otherwise if someone decides they need to re-use or change the product group code you're stuffed.
Again, to me, it depends on the data and how it will be queried.
There is no good one-size-fits-all rules for indexing, esp. clustered indexing.
I agree with this statement:

?? That's a little confusing. The corollary to that rule has to be that the default clustered index for a table is NOT an identity column -- which you seemed to insist was basically an absolute rule.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 14

ScottPletcher (9/13/2012)
Lynn Pettis (9/13/2012)
ScottPletcher (9/13/2012)
but if you are creating a table of product groups (<'BK','Books'>,<'SH','Shoes'>,<'BG','Bags'>) etc.. I would put an ID field on that and use that instead, otherwise if someone decides they need to re-use or change the product group code you're stuffed.
Again, to me, it depends on the data and how it will be queried.
There is no good one-size-fits-all rules for indexing, esp. clustered indexing.
I agree with this statement:
?? That's a little confusing. The corollary to that rule has to be that the default clustered index for a table is NOT an identity column -- which you seemed to insist was basically an absolute rule.

I'm sorry, I never said any such thing. The choice of a clustered index is dependent on many things. Sometimes you actually do want the clustered index to be the column with the identity property. There are times you don't. You have to look at numerous factors in order to make the correct decision for an application. How is the data being accessed, how big is the table (width and depth), does our choice of DR/HA methods have a significant impact especially when looking at table/index maintenance, and many others.

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 15

ScottPletcher (9/13/2012)
Eugene Elutin (9/12/2012)
Greg Snidow (9/12/2012)
Jeff Moden (9/12/2012)
Greg Snidow (9/11/2012)
The 'id' column in every table was a bit annoying to work with.
It's even more annoying when you need it and it's not there. 😉
Don't overlook the serious performance advantages for having an "ID" column in virtually every table can have. The biggest advantage is that make the perfect choice for clustered indexes especially for large tables that take a lot of inputs on a regular basis. "ID" columns meet all of the criteria for good clustered indexes... Unique, Narrow, and "ever increasinng". You also have to remember that the clustered index, whatever it is, becomes an integral part of every non-clustered index you end up with whether you want it to be or not.
Oh, I could not agree with you more, Jeff. All I meant was that it is confusing when every table has a column named 'id', as opposed to 'table_name_id', or something to that effect. I have one in all of my tables. I guess it's just what one is used to seeing.
I guess it's a pure misunderstanding between peers. Jeff, same as myself and many others, defends using IDENTITY "ID" column for PK/Clustered Indexes. I guess most of people here do understand that using just a pure "ID" word as column name is not good practice as it easily became very annoying when you see it in every table. Even MS is following common sense here, it does name columns as object_id, column_id, principal_id, assembly_id etc. in its "sys" tables.
So, there is no arguing for the above. I believe that what OP have posted here was a simplified example of the real structures he has.
However, there is a number of people (at least I know one here), who tells now and then to everyone, that IDENTITY is the evil and never used by SQL developers for defining table keys. The "one I know", doesn't clarify that it is just his own opinion, he says that as a "fact", when it's clearly not. He says that natural keys should always be used, but once I've asked him to tell me what is the natural key for the people? If I have international database what can I use as a Natural Key for People or let say for Companies from different countries? I still don't have the answer. And I don't think I will ever have one as there is no such thing. Even DNA is not 100% unique (that is why forensic DNA-result doesn't confirm "match" or "no-match", it does only gives the probability of them).
You can't be referring to me, because I never said identity should never be used...

And I didn't!

I have referred to the ISO-book lover and "many other"-books writer. Who's still remember using (or at least - handling) punch-cards as myself, but not able to appreciate "proprietary" features of MS SQL Server. The one who writes so portable code which can be ported to the future (eg. next century) RDBMS systems. I think many here will recognise who I'm talking about here :hehe:

I really would love to listen to his answers regarding choosing natural keys for Persons and Companies entities (where the system is international and people/companies are not just from USA, you know there some other countries around the globe :-D). I have posted this question few times (hijacking someone else threads), but still have no answer. Looks like I should create the own thread about it... May be he will bite on it :hehe:

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]