An Identity Crisis

Question

An Identity Crisis

cs_troyk

SSCertifiable

Points: 5383
More actions
March 16, 2005 at 10:19 pm

#362242

Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/tKetsdever/anidentitycrisis.asp

Viewing 15 posts - 1 through 15 (of 35 total)

You must be logged in to reply to this topic. Login to reply

G.R.Prithiviraj Kulasingham SSCarpal Tunnel Points: 4105 More actions · Answer 1

All data types have their limits. This crisis is nothing specific to Identity columns. It is related to lack of upfront planning.

Additionally, Microsoft always recommends us to use Information_Schema views and functions to get system data. IDENT_CURRENT('table_name') will return the current identity value (if available. If the table doesn't have an identity column, the return value is NULL).

It looks like the author has done some changes to system entries to make the column BigInt. Unless he shares something on that area, this article is not useful.

Cheers,
Prithiviraj Kulasingham

http://preethiviraj.blogspot.com/

Malcolm Leach SSCommitted Points: 1517 More actions · Answer 2

I agree - the title of this article should have been 'The Importance Of Choosing Realistic Data Types When Designing Your Schema'. I'd seriously consider 're-allocating' whoever designed the table with 'int' in the first place as they clearly have no idea about the business.

The best designers/DBAs are ones that fundamentally understand what their company's business is all about because it means that they will intrinsically factor that knowledge into every stage of the design process.

Identity columns are simply an efficient way of generating unique keys and I'm a little tired of seeing them demonized when the real issues lie elsewhere.

Malcolm
DB Ghost - Build, compare and synchronize from source control = Database Change Management for SQL Server
www.dbghost.com

Jane Matheson-154829 SSCertifiable Points: 5522 More actions · Answer 3

Whether you want to argue that the DB shouldn't have been designed that way is a mute point. DBAs often have to clean up what others do and sometimes even make mistakes themselves - or least all the DBAs I know have at some point made at least 1 mistake. Things that are easy are often overlooked and I think because identity is easy, this situation is not unrealistic. I thought the article was a concise and easy to read warning to be wary of identities lurking in your DB because they are defined on integers.

.

Malcolm Leach SSCommitted Points: 1517 More actions · Answer 4

I completely disagree - are you really suggesting that use of identity columns is somehow a greater evil that not designing your database properly?? I'm not suggesting that DBAs become flawless human beings and never make mistakes but designing a new schema or table isn't something that you do hundreds of times a day. It is something you do relatively infrequently and, because it can have large consequences down the road, should be given the appropriate time and effort to get right. I also cannot believe that you are using the ease of defining an identity as a negative point.

Would you rather that Microsoft remove the feature and just let developers hand code a method of doing it? Of course not.

The issue described is purely about the column datatype chosen not how the value is generated. The identity column is not the problem the datatype is. Simple as that.

I appreciate that we have to clean up after other people mistakes, god knows I've seen more than enough of them, but to suggest looking only at identity columns is ridiculous. What if the column is an integer and it's value is calculated by doing some MAX() + 1 function (not an approach anyone would recommend yet often employed). The issue still exists i.e. the column will run out of space to store the number and yet will never be picked up by monitoring columns which have the identity attribute defined.

This article should have concluded by recommending that DBAs simply monitor integer primary keys and the current maximum values. That way, whatever method is used to arrive at the arbitrary value, you will still pick up on a disaster waiting to happen.

Malcolm
DB Ghost - Build, compare and synchronize from source control = Database Change Management for SQL Server
www.dbghost.com

ubi_comp SSC Journeyman Points: 97 More actions · Answer 5

Excellent Article Troy,

I was looking for some way to get *all* the identity columns and their values. Ended up reading about SQLDMO, but suddenly your article made it simple and quick to get all the identity column names, their table names and values...

Thanks!!

Jane Matheson-154829 SSCertifiable Points: 5522 More actions · Answer 6

I understand your point. But, I think you were harsh on the writer. I appreciated his point of view.

Aaron Myers Ten Centuries Points: 1139 More actions · Answer 7

Agreed...it would have been interesting to see his schema/logic modifications and cleanup sequence. As is, the article isn't very useful.

Malcolm Leach SSCommitted Points: 1517 More actions · Answer 8

Sorry Jane I wasn't trying to slate the entire article per se, (honest!) - just the title and conclusion. It clearly has some useful points regarding the easy reporting of identity columns etc. and the general theme of averting disaster before it happens is laudible.

It's just that I have heard too many conversations amongst us techies where technical nuances (use of identity) detract from the real underlying issue (wrong datatype). I wish people would always attempt to understand the root causes of the many problems that plague us in our day to day jobs and not just try to firefight the symptoms.

Cure The Problem Not The Symptoms...er...Amen

Malcolm
DB Ghost - Build, compare and synchronize from source control = Database Change Management for SQL Server
www.dbghost.com

Vic Kirkpatrick-173212 SSCommitted Points: 1831 More actions · Answer 9

The problem that the author had to fix had nothing to do with identity columns. This could have been a regular int column that got its new numbers via time-tested "max(colval) + 1" and it still would have run into the same issue. It's funny how identities have somehow gotten this bad rap. I've seen it everywhere I've gone in my career. In reality, identity columns are awesome when the DBA knows how to design them into the schema to match the business logic and expected record-load. They save a ton of index-scans (or table scans for some DBAs who use poorly designed indexes).

On the flip side, it is good the author featured a number of ways to list the identity columns in the database, which is helpful. The fix must have been easy, right? I mean, you can alter an int identity up to a bigint in one statement.

-Vic

David Bird SSCarpal Tunnel Points: 4669 More actions · Answer 10

I've used the COLUMNPROPERTY to Identify tables with Identity columns.

SELECT Distinct Table_Name,MAX(COLUMNPROPERTY (OBJECT_ID(Table_Name),Column_Name,'IsIdentity')) as Id

FROM INFORMATION_SCHEMA.COLUMNS (NOLOCK)

GROUP BY Table_Name

David Bird

Peter DeBetta SSC Veteran Points: 286 More actions · Answer 11

Hear! Hear!

Microsoft recommends that you use the Information Schema views instead of the system tables when possible. To quote SQL Server 2000 Books Online: "To obtain meta data, use system stored procedures, system functions, or these system-supplied [Information Schema] views only. Querying the system tables directly may not provide accurate information if system tables are changed in future releases."

Also, I noticed that schema (object ownership) was not addressed and if a table exists with multiple owners (Bob.MyTable, Jane.MyTable, dbo.MyTable) then the IDENT_CURRENT results are inaccurate. In order to utilize the Information Schema views and account for the multiple table ownership, you need to query as follows:

SELECT T.TABLE_SCHEMA, T.TABLE_NAME, C.COLUMN_NAME, C.DATA_TYPE,
  IDENT_CURRENT(T.TABLE_SCHEMA + '.' + T.TABLE_NAME) [CURRENT_IDENTITY_VALUE]
FROM INFORMATION_SCHEMA.TABLES AS T (NOLOCK)
  INNER JOIN 
    (SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
     FROM INFORMATION_SCHEMA.COLUMNS (NOLOCK)
     GROUP BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
     HAVING MAX(COLUMNPROPERTY (OBJECT_ID(Table_Name),Column_Name,'IsIdentity')) = 1) AS C
  ON C.TABLE_SCHEMA = T.TABLE_SCHEMA AND C.TABLE_NAME = T.TABLE_NAME

Later,

Peter

cs_troyk SSCertifiable Points: 5383 More actions · Answer 12

Hi Malcolm;

Your comments (and everyone else who expressed similar comments) about this not being an Identity-specific problem are absolutely correct. I focused on the Identity columns in particular because I found it interesting that there were multiple ways of getting at the same piece of metadata. And now, with some of the other responses, I have a couple more ways to find Identities.

It is also correct to say that a comprehensive look at the problem of inappropriate datatype choices is beyond the scope of the article's conclusion. I can see that perhaps I should have been more general in that respect, but, hey, this is my first crack at a tech article.

So I'm interested to further discuss anyone's ideas about how this could be generalized a bit. Sticking with integers only, what would you do if you were a consultant brought in to "check the health" of an enterprise system? How would we do a comprehensive check of integer values accross all tables? How would we focus in on those tables most likely to cause problems? Is it something that could be done in a single script, or would we need to do some kind of script->code generation->script solution?

I'll give this some thought and post ideas later this evening. I look forward to seeing what others come up with, too!

TroyK

David Lu SSC Eights! Points: 882 More actions · Answer 13

Excellent! Now I know how to find that kind field quickly with 3 or more options.

In my mind, I hate to use identity 1. you must have other alternative key to ensure the uniquness of the row, otherwise, you could have a million same contents records except the identity column. 2. however, it might be useful in a situation that there are 3 or more concatenated pk. then, an identity field is ok. In summary, my max idenetity column is 34 million, still have room to grow. but to change 2300 databases from int to bigint to stop the 7days operation might be a big issue. In addition, the FK is also a issue.

Good job, Troy Ketsdever

-D

Adam Machanic SSCoach Points: 15370 More actions · Answer 14

Vic,

Using max(colval) + 1 is probably not a good idea. If two sessions hit it simultaneously, you're going to get a collision unless you serialize access. So you need to choose between scalability and either duplicate rows or constraint violations. That's one of the main benefits that IDENTITY provides.

--
Adam Machanic
whoisactive