How to overcome the identity column "jump 1000" issue

Question

How to overcome the identity column "jump 1000" issue

Viewing 6 posts - 46 through 50 (of 50 total)

You must be logged in to reply to this topic. Login to reply

ScottPletcher SSC Guru Points: 101122 More actions · Answer 1

Eric M Russell - Friday, March 10, 2017 6:38 AM
GilaMonster - Friday, March 10, 2017 12:01 AM
Grant Fritchey - Thursday, March 9, 2017 6:25 AM
I've built systems where I only had the ID columns on every table. I'm that guy. Hopefully, I've learned better now. While I may not want to use a natural key as the PK on a table and it might be a wretched bad choice for the clustered index, I'm still going to have that constraint there to ensure I have clean data because I've done the dirty data thing and sincerely regret it.
Several years ago I was involved in a project to create a 'single point of truth' at one of the banks. Hardest part - unifying multiple different sets of customer data, most without any unique constraint other than the pk. The bank had no way of telling that a person with a cheque account was the same person who had a credit card, etc. Hence they regularly used to call people with one account to try and sell them other services, and annoy their customers because the person called already had that service.
Practically all of my tables have both a surrogate key (a clustered identity column) and a natural key (unique non-clustered index), and the surrogate key is the primary key used for foreign key relationships. There may be occasions, something like an event log, where a timestamp combined with another column like DeviceID or PointOfSaleID can serve as a natural primary key without the need for a surrogate key. One issue with using a natural key for the primary key is that the columns composing the natural key may be mutable over time, especially if we're talking about demographic attributes. For example, people change their phone number or email address, or the person's date of birth was entered wrong and then needs to be re-entered correctly at a subsequent visit.

Have you ever looked at all index stats on the identity-clustered tables -- usage stats, missing index stats and operational stats -- and tried changing the clus index to be more in line with the actual lookups being done on the table? You might be surprised at the potential performance gains, particularly when joining (very) large tables. Merge joins can be extraordinarily efficient if the tables are keyed to allow them.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

ScottPletcher SSC Guru Points: 101122 More actions · Answer 2

Jeff Moden - Thursday, March 9, 2017 7:51 PM
Yep... me too, Scott. Decades of experience. Just, apparently, different experience and that's the key here. "It Depends".

I admit, I had it in my mind that you were a developer not a DBA. Sorry about that. I guess just different results for each of us.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Sean Lange SSC Guru Points: 286590 More actions · Answer 3

ScottPletcher - Friday, March 10, 2017 9:02 AM
Jeff Moden - Thursday, March 9, 2017 7:51 PM
Yep... me too, Scott. Decades of experience. Just, apparently, different experience and that's the key here. "It Depends".
I admit, I had it in my mind that you were a developer not a DBA. Sorry about that. I guess just different results for each of us.

How would job title make a difference here? Job title has no bearing on understanding sql server at a deep level. Having a title of DBA does not automatically make you credible from a database perspective.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Jeff Moden SSC Guru Points: 1004409 More actions · Answer 4

Sean Lange - Friday, March 10, 2017 9:13 AM
ScottPletcher - Friday, March 10, 2017 9:02 AM
Jeff Moden - Thursday, March 9, 2017 7:51 PM
Yep... me too, Scott. Decades of experience. Just, apparently, different experience and that's the key here. "It Depends".
I admit, I had it in my mind that you were a developer not a DBA. Sorry about that. I guess just different results for each of us.
How would job title make a difference here? Job title has no bearing on understanding sql server at a deep level. Having a title of DBA does not automatically make you credible from a database perspective.

Amen to that. I have found in several companies that the Developers know more about DBA stuff and design than the resident DBA does. I've also found many DBAs that know more about Developer stuff than the Developers. As you say, title has no bearing on any of this.

They key is that experience based on different requirements varies greatly amongst us all and that was the point. "It Depends" a whole lot on the type and quantity of the data you're working with. No recommendation is a panacea.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Eric M Russell SSC Guru Points: 125562 More actions · Answer 5

ScottPletcher - Friday, March 10, 2017 9:02 AM
Eric M Russell - Friday, March 10, 2017 6:38 AM
GilaMonster - Friday, March 10, 2017 12:01 AM
Grant Fritchey - Thursday, March 9, 2017 6:25 AM
I've built systems where I only had the ID columns on every table. I'm that guy. Hopefully, I've learned better now. While I may not want to use a natural key as the PK on a table and it might be a wretched bad choice for the clustered index, I'm still going to have that constraint there to ensure I have clean data because I've done the dirty data thing and sincerely regret it.
Several years ago I was involved in a project to create a 'single point of truth' at one of the banks. Hardest part - unifying multiple different sets of customer data, most without any unique constraint other than the pk. The bank had no way of telling that a person with a cheque account was the same person who had a credit card, etc. Hence they regularly used to call people with one account to try and sell them other services, and annoy their customers because the person called already had that service.
Practically all of my tables have both a surrogate key (a clustered identity column) and a natural key (unique non-clustered index), and the surrogate key is the primary key used for foreign key relationships. There may be occasions, something like an event log, where a timestamp combined with another column like DeviceID or PointOfSaleID can serve as a natural primary key without the need for a surrogate key. One issue with using a natural key for the primary key is that the columns composing the natural key may be mutable over time, especially if we're talking about demographic attributes. For example, people change their phone number or email address, or the person's date of birth was entered wrong and then needs to be re-entered correctly at a subsequent visit.
Have you ever looked at all index stats on the identity-clustered tables -- usage stats, missing index stats and operational stats -- and tried changing the clus index to be more in line with the actual lookups being done on the table? You might be surprised at the potential performance gains, particularly when joining (very) large tables. Merge joins can be extraordinarily efficient if the tables are keyed to allow them.

I'm familiar with all that. Ideally there are no lookups on the table, I try to have covering index(s) for scenarios where user is selecting on something like phone number. But when there are lookups back to the table, I prefer it to be on a single sequential integer column. The same goes for joins between large tables, a single sequential integer column.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 6

Eric M Russell - Friday, March 10, 2017 3:09 PM
I'm familiar with all that. Ideally there are no lookups on the table, I try to have covering index(s) for scenarios where user is selecting on something like phone number. But when there are lookups back to the table, I prefer it to be on a single sequential integer column. The same goes for joins between large tables, a single sequential integer column.

This is an old article, but it shows how, again, depending on the structure and the types of queries (it all always goes back to Jeff's earlier "it depends" statement), you can seriously out-perform that single sequential integer column.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning