March 10, 2017 at 9:02 am
Eric M Russell - Friday, March 10, 2017 6:38 AMGilaMonster - Friday, March 10, 2017 12:01 AMGrant Fritchey - Thursday, March 9, 2017 6:25 AMI've built systems where I only had the ID columns on every table. I'm that guy. Hopefully, I've learned better now. While I may not want to use a natural key as the PK on a table and it might be a wretched bad choice for the clustered index, I'm still going to have that constraint there to ensure I have clean data because I've done the dirty data thing and sincerely regret it.Several years ago I was involved in a project to create a 'single point of truth' at one of the banks. Hardest part - unifying multiple different sets of customer data, most without any unique constraint other than the pk. The bank had no way of telling that a person with a cheque account was the same person who had a credit card, etc. Hence they regularly used to call people with one account to try and sell them other services, and annoy their customers because the person called already had that service.
Practically all of my tables have both a surrogate key (a clustered identity column) and a natural key (unique non-clustered index), and the surrogate key is the primary key used for foreign key relationships. There may be occasions, something like an event log, where a timestamp combined with another column like DeviceID or PointOfSaleID can serve as a natural primary key without the need for a surrogate key. One issue with using a natural key for the primary key is that the columns composing the natural key may be mutable over time, especially if we're talking about demographic attributes. For example, people change their phone number or email address, or the person's date of birth was entered wrong and then needs to be re-entered correctly at a subsequent visit.
Have you ever looked at all index stats on the identity-clustered tables -- usage stats, missing index stats and operational stats -- and tried changing the clus index to be more in line with the actual lookups being done on the table? You might be surprised at the potential performance gains, particularly when joining (very) large tables. Merge joins can be extraordinarily efficient if the tables are keyed to allow them.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 10, 2017 at 9:02 am
Jeff Moden - Thursday, March 9, 2017 7:51 PMYep... me too, Scott. Decades of experience. Just, apparently, different experience and that's the key here. "It Depends".
I admit, I had it in my mind that you were a developer not a DBA. Sorry about that. I guess just different results for each of us.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 10, 2017 at 9:13 am
ScottPletcher - Friday, March 10, 2017 9:02 AMJeff Moden - Thursday, March 9, 2017 7:51 PMYep... me too, Scott. Decades of experience. Just, apparently, different experience and that's the key here. "It Depends".I admit, I had it in my mind that you were a developer not a DBA. Sorry about that. I guess just different results for each of us.
How would job title make a difference here? Job title has no bearing on understanding sql server at a deep level. Having a title of DBA does not automatically make you credible from a database perspective.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 10, 2017 at 10:44 am
Sean Lange - Friday, March 10, 2017 9:13 AMScottPletcher - Friday, March 10, 2017 9:02 AMJeff Moden - Thursday, March 9, 2017 7:51 PMYep... me too, Scott. Decades of experience. Just, apparently, different experience and that's the key here. "It Depends".I admit, I had it in my mind that you were a developer not a DBA. Sorry about that. I guess just different results for each of us.
How would job title make a difference here? Job title has no bearing on understanding sql server at a deep level. Having a title of DBA does not automatically make you credible from a database perspective.
Amen to that. I have found in several companies that the Developers know more about DBA stuff and design than the resident DBA does. I've also found many DBAs that know more about Developer stuff than the Developers. As you say, title has no bearing on any of this.
They key is that experience based on different requirements varies greatly amongst us all and that was the point. "It Depends" a whole lot on the type and quantity of the data you're working with. No recommendation is a panacea.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 10, 2017 at 3:10 pm
ScottPletcher - Friday, March 10, 2017 9:02 AMEric M Russell - Friday, March 10, 2017 6:38 AMGilaMonster - Friday, March 10, 2017 12:01 AMGrant Fritchey - Thursday, March 9, 2017 6:25 AMI've built systems where I only had the ID columns on every table. I'm that guy. Hopefully, I've learned better now. While I may not want to use a natural key as the PK on a table and it might be a wretched bad choice for the clustered index, I'm still going to have that constraint there to ensure I have clean data because I've done the dirty data thing and sincerely regret it.Several years ago I was involved in a project to create a 'single point of truth' at one of the banks. Hardest part - unifying multiple different sets of customer data, most without any unique constraint other than the pk. The bank had no way of telling that a person with a cheque account was the same person who had a credit card, etc. Hence they regularly used to call people with one account to try and sell them other services, and annoy their customers because the person called already had that service.
Practically all of my tables have both a surrogate key (a clustered identity column) and a natural key (unique non-clustered index), and the surrogate key is the primary key used for foreign key relationships. There may be occasions, something like an event log, where a timestamp combined with another column like DeviceID or PointOfSaleID can serve as a natural primary key without the need for a surrogate key. One issue with using a natural key for the primary key is that the columns composing the natural key may be mutable over time, especially if we're talking about demographic attributes. For example, people change their phone number or email address, or the person's date of birth was entered wrong and then needs to be re-entered correctly at a subsequent visit.
Have you ever looked at all index stats on the identity-clustered tables -- usage stats, missing index stats and operational stats -- and tried changing the clus index to be more in line with the actual lookups being done on the table? You might be surprised at the potential performance gains, particularly when joining (very) large tables. Merge joins can be extraordinarily efficient if the tables are keyed to allow them.
I'm familiar with all that. Ideally there are no lookups on the table, I try to have covering index(s) for scenarios where user is selecting on something like phone number. But when there are lookups back to the table, I prefer it to be on a single sequential integer column. The same goes for joins between large tables, a single sequential integer column.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
March 11, 2017 at 2:44 am
Eric M Russell - Friday, March 10, 2017 3:09 PMI'm familiar with all that. Ideally there are no lookups on the table, I try to have covering index(s) for scenarios where user is selecting on something like phone number. But when there are lookups back to the table, I prefer it to be on a single sequential integer column. The same goes for joins between large tables, a single sequential integer column.
This is an old article, but it shows how, again, depending on the structure and the types of queries (it all always goes back to Jeff's earlier "it depends" statement), you can seriously out-perform that single sequential integer column.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 6 posts - 46 through 50 (of 50 total)
You must be logged in to reply to this topic. Login to reply