Database without ANY primary keys

Question

Database without ANY primary keys

Drammy

SSC-Addicted

Points: 415
More actions
November 19, 2013 at 7:49 am

#280387

Hi all,
I work with a database that has no primary keys defined on any tables. As a customer's consultant I have no control over the supplier's database. It frustrates me that I have no control over the database and I'm looking for some points of discussion so I can go back to the supplier to ask them to review the design and consider making some changes.
So I guess I'm just looking to provoke a discussion about this in order to gather some points I can then discuss with the supplier.
Anyone got any comments...?

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

SQLBlimp SSCertifiable Points: 6340 More actions · Answer 1

Questions:

1. Are there any clustered indexes on the tables? Without them, indexing will be inefficient since each table without a clustered index will be a heap.

2. Are you talking about natural primary keys or setting up IDENTITY columns that will be used as clustered primary keys? If IDENTITY columns exist then you are one step closer to making efficient indexing.

Sometimes vendors treat SQL Server as if it were a kind of set-and-forget file system. That's not what it is.

Perhaps we can help you educate your client.

Thanks

John

Lowell SSC Guru Points: 323755 More actions · Answer 2

it might be the product of someone who thought to keep their database design as if it were secret and proprietary, and have no PK's so you can't deduce the design by following Foreign Key relationships.

if that's true you could probably run a trace when the application is running, parse the JOIN criteria of any statements, and infer the PK/FK's to create.

I've seen that quite a few times here on SSC, where someone thinks their database design, and the relationship between Invoice and InvoiceDetails was unique to their thought pattern and no one else would think of it.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Drammy SSC-Addicted Points: 415 More actions · Answer 3

Thanks, perhaps a bit more background is in order.

The database was originally built years ago in Progress (old OpenEdge) but some Clients run it through the OpenEdge SQL DataServer which basically migrates the schema and data into SQL and then uses a schema placeholder to translate OpenEdge database queries into SQL queries. This means that the database can run on SQL whilst the main GUI and interface doesn't need re-coding and can continue to run in Progress 4GL.

There genuinely are no (clustered) primary keys and very few clustered keys; most indices are non-clustered indices that yes will just exist on the heap. It's not a case of them trying to keep the db a secret as it is fairly self-explanatory; I used to work for the supplier and know the db inside out.

I'd like an idea for what kind of impact having no primary, clustered keys at all would have on your typical transactional database...

SQLBlimp SSCertifiable Points: 6340 More actions · Answer 4

Drammy (11/20/2013)
Thanks, perhaps a bit more background is in order.
(snip)
I'd like an idea for what kind of impact having no primary, clustered keys at all would have on your typical transactional database...

A lot. The heap would be a big slow-down.

Is there an identity-type column (int or bigint) on any table?

You can likely change the underlying indexing to clustered and rebuild all indexes, and gain performance, without impacting the app. Can you set up a test environment to verify this and to regression test before running such a script on production? You may not be able to specify PK in the DDL, since that brings restrictions along with it. But you can still create an unique clustered index on the natural primary key and achieve the identical effect.

Obviously, with no defined PK, features like replication are not going to be a future option. If there is an unique identity key (a sequential integer field), you can cluster that column and reap maximum indexing benefits, especially since inserts will always go to the end of the cluster.

Then once you get indexes clustered on all tables, go find Michelle Ufford's dba_indexdefrag freeware and schedule yourself a weekly index rebuild job that runs on a weekend schedule. If you wish, another step to the same job could update statistics, and you would reap best speed and keep your DB nice and clean.

Thanks

John.

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 5

Plus the fact that no unique constraints radically changes the optimizers choices for retrieving data. I'll bet you see a LOT more scans than you should since the optimizer has no way of knowing if your data is unique.

Honestly, I'd be hard pressed to list the very large number of issues you're seeing here that you shouldn't in terms of performance. But the biggest worry would be data quality. Your business can quite easily insert duplicate data if there are no constraints in place (and probably has). That would be the first, and most important, point I'd raise.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Drammy SSC-Addicted Points: 415 More actions · Answer 6

Thanks for all the guidance and help.

I have plenty to be going back to them with and might even change some indexing on the primary tables and prove the performance improvement as part of my argument.

Thanks again,

Drammy

aaron.reese SSChampion Points: 13498 More actions · Answer 7

Are you talking about application performance or reporting performance.

If it is reporting, then I woudl be tempted to replicate the database and create reporting indexes. In a large system the differing demands on OLTP and OLAP can cause conflicts as lots of indexes to support reporting will slow down the insert and update of records.

SQLBlimp SSCertifiable Points: 6340 More actions · Answer 8

aaron.reese (12/11/2013)
Are you talking about application performance or reporting performance.
If it is reporting, then I would be tempted to replicate the database and create reporting indexes. In a large system the differing demands on OLTP and OLAP can cause conflicts as lots of indexes to support reporting will slow down the insert and update of records.

That is a good idea, but SQL Server replication requires defined primary keys on all articles. Therefore, if reporting requires data in close to realtime, the PK issue that the original poster raised persists.

JT

Drammy SSC-Addicted Points: 415 More actions · Answer 9

Yeah, understood - this is an application database only. No reporting requirements or indices on this database.

jshahan SSCarpal Tunnel Points: 4622 More actions · Answer 10

Where's Celko? I thought he'd be in here administering some kind of beat down based on the title alone. 🙂

aaron.reese SSChampion Points: 13498 More actions · Answer 11

Happy Christmas Joe 😛

Which reminds me.

Q) Why do Assembly programmers get Xmas and Halloween confused

A) Because OCT31 = DEC25

Anyway, JC wouldn't touch Progress with a barge pole - come to think of it, neither would I

JC - maybe we should be nicer to him - it might also be his birthday

jshahan SSCarpal Tunnel Points: 4622 More actions · Answer 12

Don't we celebrate JC's birthday on the 25th regardless of when it actually is?