November 25, 2008 at 8:44 am
I'm shortly going to be presented with the challenge of building a new database from scratch. This would be for a new company just starting out, so data migration isn't much of an issue. Most of my SQL experience is as a developer, not as a designer. I have strongly disliked most of the database designs I've worked with thus far, and I definitely don't want one I design to end up the same way. This database is going to be relatively large (I mean, I'm not creating a new Google here, but probably in the neighborhood of 100+ tables). Basically, I'm looking for knowledge, and would love any tips on:
A. Good fundamental books(e-books/phyical books/any other source of info) to read on normalization and table/db design.
B. Ideas for how to go about gathering the information to design the database with the greatest amount of flexibility so it does not need to be heavily modified down the road.
C. Common Pitfalls to avoid.
I'll be looking through some of the posts here as well. Any input is appreciated.
[Edit] After reading a recommendation by Grant Fritchey in another post, I've just ordered Fleming & von Halle's "Handbook of Relational Database Design". Thanks Grant! (And thanks for the wonderful book on execution plans too by the way)
November 25, 2008 at 9:18 am
The best thing I can recommend is to read some books and articles, and look at how people design things. We have a few here and be sure to read the comments to see what people think.
Be sure you implement RI. Think through what "rules" you have to ensure that child rows exist or have parent rows. It's easy to throw something together, then find an exception and decide to remove the FK and then have data issues. Remember when you implement an FK, it has to apply 100% of the time.
November 25, 2008 at 9:52 am
Wow! Thanks!
You're very welcome.
Lookup information on sys.dm_exec_cached_plans and some of the other dynamic management views. I left that out and wish I had it in there now.
Do what Steve says.
In addition, I'd recommend that you make choosing a clustered index a fundamental part of the design. Don't simply assume it should go on the primary key. Also, try to build in some testing time in your design period so that you can bloat what you built with a bunch of test data as a way of verifying the design. That's sometimes hard to do, but try.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 25, 2008 at 10:10 am
In addition, I'd recommend that you make choosing a clustered index a fundamental part of the design.
This is probably one of my own biggest pet peeves. I may not always know the best field for a clustered index, but I do know it's not an identity field that's not even used!
Thanks for the tips guys.
The note about clustered indexing brings to mind something I came across in a section mockup yesterday. When making Foreign key reference tables (IE. Tying an employee to a warehouse. Employee can have multiple warehouses, and multiple Employees can use each warehouse) in a manner such as:
EmployeeID WarehouseID
1 1
1 2
1 3
2 2
2 5
...
Both EmployeeID and WarehouseID would be Foreign keys. Should this table have an identity field and use that as the primary key, then a clustered index / unique constraint on (EmployeeID, WarehouseID), or should it just use these two as the primary key with no Identity field. I'm used to my "identity/rowpointer in every table" training wheels, but it seems like that would be extraneous in cases like this.
Also, is there any good software to help with the design/planning of large scale databases that would be a significant improvement over the database diagram tool in SSMS 2K8?
November 25, 2008 at 10:43 am
Garadin (11/25/2008)
The note about clustered indexing brings to mind something I came across in a section mockup yesterday. When making Foreign key reference tables (IE. Tying an employee to a warehouse. Employee can have multiple warehouses, and multiple Employees can use each warehouse) in a manner such as:EmployeeID WarehouseID
1 1
1 2
1 3
2 2
2 5
...
Both EmployeeID and WarehouseID would be Foreign keys. Should this table have an identity field and use that as the primary key, then a clustered index / unique constraint on (EmployeeID, WarehouseID), or should it just use these two as the primary key with no Identity field. I'm used to my "identity/rowpointer in every table" training wheels, but it seems like that would be extraneous in cases like this.
Also, is there any good software to help with the design/planning of large scale databases that would be a significant improvement over the database diagram tool in SSMS 2K8?
Unless this many-to-many table also defines a second set of structures, and sometimes even then, I would not add another column to act as PK on the table. I'd take those two columns as the PK and, in this case, I'd probably cluster them as well.
For designing databases, working with logical to physical modeling, etc., my favorite tool is still Embarcadero's ERStudio.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply