Buck Woody recently asked a question; how do you design a database. He outlined the process he followed and asked four questions about how each of us do our work:
- What process do you follow?
- How important are the business requirements?
- What tool do you use to create the design, do you need it to diagram, do you even care about diagrams?
- What’s your biggest pain-point about designing?
Funny enough, I haven’t done a full on database design in over a year. My company just finished about 6 years of very hard-core engineering work, designing and building or redesigning and building, the majority of our data input and collection systems. Then, I was doing lots of design work. Now, we’re either largely in maintenance mode for most of those systems, or the few new major projects we’re working on are using CRM, no database design, or hoping that database design will never, ever matter and using nHibernate to build the database on the fly, based on the object model (this, by the way, is still in super-double-secret probation development mode. I haven’t seen what they’re producing). All that means we’re doing very little design work now. That will change.
Process
The process I follow isn’t radically different from Buck Woody’s. I get the business requirements, which are hopefully written on something more substantial than a cocktail napkin, and with more detail than will fit on one, and I read them. Or I meet with the business people and ask lots and lots of questions, which I’ll probably do even if I have written requirements. Sometimes, especially at my current company, I’ll get a full logical diagram from another team. I’ll still basically do the same design work, even if I have a logical model, but I’ll use it as a reference point to double-check my understanding of the requirements. From there:
- Identify the nouns. Figure out what kinds of objects, or things, or widgets that we’re talking about needing to store.
- Figure out the relationships between the widgets. What’s a parent and what’s a child and which of the widgets is related to many other widgets or to one other widget, etc.
- Lay out the attributes for the widget, meaning the columns in the table. Big points here include getting the data type correctly identified and figuring out which of the attributes are required and which are not. Further, which of the attributes come from pick lists, which means look-up tables.
- Identify the natural key. I’m with Buck, most of the time I use an alternate, artificial key (yeah, frequently an identity column), but I want to know the natural key so that I can put a unique constraint on the table, in addition to the primary key. This is a vital, but often missed step, in terms of the business processes being modeled.
- Figure out where I’m going to put the clustered index. Usually, but not always, this will be the primary key, but I do it as a fundamental part of the design. That means, as a fundamental part of the design, I think about the most common access path for the data. That’s going to be where the cluster needs to be.
How Important Are the Business Requirements
Buck, I respect you and I like you and I hate to do this, but you’re kidding with this question, right? No requirements, no database. The requirements are all. The most important reason you want the requirements written down is because that means that business at least thought them through, all the way, one time. It’s your best bet that you’re going to deliver something that slightly resembles what the business wants and needs. I live and breathe by the requirements. When someone from the business corrects my design, “Oh, that widget is related this thingie, not that watchamacallit,” I get them to change the requirements to reflect that new information. This way, when someone wonders why I did what I did, they’ll always be able to see the requirements the way I saw them.
Tools
I use Embarcadero’s ERStudio. I know there are other ER tools on the market, but I fell in love with this one on the day I needed to change the data type on a column that was in about 30 different tables and I did it in a couple of minutes using their scripting tool. I’ve been using it ever since, and we’re talking well more than ten years now. It’s just great. I only use it for the initial design and for documentation after the design. Once the design is done, the first time, and a physical database is constructed, I don’t work from the ER diagram to do builds and deployments, I work from source control. Do I have to do it like this? No, but I really enjoy the power of an ER tool while I’m doing the design because it lets you do a lot of changes, quickly and easily without having to rebuild a database over & over.
Biggest Pain Point
The largest pain point has to be changing requirements. Change happens. I embrace it. I’ve worked on agile projects and I like the general approach and mind set. And yet, changing databases is hard. It’s not so bad when you’re in the strict, isolated, ER diagram only stage, but as soon as you’ve built the database, even one time, change gets difficult. It’s not so bad if you work closely with the design, test & development teams and get their buy-in, early, that test data must be tossed & rebuilt with each fundamental design change. But that’s a hard agreement to get and so you end up spending a lot of time trying to retain the test data AND fundamentally redesign the data structure. This is not an enjoyable combination.
That’s about it. I would say the one thing I try to do, and it’s not easy, is be open to doing silly stuff. I try, and I don’t always succeed, to let the business or developers or logical modelling team make silly choices after I carefully tell them why they’re silly and the likely outcome. I do this because fighting them on every single silly decision is a losing proposition. Plus, it saves you for the fights against the stupid, as opposed to silly, things that come along occasionally.