Well I’m not
a data modeler, nor do I play one in the corporate world. But it is TSQLTuesday No. 72, and I have my
tales to tell. Mickey Stewe, (b|t),
aka @SQLMickey, database architect, whose focus is on data modeling, has
invited us all to talk on the topic of, well, no surprise, data modeling. Her
invite has been posted on her blog here. Who else but a data modeling
expert would know when things go wrong, and that is what we discuss today. I have worked hands-on closely with
internationally renowned data architects in various places, and that is not an
understatement. No names today. I’ve certainly learned many things, but none
more so than what can go wrong with data modeling.
Data modeling
is a distinct (no pun intended) profession, not a core responsibility of a
typical DBA. I have had my hand at architecting small database structures, and
improving bigger ones, as performance lags.
On a basic level, everyone should know about database normalization for
relational databases. We know
normalization is the technique for organizing data in the database in order to minimize
data redundancy. This can optimize data retrieval operations, and ensure referential
integrity. The typical optimal design standards are the 1st, 2nd
and 3rd normal form. Yes
there are more, but then again we’re talking about things that go wrong. So here is a great primer of Data Modeling 101,
that everyone attempting to design a database should see. Quick primer, on data
modeling basics of the 3 forms of normalization.
First Normal Form
- Eliminate
repeating groups in individual tables.
- Create
a separate table for each set of related data.
Second Normal Form
- Create
separate tables for sets of values that apply to multiple records.
- Relate
these tables with a foreign key.
Third Normal Form
- Eliminate
fields that do not depend on the key.
So, there are
times you want to actually de-normalize your
database, but I’ll leave that for the experts above my DM pay-grade. However, often times, such as when developers
are designing the database model, don’t follow the simple rules, database chaos
ensues. It might not be apparent at the
outset, but as time goes by, and your SQL Server starts to creak, and data
grows exponentially, only then do you realize, wow, this thing is not designed
to scale.
One of the
biggest faux pas I’ve seen over the years is when a 3rd party vendor
has a great app, but apparently they didn’t bother hiring a data modeler. Many of these applications start out on other
platforms, such as Access, or Btrieve (remember that?), and such legacy
applications are migrated to SQL Server for the data backend. This often results in dozens of tables
without any relations defined and no primary or foreign keys. Half the tables
are 50 columns wide and no indexes. Most
of these tables have no understandable layout, no proper naming convention – ie
Table001, Table002, etc. Datatypes are
fudged in migration utilities, and everything comes over as a varchar (max). Oh, and by the way, you can’t place an index
on a varchar(max) column, so we should ideally limit its use. This of course wastes space, and results in an
awfully shoddy database design that won’t scale, and of course, performance
will suffer, customers will complain, and businesses will be jeopardized.
Not too long
ago at a client, one to remain nameless to protect me from any lawsuit J , was chugging along on its SQL Server,
until logging in took > 10 minutes thru the front end, and updating or retrieving
records in excess of >10 minutes.
That’s a life time for an end-client to wait, an absolutely unacceptable
for any business to survive. I even
cringed at the prospect of coming close to touching it, but was answering the
call of duty and plea from the CEO. We
made a lot of positive changes but a long way to go. This database, as it were, could not have
possible been designed by a data modeler or architect. I was told it was designed by one of the
premier database architects in the country.
Well, probably one of the premier developers,
but again, not naming names.
Even so, what
was created at the outset was changed and modified nteen times by revolving
group of developers, and certainly not the same pristine database design that
it started out to be. So, we cannot
fault the original database architect here, it’s been completely bastardized
since the original design. Therefore,
the biggest point in this discussion, is that not only must an awesome database
be perfectly designed, the data model must be maintained, and flexible to adapt
to changing requirements. New tables, column
views, stored procedures, functions, indexes, etc. may need to be added, modified,
or dropped. Therefore, the database
schema must be versioned, and the original model preserved.
My favorite
analogy to the aforementioned client is The
Nine Circles of Developer Hell, cleverly crafted by Kendra Little
of Brent Ozar Unlimited. While the
client’s application is still a work in progress, we categorized the database
as being in the 6th circle of
code hell – code heresy. We were able to bring it back from the 7th
circle - violent code, but there is long way thru the River Stix, to get
back to the 1st Circle, Code Limbo. There is a lot that can go
wrong. Before I signed up to help this
client, the contract should have read “Abandon
Hope All Ye Who Enter Here”
Adam
Machanic’s (b|t), one of the premier SQL
developers, started the T-SQL Tuesday blog party eons ago. Well, back in 2009. Here are the rules to T-SQL Tuesdaydom:
- Write a
blog post about the topic. Don’t have a blog? Start one!
- Include
the T-SQL Tuesday Logo and link it back to the original invitation post.
- Publish
your blog post Tuesday, November 10, 2015, between 00:00 GMT & 23:59
GMT.
- Leave a
reply on the original invite with a link to your blog post (for the
round-up).
- Share you
post with the community! Tweet it out using the #tsql2sday hashtag
Thanks for
hosting Mickey!
And of
course, if anyone is interested in learning more about my book Healthy SQL – A
Comprehensive Guide to Healthy SQL Server Performance, published by Apress, you can go to the url:
http://bit.ly/orderHealthySQLnow
You can also
get the book on Amazon: http://bit.ly/HealthySQLonAmazon
For all
things SQL, news, events, jobs, info, and other fun tweets, follow me on
twitter @Pearlknows and join the #HealthySQL campaign to
keep your SQL Servers healthy!