Does my database have data type drift?

Daniel Janik, 2016-07-13 (first published: 2016-06-28)

Over the years I have come to see that nearly every database has what I call data type drift. Simply put, data type drift is when you have columns with the same name but different data types or lengths. I’d say about 97% of the databases I’ve reviewed have some form of drift. So why is that number so high?

Have you ever gone out to eat at your favorite restaurant and noticed that something in your favorite meal was a bit off? This happened to me. My favorite Chinese restaurant had the same cook for 17 years, and one day he decided to become a driver for DHL. I might never have noticed, because he wasn’t visible from the front of the house, but I knew the food was quite different. I must have had the same Kung Pao 100 or more times, and this was not the same. I knew the owner well, and he told me about the cook.

This same scenario happens over time with an application. As developers come and go, the database can drift. One developer has always used varchar(20) for FirstName and the next developer has always used nvarchar(50).

This may happen all at once as well. Consider an application that has different modules, each with its own developer or development team. When a data architect isn’t present and the data model is not restricted through an ERD, you get drift.

Why is this important?

When columns that don’t have the same data type are used in a join or union, an implicit conversion is performed to make one column match the other. This consumes more I/O, processor, and memory, which means longer processing time for the affected queries. In short, things are slower, and it’s fairly easy to fix.

How do I identify drift?

I wrote a script for this around eleven years ago and really haven’t updated it since.

Get the script here: Microsoft TechNet Gallery

If any column the script flags is used in a union or join, then it should probably be corrected.
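To make the idea concrete, here is a minimal Python sketch of the kind of check involved. It is not the TechNet script, only an illustration under stated assumptions: the column metadata has already been pulled into tuples of (table, column, data type, max length), for example from SQL Server's `INFORMATION_SCHEMA.COLUMNS` view, and the function name `find_drift` and the sample rows are hypothetical.

```python
from collections import defaultdict


def find_drift(columns):
    """Return column names declared with more than one data type or length.

    `columns` is an iterable of (table, column, data_type, max_length)
    tuples; max_length may be None for types that have no length.
    """
    definitions = defaultdict(lambda: defaultdict(list))
    for table, column, data_type, max_length in columns:
        # Assumption: names are compared case-insensitively.
        key = column.lower()
        definitions[key][(data_type.lower(), max_length)].append(table)

    # Keep only the names that have two or more distinct definitions.
    return {
        name: {defn: sorted(tables) for defn, tables in defs.items()}
        for name, defs in definitions.items()
        if len(defs) > 1
    }


# Hypothetical metadata, for illustration only.
sample_columns = [
    ("Customer", "FirstName", "varchar", 20),
    ("Employee", "FirstName", "nvarchar", 50),
    ("Customer", "CustomerID", "int", None),
    ("Orders", "CustomerID", "int", None),
]

drift = find_drift(sample_columns)
for name, defs in drift.items():
    for (data_type, max_length), tables in defs.items():
        print(name, data_type, max_length, tables)
```

With the sample rows, only FirstName is reported, because CustomerID is declared the same way in both tables. Grouping by name first and then by definition keeps the list of tables for each variant, which helps when deciding which side to change.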

Here’s an example of the output:

Researching each column is easy, just use the following query:

I’ve got some drift. Should I fix it?

I recommend updating a column only if it’s frequently used in a join or union with another column whose data type does not match. If you’re never going to query them together, then don’t bother. A good example of this is the [description] column. This column name is used across various tables that have nothing to do with each other. These columns are not used in joins and never will be, so there is no need to make them match.

When creating a new database, it is recommended that you use a data modeling tool that enforces “domains” on column names: any time a column name is used again, it automatically gets the same data type and length.
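The domain idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how any particular modeling tool works; the `DOMAINS` catalog, the function names, and the column definitions are all hypothetical.

```python
# Hypothetical domain catalog: one agreed definition per column name.
DOMAINS = {
    "FirstName": "nvarchar(50)",
    "LastName": "nvarchar(50)",
    "CustomerID": "int",
}


def column_definition(name, nullable=True):
    """Build a column definition from the domain catalog.

    Raises KeyError for names without a domain, so a new name has to be
    added to the catalog before it can be used in a table.
    """
    data_type = DOMAINS[name]
    null_clause = "NULL" if nullable else "NOT NULL"
    return f"[{name}] {data_type} {null_clause}"


def create_table(table, columns):
    """Render a CREATE TABLE statement from (name, nullable) pairs."""
    body = ",\n    ".join(
        column_definition(name, nullable) for name, nullable in columns
    )
    return f"CREATE TABLE [{table}] (\n    {body}\n);"


print(create_table("Customer", [("CustomerID", False), ("FirstName", True)]))
```

Because every table definition goes through the same lookup, a second developer cannot quietly declare FirstName with a different type or length.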


