This is embarrassing, and thanks for Cary for complaining to me and forcing me to look at things.
Awhile back I had a request by someone to reorder the Virtual Briefcase by date rather than name since this user had a bunch of articles, something like 30 or 40, that they'd saved in there and wanted to find them based on a time period. It made sense to me and so rather than change around a bunch of stuff, I altered the existing stored procedure and copied the current query, adding a new order by with the date field. I then wrapped the two queries in an IF statement to switch between them on a parameter and added the parameter to the page. Uploaded it, tested to be sure it didn't break anything and implemented it.
And life was good until Carey pinged me to say that he had articles showing up in this order:
11/04/2003 11/02/2004 10/30/2003 10/29/2003 10/22/2002
Even someone that's been doing lots of things besides development can see something wrong here. So I dug in and found something interesting that I thought some others might want to know. I'm sure lots of you know the issue, but for those of you that don't, this is one that might get you or your developers if you are not careful.
The Problem
I'll use Northwind for an example, since it's one that most DBAs have access to on a test server. You can also reset it up from the SQL Server media if you need to. Let's look at the orders table, in particular, this query:
select top 5 orderid , customerid , requireddate from orders order by requireddate desc
This is straightforward, give me the 5 orders that need to go out by the most forward "requireddate" on back. Running this gives me
orderid customerid requireddate ----------- ---------- ------------------------------------------------------ 11061 GREAL 1998-06-11 00:00:00.000 11059 RICAR 1998-06-10 00:00:00.000 11074 SIMOB 1998-06-03 00:00:00.000 11075 RICSU 1998-06-03 00:00:00.000 11076 BONAP 1998-06-03 00:00:00.000
Notice that I have the June 11, 98 order first, then the June 10, the 3rd, etc. This is what I wanted, giving me the orders that will be due and working backwards. Running something like this with some flag to note if I've processed them allows me to see the newest orders coming in. I know it's a little funny, but the dates made more sense with the requireddate than the orderdate in this example.
So now suppose that your developer says that the date time thing messes him up and can you please just send back the date formatted a little nicer. Ok, no problem, so you change the query:
select top 5 o.orderid , o.customerid , CONVERT(char(10), o.requireddate, 101) as requireddate from orders o order by requireddate desc
Easy enough, we add a convert to send this back as a character value instead of a datetime and then we use the 101 conversion code to specify that we want mm/dd/yyyy (see BOL - Cast and Convert. Because this is going back to a client and it's not a query we're reading in QA, we want to give the column a name that can be referenced. So we choose the same name, requireddate, to make it easier on the developer. How many of you have changed a name and had a developer scream? And if they didn't, it it because they reference values by position? Tsk, tsk, that's a no-no.
In any case, you might run this, see that it runs and send it on. But did you examine the results carefully? They look like this:
orderid customerid requireddate ----------- ---------- ------------ 10763 FOLIG 12/31/1997 10764 ERNSH 12/31/1997 10370 CHOPS 12/31/1996 10371 LAMAI 12/31/1996 10761 RATTC 12/30/1997
Notice that the first date isn't the June 11th, 1998, but rather December 31st, 1997!?!?!!!?
Whoopsi. Hopefully things weren't too date dependent.
Actually that's a major insert=your-favorite-four-letter-word-here mistake. Most DBAs would be ripping a developer that made that kind of mistake and I'm proud to say that I did it. Actually I'm not that proud, but I am willing to admit my mistake and expose myself to your ridicule. Please use the forum under "Your Opinion" below or the "Join the Discussion" above and don't blast my email.
Still, it's interesting and I bet a few people get caught by this because adding a convert doesn't. While I'm sure that there are some better explanations and more technical ones, I'll give you mine because I think it's easier to understand.
I grabbed the query plan and put it up here:
Notice the order in which things are done. First the rows are retrieved. Then the new column is computed, this being the CONVERTed date field. Lastly the sort takes place, not on the original column value as stored in the table, but rather the new computer column, as changed to a character format. In this case, the worktable in tempdb, which houses only the result data, including the computed character column, is used for the sorting. So you get 12/31 as a higher value than 06-11 and so it comes first.
So how do you avoid this?
First of all, you read lots of articles like this one where someone else has made the mistake, you chuckle at them, and then it sticks in your mind as something to watch out for. Next you rename columns when you change their values. That way when someone comes down the road and adds a new order by or group by, it doesn't affect things. That may not always work and there are definitely reasons to keep the column name the same, so then you rely on rule 1 🙂
The other thing you do, and which I did not, is test carefully. Even the smallest change can affect results in ways you did not think of. That's what happened here and I thought that a simple order by would solve things. I didn't check the data carefully. I most likely used my briefcase, which has 3 articles in it, so the issue may not have even appeared for me.
As with most things, this came down to proper QA and testing. We all move too fast at times and sometimes it matters, and sometimes it doesn't. Fortunately at SQLServerCentral.com, we're providing information and most of the services we have are simple conveniences, not ones that make your business succeed or fail. We test differently for different things because time is involved and those things that don't matter as much, such as the Briefcase, get less testing than things like the shopping cart.
I'm sure you see similar things in your company. Mistakes and bugs will slip through. They do in all software, even in lots of other industries. What you try your best to do is learn and make fewer ones as you move forward.
Got an opinion or comment, add it using the "Your Opinion" button below.
Steve Jones ©2005 dkranch.net