I missed the very first T-SQL Tuesday, so when this month’s topic of second chances came up, I decided to write that one.
If you are unsure of what T-SQL Tuesday is, follow the link to this month’s topic to get the rules and description and then write a blog post.
Date/Time Challenges
I picked an easy one, but one that I continue to see asked in the forums by people new to SQL Server. I suspect we’ll see less questions over time as more people take advantage of the new DATE and TIME datatypes in SQL Server 2008 and later, but maybe not. Lots of people are still sure that they need to keep those items together.
In any case, have you ever seen sales data like this:
OrderID OrderDate CustomerID OrderAmount
———– ———————– ———– ————
1 1982-05-19 06:31:48.950 1 579040.5070
2 1994-11-27 17:14:41.790 2 348808.5860
3 1972-11-08 17:40:01.170 3 758992.3650
4 1972-05-31 01:19:05.530 4 779853.1990
5 1994-12-22 10:40:57.410 5 666173.8040
6 1974-04-03 01:42:29.490 6 134218.2330
7 1976-06-22 15:21:18.910 7 322938.6950
8 1953-08-05 23:00:34.620 8 14169.7580
9 1971-08-16 22:33:28.970 9 586057.3820
10 2002-03-28 13:08:00.420 10 632785.0760
Here’s some DDL, you can create your own data, but here are a few rows:
CREATE TABLE SalesOrders ( OrderID INT IDENTITY(1,1) , OrderDate DATETIME , CustomerID INT , OrderAmount NUMERIC(10, 4) ) INSERT SalesOrders (OrderDate, CustomerID, OrderAmount) VALUES ( '1982-05-19 06:31:48.950', 1, 579040.5070), ( '1994-11-27 17:14:41.790', 2, 348808.5860), ( '1972-11-08 17:40:01.170', 3, 758992.3650), ( '1972-05-31 01:19:05.530', 4, 779853.1990), ( '1994-12-22 10:40:57.410', 5, 666173.8040)
If I want to get all the sales in May of 1972, I get query them all like this:
SELECT OrderDate , OrderAmount FROM SalesOrders WHERE OrderDate > '1972/5/1' AND OrderDate <= '1972/5/31'
I get 4 rows back. That’s something people often write when they get input from a user. A user has a start and end edit box, they enter “1972/5/1′” in the start box (or use a calendar picker) and then enter “1972/5/31” in the other. I’m using ISO dates to make this clear, though in the US it would normally display like “5/1/1972” and in the UK as “1/5/1972”.
However, that isn’t quite correct. If I run this query:
SELECT OrderDate , OrderAmount FROM SalesOrders WHERE MONTH(OrderDate) = 5 AND YEAR(OrderDate) = 1972
I actually get 6 rows. The data rows for May 1972 are:
OrderDate OrderAmount
———————– —————————-
1972-05-31 01:19:05.530 779853.1990
1972-05-13 09:26:52.590 676848.9700
1972-05-28 21:05:28.840 923425.0510
1972-05-07 10:59:09.930 266079.6480
1972-05-01 04:21:01.250 464241.6480
1972-05-31 01:19:05.530 779853.1990
What’s happening?
If you look at the OrderDates for May 31, you see two values that have a time of 1:19:05am. Those are excluded from the query, which has an end date of “1972/5/31”. Why? This query:
SELECT CAST( '1972/5/31' AS DATETIME)
shows why. It returns:
1972-05-31 00:00:00.000
That’s midnight between the 30th and 31st, which is before 1:19:05am. When a datetime value is converted in SQL Server, without a time component, it defaults to the beginning of the day. That works great for the start date, not so good for the end date.
Fixing this
There are two real fixes here. Well, maybe more. You can query on the month and year, but those functions can disrupt indexes, so I don’t recommend them. The two main fixes are:
- add a time component
- add a day
The first fix is the addition of the last time of the day to your query.
SELECT OrderDate , OrderAmount FROM SalesOrders WHERE OrderDate > '1972/5/1' AND OrderDate <= '1972/5/31 23:59:59.997PM'
In some type of code, it looks more like this:
DECLARE @end DATETIME SELECT @end = '1972/5/31' SELECT @end = @end + '23:59:59.997' SELECT OrderDate , OrderAmount FROM SalesOrders WHERE OrderDate > '1972/5/1' AND OrderDate <= @end
Assume the first select is actually coming from the user.
The second fix is to add a day, and actually query like this, which is what I’d recommend:
SELECT OrderDate , OrderAmount FROM SalesOrders WHERE OrderDate > '1972/5/1' AND OrderDate < '1972/6/1'
In this case, instead of querying for May 31, we move to June 1 (the next day) and then change the <= to a < only. This gets all orders occurring up until the end of May 31, but before June 1.
Easy fixes, but so often there’s code that doesn’t allow for the time component. Take a minute and check your reports, and be sure you aren’t underreporting any data to your clients or customers.
Also be sure to check out the new datatypes in SQL Server 2008 and later:
- Date
- Time
- DateTime2 (better precision)
- Datetimeoffset (seriously consider this)
Filed under: Blog Tagged: sql server, syndicated, T-SQL, T-SQL Tuesday