One of the most common mistakes people make when writing T-SQL is using a function in the where clause, and perhaps the most common of those is the use of the Datediff function for retrieving the rows from the last x minutes/hours/days and so on.
Let’s look at the following query, which retrieves orders from the last 7 months:
SELECT * FROM Sales.Orders WHERE DATEDIFF(DAY, DateAndTime, GETDATE()) < 210 ORDER BY DateAndTime DESC
The problem is that wrapping a column with a function prevents SQL Server from performing an index seek when executing the query, which hinders performance. When running the query, we get the following execution plan:
The solution to this problem is to rewrite the query so that the column is outside of the function. In this case, the query will be rewritten to:
SELECT DateAndTime, CustomerId FROM Sales.Orders WHERE DateAndTime > DATEADD(DAY, -210 , GETDATE()) ORDER BY DateAndTime DESC
And after the rewrite, we get better performance and a better execution plan:
Pretty simple right?
Well, it is, and since this problem is pretty common, I used this solution in many performance tuning sessions I performed over the years.
There’s a slight problem though: This solution isn’t 100% accurate.
When carefully looking at the results, I find out that for the first query, I get 5859 rows, and for the second query, I get 5988 rows. Where does this difference come from?
It comes from the fact that the two functions compare dates a little differently.
The Datediff function’s unit of comparison, in this case, is the day. So it takes the day of today (we got from the Getdate function) and the day of each row in the table, and compares between them. The Dateadd function, however, takes exactly 210 days back to the exact time of the query on that day, and then checks for each row whether it’s bigger or smaller than that date and time.
And that’s why we’ve been doing it wrong for years. The Datediff and Dateadd aren’t the inverse function of each other. They operate differently.
Now, in most cases, this will probably won’t be a real problem (and in some cases the Dateadd function will actually give the accurate answer). But for the cases where it is, the solution is pretty simple: Use both functions. The Dateadd function will allow using the index, and the Datediff function will give the accuracy:
SELECT * FROM Sales.Orders WHERE DateAndTime > DATEADD(DAY, -210 , GETDATE()) AND DATEDIFF(DAY, DateAndTime, GETDATE()) < 210
The post Datediff: We’ve Been Doing It Wrong appeared first on Madeira Data Solutions.