Earlier, i discussed FIRST_VALUE() function. FIRST_VALUE() returns the first value from an ordered set of records. Similarly LAST_VALUE() returns the last value for an ordered result set, but you are in for a bit of a surprise if you aren’t familiar with the enhancements to the SQL windows functions.
I’ll use the same example data i used earlier.
CREATE TABLE Customers
(CustId INT, OrderDate DATE, OrderAmount MONEY) ;
INSERT Customers VALUES
(1, ’2011-12-03′, 12500)
, (1, ’2011-10-07′, 15000)
, (1, ’2011-09-12′, 16000)
, (2, ’2011-07-11′, 900)
, (2, ’2011-09-13′, 18000)
, (2, ’2011-11-03′, 7000)
, (3, ’2011-01-29′, 19000)
, (3, ’2011-04-19′, 1000)
, (3, ’2011-07-09′, 12000) ;
We wrote a query using FIRST_VALUE() to calculate the date of first purchase for each customer.
SELECT
c.CustId, c.OrderDate, c.OrderAmount
, FIRST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate) CustomerSince
FROM
Customers c
Now, let’s say we want to know the recent purchase date for each customer. Similar to FIRST_VALUE(), there is a LAST_VALUE() function. LAST_VALUE() allows you to return the last value from an ordered result set.
I’ll follow my killer instincts and replace FIRST_VALUE() with LAST_VALUE() in the above query. It should work. Right?
SELECT
c.CustId, c.OrderDate, c.OrderAmount
, LAST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate) CustomerSince
FROM Customers C
Result Set:
… but, oops, this is not the result i was expecting.
There are two points to note here. Window and Frame.
A window is a set of records that are defined by the OVER() clause. Because of the PARTITION BY Custid, you can safely think records for each Custid form a window in this example.
A Frame is a range of rows within a window; a subset of a window in other words, defined by the ROWS clause. A range has boundaries – a begin position and an end position.
Look back at the query using LAST_VALUE(). You’ll notice there wasn’t a ROWS clause. We haven’t defined a frame. In this case SQL Server uses the default frame which is, between the first row without any boundaries (first row of the window partition) and the current row. Last value from this default frame will always be the current row.
Correct usage of the function should instead be:
SELECT
c.CustId, c.OrderDate, c.OrderAmount
, LAST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM Customers C
Given all this, FIRST_VALUE() is providing expected results even with the default frame because the first row in the default frame is the first row of the window partition.
Here is an interesting article from Jeremiah Peschka (blog) for further reading: Leaving the Windows Open.
~Sam
Follow Sam on Twitter - @SamuelVanga