SELECT Top 1...?

  • Hi,

    I have this OrderShipping table which has the customer shipping address.

    OrderShipping (customerid, orderid, shipname, shipaddress, shipcity, shipstate, shipzip, datestamp)

    (100, 7344, 'Sam', '123 Street', 'New York', 'NY', '12345', '11/29/2006')

    (102, 7345, 'Mike', '123 Road', 'San Francisco', 'CA', '56345', '11/24/2006')

    (103, 7346, 'Dave', '123 Blvd', 'Chicago', 'IL', '72845', '11/25/2006')

    (100, 7347, 'Sam', '123 Street', 'New York', 'NY', '12345', '11/27/2006')

    (105, 7349, 'John', '123 Avenue', 'Los Angeles', 'CA', '72845', '11/29/2006')

    (105, 7350, 'John', '123 Parkway', 'San Diego', 'CA', '92945', '11/28/2006')

    I want to select one row (one shipping address) for each customer. If a customer has placed more than one order with us, I want to select the shipping address with the most recent time stamp. In the above example, of the 6 rows, I want to select rows 1,2,3,5 and exclude rows 4 and 6. Any idea how to do this, preferably in a single select query?

    Thanks.

  • I'd do something like this:

    SELECTos1.*

    FROMOrderShipping os1

    JOIN(SELECT customerid, MAX(datestamp) AS max_datestamp FROM OrderShipping GROUP BY customerid) os2

    ONos1.customerid = os2.customerid

    AND os1.datestamp = os2.max_datestamp

  • Yeah, the above would be OK if the datestamp column was a datetime. 

    It's possible that it has been defined as varchar(10-ish), in which case you'd be advised to do :

    SELECT os1.*

    FROM OrderShipping os1

    JOIN (SELECT customerid, MAX(convert(dateTime,datestamp)) AS max_datestamp FROM OrderShipping GROUP BY customerid) os2

    ON os1.customerid = os2.customerid

    AND convert(datetime,os1.datestamp) = os2.max_datestamp

  • Ummm...  and apologies to the performance tuners who have just shrieked with outrage - yes, there are probably more efficient ways of doing the comparisons..   my point was about the data type.

  • Thanks very much.

    I have another questions. Some of the old orders imported from another system does not have the date, the datestamp field is null (it is of type DATETIME). Any idea how to make it work? If there are more than one row for the same customer with datestamp null, we can select just any one row.

    Thanks.

  • >>we can select just any one row.

    So it's arbitrary then ?

    If so, just locate 1 other data element that uniquely identifies the row to act as a tie-breaker. Looks like OrderID column should do it.

    SELECT os1.*

    FROM OrderShipping os1

    JOIN (SELECT customerid,

         MAX(convert(dateTime,datestamp)) AS max_datestamp ,

         Max(Case When datestamp IS NULL Then -1 Else OrderID End) As max_OrderId

         FROM OrderShipping GROUP BY customerid) os2

    ON os1.customerid = os2.customerid

    AND (convert(datetime,os1.datestamp) = os2.max_datestamp Or

         os1.OrderId = os2.max_OrderID)

  • You can get multiple rows if a customer places more than one order a day.  To be sure you only get one row per customer you may have to break down and use a temp table. 

    CREATE

    TABLE #LastOrder (customerid int NOT null, orderid int NOT null)

    CREATE UNIQUE CLUSTERED INDEX IX_Dedupe ON #LastOrder (customerid) WITH IGNORE_DUP_KEY

    INSERT INTO #LastOrder (cusomerid, orderid)

    SELECT customerid, orderid

    FROM OrderShipping

    ORDER BY datestamp DESC, orderid DESC

    SELECT os.*

    FROM OrderShipping os

    JOIN #LastOrder lo ON os.customerid = lo.customerid AND os.orderid = lo.orderid

    DROP TABLE #LastOrder

    There is a more elegant single-SELECT solution in SQL 2005 using the ROW_NUMBER function.

    SELECT

    *

    FROM OrderShipping

    WHERE ROW_NUMBER() OVER(PARTITION BY customerid ORDER BY datestamp DESC, orderid DESC) = 1

  • Hang on there PW, I think the CASE statement in your subquery is backwards.  The way it is written it will return -1 as the orderid for all records with a null datestamp, which won't join with anything.  Once you fix that, your ON ... OR ... clause will return two rows for every customer that has records with and without datestamps.  This version might work (assuming 1 order per day).

    SELECT

    os1.*

    FROM OrderShipping os1

    JOIN (SELECT customerid, MAX(datestamp) AS max_datestamp, max(orderid) AS max_orderid FROM OrderShipping GROUP BY customerid) os2

    ON os1.customerid = os2.customerid

        AND (os1.datestamp = os2.max_datestamp OR (os2.max_datestamp IS NULL AND os1.orderid = os2.max_orderid))

  • Or just move the MAX inside the CASE:

    Case

    When Max(datestamp) IS NULL Then -1 Else OrderID End As max_OrderId

  • Yep, definitely backwards. Good catch.

    Must ... drink ... more ... caffeine.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply