window function games

  • Hi there
    I'm playing a little with qualified running totals, but I'm stuck.

    Think of a "classic" line item table, consider it orders, bookings, transactions or whatever. There is a grouping element, such as a customer number, some date or ID and a payload attribute, such as an amount:

    create table orders (
    customer int,
    purchase date,
    amount int
    )
    ;

    insert into orders
    (customer, purchase, amount)
    values
    (10, getdate(), 100)
    ;
    insert into orders
    (customer, purchase, amount)
    values
    (10, getdate() + 1, 50)
    ;
    insert into orders
    (customer, purchase, amount)
    values
    (10, getdate() + 1, 25)
    ;

    insert into orders
    (customer, purchase, amount)
    values
    (20, getdate(), 20)
    ;
    insert into orders
    (customer, purchase, amount)
    values
    (20, getdate() + 1, 10)
    ;

    insert into orders
    (customer, purchase, amount)
    values
    (30, getdate() + 1, 200)

    Now say, I want all customers whose last two bookings were above 150 bucks. If there are less than two bookings but their value meets the limit, I also want them.
    In the above example, that's customer 10 and customer 30. Also I would like to specify a date range, say the last 30 days.

    But im stuck here :-/
    select
    customer,
    sum(amount) over (partition by customer order by purchase rows between 2 preceding and current row)
    from orders

    Any help appreciated 🙂
    Roger

  • Hi Roger,

    I got your desired output by using the below query. See if fits your requirement:
    select customer,SUM(consolidated_amount)as total_amount from
    (
    select b.customer,b.purchase,SUM(amount) as consolidated_amount
    from orders b
    where purchase not in( select dat from(select customer,MAX(purchase)as dat from orders
    group by customer)a)----to get 2nd maximum purchase date
    or
    purchase in( select dat from(select customer,MAX(purchase)as dat from orders
    group by customer)a) ----to get 1st maximum purchase date
    group by b.customer,b.purchase)cc
    where cc.consolidated_amount>=150
    group by cc.customer;

    Cheers,
    Saravanan

    Saravanan

  • SELECT c.CustomerID
        , SUM(t.amount) AS TotalAmount
    FROM Customer c
    CROSS APPLY (SELECT TOP 2 o.*
                FROM Orders o
                WHERE o.Customer = c.CustomerID) t
    GROUP BY c.CustomerID
    HAVING SUM(t.Amount)>150;

    ?

  • pietlinden - Sunday, September 10, 2017 12:43 PM

    SELECT c.CustomerID
        , SUM(t.amount) AS TotalAmount
    FROM Customer c
    CROSS APPLY (SELECT TOP 2 o.*
                FROM Orders o
                WHERE o.Customer = c.CustomerID) t
    GROUP BY c.CustomerID
    HAVING SUM(t.Amount)>150;

    ?

    The OP doesn't have a customer table in their sample data (although I hope they do!). If not, OP, you should consider using 3NF for your table design. Using only the data we have, one method would be using a CTE to return only the 2 latest orders, and then use a HAVING clause:
    WITH RNs AS (
      SELECT *,
            ROW_NUMBER() OVER (PARTITION BY customer ORDER BY purchase ASC) AS RN
      FROM orders)
    SELECT customer,
           SUM(amount) AS TotalAmount
    FROM RNs
    WHERE RN <=2
    GROUP BY customer
    HAVING SUM(amount) >= 150;

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Well, it's a fictitious table 😉

    My solution - after a weekend of thinking - is very similar (I'm new to SQL Server)

    What do you think? I used a sub-query and instead of row_number I went for rank. Not sure, if the sub-query would have Impacts on Performance or scalablility...

    select
      customer,
      sum(amount)
    from (
      select
        a.*,
        rank() over(partition by customer order by purchase desc) as orderNr 
      from orders a
    )
    where
      orderNr <= 2
    having sum(amount) >= 150
    group by
      customer 

  • You should have a few cases in a test that you can use to verify this works for all cases. That's what I'd do for this type of query where you might not think of all cases up front. Esp since you noted that you'd have a date range.

    Apart from that. Test this at scale in your environment. Be sure you put 1mm rows in a table and check how this performs.

    I'd also say that for readability, I would use a CTE in the last query instead of a subquery, but that's personal preference.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply