Running total with a twist

  • Hi!

    I'm trying to optimize a cursor-solution for following problem:

    A customer can have a number of orders with a total value and a credit limit.
    For instance:
    Customer A:
    Credit limit 100 $
    Order 1: 50$
    Order 2: 30$
    Order 3: 30$

    The idea is to filter out orders per customer with total sum "fit inside" the limit without going over it sorted by order number, in this case Order 1 + 2.
    This case is quite easy to solve with a SUM OVER ORDER BY.
    But there can be customer with following structure:

    Customer B:
    Credit limit 100 $
    Order 1: 50$
    Order 2: 70$
    Order 3: 20$
    Order 4: 20$

    In this case, Order 1, 3, 4 together should fit inside his limit.

    So the logic is: take order one by one sorted by order number, if accumulated order value is inside the limit, this order is "OK". If order is over the limit, discard it, and proceed to next order.

    Is there a way to solve it without loops and cursors? Note, that we don't want to optimize solution to fit as many orders as possible inside the limit, since this problem is NP-complete i think.

    //Sigge

  • siggemannen - Friday, May 4, 2018 5:34 AM

    Hi!

    I'm trying to optimize a cursor-solution for following problem:

    A customer can have a number of orders with a total value and a credit limit.
    For instance:
    Customer A:
    Credit limit 100 $
    Order 1: 50$
    Order 2: 30$
    Order 3: 30$

    The idea is to filter out orders per customer with total sum "fit inside" the limit without going over it sorted by order number, in this case Order 1 + 2.
    This case is quite easy to solve with a SUM OVER ORDER BY.
    But there can be customer with following structure:

    Customer B:
    Credit limit 100 $
    Order 1: 50$
    Order 2: 70$
    Order 3: 20$
    Order 4: 20$

    In this case, Order 1, 3, 4 together should fit inside his limit.

    So the logic is: take order one by one sorted by order number, if accumulated order value is inside the limit, this order is "OK". If order is over the limit, discard it, and proceed to next order.

    Is there a way to solve it without loops and cursors? Note, that we don't want to optimize solution to fit as many orders as possible inside the limit, since this problem is NP-complete i think.

    //Sigge

    You've been here long enough to know that you should provide DDL & corresponding INSERT statements if you want someone to provide a working solution.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Sorry about that:

    CREATE TABLE #t_customers (cust_id INT PRIMARY KEY, limit NUMERIC(19, 2))
    CREATE TABLE #t_orders (cust_id INT, order_no INT PRIMARY KEY, tot_value NUMERIC(19, 2))

    INSERT INTO #t_customers (
        cust_id
    ,    limit
    )
    VALUES    (1, 100), (2,100)

    INSERT INTO #t_orders (
        cust_id, order_no, tot_value
    )
    VALUES
         (1, 1, 50)
    ,    (1, 2, 30)
    ,    (1, 3, 30)
    ,    (2, 4, 50)
    ,    (2, 5, 70)
    ,    (2, 6, 20)
    ,    (2, 7, 20)

    SELECT    *
    FROM    #t_customers
    SELECT    *
    FROM    #t_orders

    --Expected result:

    SELECT    *
    FROM    (
        VALUES (1, 1)
        ,    (1, 2)
        ,    (2, 4)
        ,    (2, 6)
        ,    (2, 7)
    ) t (cust_id, order_no)

  • A recursive CTE works for this, and could provide you with a template for a quirky update:


    ;WITH CalculatedData AS (
     SELECT
      c.cust_id, o.order_no, o.tot_value, c.limit
     FROM #t_customers c
     INNER JOIN #t_orders o
      ON o.cust_id = c.cust_id
    ),
    rCTE AS (
     SELECT ThisRow.*,
      RT = tot_value,
      x.Keeper
     FROM CalculatedData ThisRow
     CROSS APPLY (SELECT Keeper = 0) x
     WHERE order_no = 1
     UNION ALL
     SELECT ThisRow.*,
      RT = CAST(CASE
       WHEN x.keeper = 1 THEN [LastRow].RT + ThisRow.tot_value
       WHEN x.keeper = 2 THEN [LastRow].RT 
       WHEN x.keeper = 0 THEN ThisRow.tot_value END
       AS NUMERIC(19,2)),
      x.Keeper
     FROM CalculatedData ThisRow
     INNER JOIN rCTE [LastRow]
      ON [LastRow].order_no + 1 = ThisRow.order_no
     CROSS APPLY (SELECT Keeper = CASE
       WHEN [LastRow].cust_id <> ThisRow.cust_id THEN 0
       WHEN [LastRow].cust_id = ThisRow.cust_id AND [LastRow].RT + ThisRow.tot_value <= ThisRow.limit THEN 1
       WHEN [LastRow].cust_id = ThisRow.cust_id THEN 2 
       END) x
    )
    SELECT * FROM rCTE WHERE Keeper <> 2

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Doh, of course, recursive CTE! Thanks!
    I added some more testdata and rewrote your code to work with it 

    CREATE TABLE #t_customers (cust_id INT PRIMARY KEY, limit NUMERIC(19, 2))
    CREATE TABLE #t_orders (cust_id INT, order_no INT PRIMARY KEY, tot_value NUMERIC(19, 2))

    INSERT INTO #t_customers (
        cust_id
    ,    limit
    )
    VALUES    (1, 100), (2,100), (3,100)

    INSERT INTO #t_orders (
        cust_id, order_no, tot_value
    )
    VALUES    (1, 1, 50)
    ,    (1, 2, 30)
    ,    (1, 3, 30)
    ,    (2, 4, 50)
    ,    (2, 5, 70)
    ,    (2, 6, 20)
    ,    (2, 7, 20)
    ,    (3, 8, 120)
    ,    (3, 9, 90)
    ;WITH CTE AS (
            SELECT    c.cust_id, o.order_no, o.tot_value, c.limit
            ,    ROW_NUMBER() OVER(PARTITION BY c.cust_id ORDER BY order_no) AS sort -- Get sort for each customer, so we can use rCTE easier
            FROM    #t_customers c
            INNER JOIN #t_orders o
                ON    o.cust_id = c.cust_id
        )
    ,    CTE2 AS (
            SELECT    c.cust_id
            ,    c.order_no
            ,    l.inside
            ,    CAST(c.limit - inside * c.tot_value AS NUMERIC(19, 2)) AS credit_left
            ,    c.sort
            FROM    CTE c
            CROSS APPLY (
                    SELECT    CASE WHEN c.tot_value <= c.limit THEN 1 ELSE 0 END AS inside
                ) l
            WHERE    c.sort = 1 -- "anchor" row

            UNION ALL
            SELECT    c.cust_id
            ,    c.order_no
            ,    l.inside
            ,    CAST(c2.credit_left - l.inside * c.tot_value AS NUMERIC(19,2)) AS credit_left
            ,    c.sort
            FROM    CTE c
            INNER JOIN CTE2 c2
                ON    c2.cust_id = c.cust_id
                AND    c2.sort + 1 = c.sort    -- Next order
            CROSS APPLY (
                    SELECT    CASE WHEN c2.credit_left >= c.tot_value THEN 1 ELSE 0 END AS inside
                ) l
        )
    SELECT    cust_id, order_no
    FROM    CTE2 c
    WHERE    c.inside = 1
    ORDER BY cust_id, order_no

  • siggemannen - Sunday, May 6, 2018 6:31 AM

    Doh, of course, recursive CTE! Thanks!
    I added some more testdata and rewrote your code to work with it 

    CREATE TABLE #t_customers (cust_id INT PRIMARY KEY, limit NUMERIC(19, 2))
    CREATE TABLE #t_orders (cust_id INT, order_no INT PRIMARY KEY, tot_value NUMERIC(19, 2))

    INSERT INTO #t_customers (
        cust_id
    ,    limit
    )
    VALUES    (1, 100), (2,100), (3,100)

    INSERT INTO #t_orders (
        cust_id, order_no, tot_value
    )
    VALUES    (1, 1, 50)
    ,    (1, 2, 30)
    ,    (1, 3, 30)
    ,    (2, 4, 50)
    ,    (2, 5, 70)
    ,    (2, 6, 20)
    ,    (2, 7, 20)
    ,    (3, 8, 120)
    ,    (3, 9, 90)
    ;WITH CTE AS (
            SELECT    c.cust_id, o.order_no, o.tot_value, c.limit
            ,    ROW_NUMBER() OVER(PARTITION BY c.cust_id ORDER BY order_no) AS sort -- Get sort for each customer, so we can use rCTE easier
            FROM    #t_customers c
            INNER JOIN #t_orders o
                ON    o.cust_id = c.cust_id
        )
    ,    CTE2 AS (
            SELECT    c.cust_id
            ,    c.order_no
            ,    l.inside
            ,    CAST(c.limit - inside * c.tot_value AS NUMERIC(19, 2)) AS credit_left
            ,    c.sort
            FROM    CTE c
            CROSS APPLY (
                    SELECT    CASE WHEN c.tot_value <= c.limit THEN 1 ELSE 0 END AS inside
                ) l
            WHERE    c.sort = 1 -- "anchor" row

            UNION ALL
            SELECT    c.cust_id
            ,    c.order_no
            ,    l.inside
            ,    CAST(c2.credit_left - l.inside * c.tot_value AS NUMERIC(19,2)) AS credit_left
            ,    c.sort
            FROM    CTE c
            INNER JOIN CTE2 c2
                ON    c2.cust_id = c.cust_id
                AND    c2.sort + 1 = c.sort    -- Next order
            CROSS APPLY (
                    SELECT    CASE WHEN c2.credit_left >= c.tot_value THEN 1 ELSE 0 END AS inside
                ) l
        )
    SELECT    cust_id, order_no
    FROM    CTE2 c
    WHERE    c.inside = 1
    ORDER BY cust_id, order_no

    With no offense meant to Chris, a well written WHILE loop might actually do better than the RBAR of a Recursive CTE and use fewer resources, too boot.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • siggemannen - Friday, May 4, 2018 5:34 AM

    Hi!

    I'm trying to optimize a cursor-solution for following problem:

    A customer can have a number of orders with a total value and a credit limit.
    For instance:
    Customer A:
    Credit limit 100 $
    Order 1: 50$
    Order 2: 30$
    Order 3: 30$

    The idea is to filter out orders per customer with total sum "fit inside" the limit without going over it sorted by order number, in this case Order 1 + 2.
    This case is quite easy to solve with a SUM OVER ORDER BY.
    But there can be customer with following structure:

    Customer B:
    Credit limit 100 $
    Order 1: 50$
    Order 2: 70$
    Order 3: 20$
    Order 4: 20$

    In this case, Order 1, 3, 4 together should fit inside his limit.

    So the logic is: take order one by one sorted by order number, if accumulated order value is inside the limit, this order is "OK". If order is over the limit, discard it, and proceed to next order.

    Is there a way to solve it without loops and cursors? Note, that we don't want to optimize solution to fit as many orders as possible inside the limit, since this problem is NP-complete i think.

    //Sigge

    Cursor looks like the best bet rather than loops because for each iteration of the loop, you'll need to do a lookup and then spool the keys of the detail that fit the loop. If you want to use modern windowing queries, is it possible that you could do the opposite of what my bank used to do, ie., fill the orders from the lowest dollar amount to the highest? Sure, probably not optimal for all orders, but you said you didn't want to search that NP-incomplete looking solution space anyways, just an idea! Save the running total (even if its over the amount) to another table, then just select the rows where the running total ISN'T over the credit limit and then apply those rows to the credit limit. The programming then becomes much simpler, but then you WILL be deprioritizing orders based on increasing price, is it possible that you can let folks prioritize individual orders then apply that to your windowing function?

  • I agree with Patrick, There are edge cases that are not described in the original problem.  Is the challenge to get as many orders in as possible, or get as close as possible to the credit limit.  The first is easiest, just sort by lowest to highest.  The second is much harder as you will need to consider an indeterminate number of combinations for each customer. you could reduce your workload by filtering out customers who's entire order sum is inside their credit limit, customers who have no orders inside their credit limit and customers who only have one order inside their credit limit.

    It feels like a variation on the travelling salesman or box packing problem.

    No good solutions come to mind.  How many customers are we talking about?

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply