Finding difference of cumulative values in a table

  • Date GrpID RemAmt Amt

    2000-11-28 1111 1000 300

    2000-12-28 1111 1000 200

    2000-11-28 2222 3000 500

    2000-12-28 2222 3000 100

    Date GrpID RemAmt Amt

    2000-11-28 1111 700 300

    2000-12-28 1111 500 200

    2000-11-28 2222 2500 500

    2000-12-28 2222 2400 100

    I have data stored in a table in the above format.The requirement is to update this table to the one below

    ie.., for each GrpID RemAmt = RemAmt - Amt.

    eg: for GrpID 111 for the 1st date RemAmt = 1000 - 300 i.e,700

    for the next date it should be RemAmt = 700 - 200 i.e, 500

    Tried using cursor for the same, But somehow it does not seem to be working.

    It would be great if somebody could help out.

  • Ineffecient, but should work

    UPDATE mytable

    SET RemAmt = RemAmt - (SELECT SUM(a.Amt) FROM mytable a WHERE a.GrpID=mytable.GrpID AND a.Date<=mytable.Date)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • create table #tmp(Date datetime,GrpID int,RemAmt money, Amt money)

    insert #tmp Values ('2000-11-28',1111,1000,300)

    insert #tmp Values ('2000-12-28',1111,1000,200)

    insert #tmp Values ('2000-11-28',2222,3000,500)

    insert #tmp Values ('2000-12-28',2222,3000,100)

    Select * from #tmp

    Update A Set RemAmt=RemAmt-CumAmt

    From

    (

    Select

    Date,GrpID,RemAmt,Amt,

    (Select SUM(Amt) from #tmp A Where A.GrpID=B.GrpID and A.date<=B.Date) as CumAmt

    from

    #tmp B

    )A

    Select * from #tmp

    ..

  • Strange problem you have, I'm having a hard time thinking of what the real life application of this might be. I would have expected inserts and not updates...but anyway...

    I'm not hot on CTEs, but I think you may have a solution lying there for you, alternatively, I would look at 2 cursors that do the following:

    Cursor 1 iterates the grpIDs

    Cursor 2 iterates rowids(I dont see any in your data) ordered by date and populates variable used in calc logic

    Calc Logic executes and updates row

    Cursor 2 end

    Cursor 1 end

    Just my two cents

  • Very nice sanoj!

  • Mark,Sanoj

    Thank you for the response..

    Tried out both your queries.But the issue i'm facing with the above queries is that if the date is same, the value calculated is not right

    for eg: if the first 2 dates are 2000-11-28, the RemAmt comes out as 500 for the first 2 rows, instead of 700 and 500

  • rekha_sara (10/21/2009)


    Mark,Sanoj

    Thank you for the response..

    Tried out both your queries.But the issue i'm facing with the above queries is that if the date is same, the value calculated is not right

    for eg: if the first 2 dates are 2000-11-28, the RemAmt comes out as 500 for the first 2 rows, instead of 700 and 500

    You'll need to explain how to order rows when they share the same date.

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • There is no particular order which has to be followed if both the amounts are made on the same date.It can be taken in any order

    To shed some light into what I'm doing we are bringing in data daily from one db to our db,

    and we are calculating the values based on this table.

    I hope I haven't confused you..

  • rekha_sara (10/21/2009)


    There is no particular order which has to be followed if both the amounts are made on the same date.It can be taken in any order

    To shed some light into what I'm doing we are bringing in data daily from one db to our db,

    and we are calculating the values based on this table.

    I hope I haven't confused you..

    WITH CTE AS (

    SELECT *,ROW_NUMBER() OVER(PARTITION BY GrpID ORDER BY Date) AS rn

    FROM mytable)

    UPDATE CTE

    SET RemAmt = RemAmt - (SELECT SUM(a.Amt) FROM CTE a WHERE a.GrpID=CTE.GrpID AND a.rn<=CTE.rn)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • /*

    If there are more than one records with the same date, How do you find which transaction happend first? do you have any SrNo/InvoiceNo in the Table?. The following query leaves it to SQL Server for getting the order in case there are more than one Date.

    */

    create table #tmp(Date datetime,GrpID int,RemAmt money, Amt money)

    insert #tmp Values ('2000-11-28',1111,1000,300)

    insert #tmp Values ('2000-12-28',1111,1000,200)

    insert #tmp Values ('2000-11-28',2222,3000,500)

    insert #tmp Values ('2000-12-28',2222,3000,100)

    insert #tmp Values ('2000-12-28',2222,3000,200)

    Select * from #tmp

    go

    With tmp as

    (

    Select Date,GrpID,RemAmt,Amt,row_number() Over(Partition by GrpID Order by Date)SrNo from #tmp

    )

    Update A Set RemAmt=RemAmt-CumAmt

    From

    (

    Select

    Date,GrpID,RemAmt,Amt,

    (Select SUM(Amt) from tmp A Where A.GrpID=B.GrpID and A.SrNo<=B.SrNo) as CumAmt

    from

    tmp B

    )A

    go

    Select * from #tmp

    ..

  • Mark,Sanoj..

    Thank you!! The O/p is exactly what I wanted..But the problem is in reality, I'm dealing with more than 400,000 records and the query is going on more than 30 min

    and I had to finally cancel it.

  • You can use the "quirky update" method for this kind of problem

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • rekha_sara (10/21/2009)


    Mark,Sanoj..

    Thank you!! The O/p is exactly what I wanted..But the problem is in reality, I'm dealing with more than 400,000 records and the query is going on more than 30 min

    and I had to finally cancel it.

    Heh... I wondered when that was going to happen with the code. Anything that uses an inequality like <= for an aggregate forms a triangular join that will eat the face off you server in the presence of any decent row count. See the following URL for why...

    http://www.sqlservercentral.com/articles/T-SQL/61539/

    The article you were pointed to has been rewritten and won't be republished until 10 Nov. I'll try to get to this tonight after work... the "Quirky Update" method will do this problem in less than 7 seconds on a million rows. A cursor will certainly work but it will take about 8 MINUTES. The triangular join method you're using would probably finish sometime next week if it doesn't manage to crash the server first.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You can update the Values using the quirky update as follows.

    create table #tmp(Date datetime,GrpID int,RemAmt money, Amt money)

    insert #tmp Values ('2000-11-28',1111,1000,300)

    insert #tmp Values ('2000-12-28',1111,1000,200)

    insert #tmp Values ('2000-11-28',2222,3000,500)

    insert #tmp Values ('2000-12-28',2222,3000,100)

    insert #tmp Values ('2000-12-28',2222,3000,50)

    insert #tmp Values ('2000-12-29',2222,3000,75)

    insert #tmp Values ('2000-12-30',1111,1000,50)

    insert #tmp Values ('2000-12-20',2222,3000,30)

    create clustered index CLIDXTMP ON #tmp(GrpID,Date)

    Declare

    @CumAmt money,

    @GrpID int

    Set @CumAmt=0

    Select * from #tmp

    update #tmp set

    @CumAmt=(CASE WHEN @GrpID<>GrpID THEN Amt ELSE @CumAmt+Amt END),

    RemAmt=RemAmt-@CumAmt,

    @GrpID=GrpID

    Select * from #tmp

    But remember that this trick of update is not documented by microsoft and a future patch can break your code

    ..

  • Just a word of caution... I strongly recommend that you use OPTION (MAXDOP 1) in order to prevent any parallelism during the "Quirky Update' which would destroy the procedural nature of the update. It's also helpful (even on Temp Tables) to use the WITH (TABLOCKX) table hint just to save a bit on row to table lock escalation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply