Running Balance Update

  • I have the table below and a few rows as an example:

    CREATE TABLE ledger
    (
        row_no  int,
        acct_id varchar(255),
        debit   numeric(28,2),
        credit  numeric(28,2),
        balance numeric(28,2)
    )

    row_no | acct_id  | tran_desc       | debit  | credit | balance
    -------|----------|-----------------|--------|--------|--------
    0      | Account1 | Opening Balance | NULL   | NULL   | 1000.00
    0      | Account2 | Opening Balance | NULL   | NULL   | 500.00
    1      | Account1 | SOBI            | 0.00   | 200.00 | NULL
    2      | Account1 | SNP             | 300.00 | 0.00   | NULL
    1      | Account2 | SNP             | 100.00 | 200.00 | NULL

    I need to update the balance column as a running balance per acct_id, calculated as previous balance + debit - credit, for rows having row_no > 0:

    row_no | acct_id  | tran_desc       | debit  | credit | balance
    -------|----------|-----------------|--------|--------|--------
    0      | Account1 | Opening Balance | NULL   | NULL   | 1000.00
    0      | Account2 | Opening Balance | NULL   | NULL   | 500.00
    1      | Account1 | SOBI            | 0.00   | 200.00 | 800.00
    2      | Account1 | SNP             | 300.00 | 0.00   | 1100.00
    1      | Account2 | SNP             | 100.00 | 200.00 | 400.00

    I have achieved this using a WHILE loop (by including an sno column in the table) and also with a cursor, but both take a long time to process over 9,500 rows. A rough sketch of the loop is below.

    Is there any quicker way to achieve this?
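
    Roughly, the loop I'm using looks like the sketch below (simplified; the variable names and the sno-driven stepping shown here are just to illustrate the approach, not the exact code):

    -- Simplified sketch of the row-by-row (WHILE loop) approach: step through the rows
    -- in sno order and carry forward the most recent balance for the same account.
    DECLARE @sno      INT,
            @acct_id  VARCHAR(255),
            @prev_bal NUMERIC(28,2);

    SELECT @sno = MIN(sno) FROM dbo.ledger WHERE row_no > 0;

    WHILE @sno IS NOT NULL
    BEGIN
        SELECT @acct_id = acct_id FROM dbo.ledger WHERE sno = @sno;

        -- previous balance = most recent non-NULL balance for the same account
        SELECT TOP (1) @prev_bal = balance
          FROM dbo.ledger
         WHERE acct_id = @acct_id
           AND sno < @sno
           AND balance IS NOT NULL
         ORDER BY sno DESC;

        UPDATE dbo.ledger
           SET balance = @prev_bal + ISNULL(debit,0) - ISNULL(credit,0)
         WHERE sno = @sno;

        SELECT @sno = MIN(sno) FROM dbo.ledger WHERE sno > @sno AND row_no > 0;
    END;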

  • Quick suggestion: check the articles on this page: Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5

    😎

  • vikramchander90 (4/26/2015)


    I have achieved this using a WHILE loop (by including an sno column in the table) and also with a cursor, but both take a long time to process over 9,500 rows. Is there any quicker way to achieve this?

    Yes... as Eirikur has suggested, there is a MUCH quicker way that will update a million rows in about 3 seconds. But, before I can show you how to do it, I absolutely need to know what you currently have for the PK of the table AND what you have for the Clustered Index of the table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi,

    Currently I do not have a primary key defined for the table, but I have created an index on the row_no and acct_id columns, roughly like the sketch below.
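
    For clarity, the index looks something like this (the index name and key order here are just illustrative):

    -- Illustrative shape of the existing nonclustered index; the name and key order are assumptions.
    CREATE NONCLUSTERED INDEX IX_ledger_acct_row
        ON dbo.ledger (acct_id, row_no);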

  • vikramchander90 (4/26/2015)


    Hi,

    Currently I do not have a primary key defined for the table, but I have created an index on the row_no and acct_id columns.

    Does the Row_No column have only unique values in it? And, since it IS a ledger table, does it have a DATETIME column in it, as well?

    --Jeff Moden



  • Row_no will be unique only within each acct_id; it is not unique across the whole table. I have an sno column which is unique overall. Yes, there are a few more columns in the table that I have not mentioned here.

  • vikramchander90 (4/26/2015)


    Row_no will be unique only within each acct_id; it is not unique across the whole table. I have an sno column which is unique overall. Yes, there are a few more columns in the table that I have not mentioned here.

    I need to know the name of the DATETIME column that tracks when the transaction became "real".

    --Jeff Moden



  • tran_date is the name of the datetime column.

    All the columns present in the table are:

    sno          int,
    row_no       int,
    company_code varchar(255),
    tran_date    datetime,
    acct_id      varchar(255),
    acct_name    varchar(255),
    tran_desc    varchar(4000),
    name         varchar(255),
    tran_id      varchar(255),
    srv_doc_id   varchar(255),
    srv_type     varchar(255),
    debit        numeric(28,2),
    credit       numeric(28,2),
    balance      numeric(28,2)

  • Hi Jeff,

    If anything else is required from my side, please let me know.

  • My apologies. I've been working on two major projects at work and this slipped through the cracks. I'll try to get to this after work tonight.

    --Jeff Moden



  • No issues Jeff. Thanks for your help.

  • I was at work until 2 AM this morning, but I did manage to work on your project a bit. I never provide answers for something as important as ledger data without a million-row test table.

    With that in mind, please run the following in your test database and let me know if the data is close enough to what you might expect in your real ledger table. Let me know either way. I'll be back tonight.

    This is the table I built which, of course, is missing some of the non-pertinent columns. There MUST be a UNIQUE clustered index on the sno column or this will take longer than a cursor when we update the balance from a temp table... a lot longer. This will also help your non-clustered indexes a bit. I also had to make some assumptions as to the nullability of some of the columns. I've also renamed the table just to prevent any accidents on the real table.

    CREATE TABLE dbo.TestLedger
    (
         sno       INT NOT NULL
        ,row_no    INT NOT NULL
        ,tran_date DATETIME NOT NULL
        ,acct_id   VARCHAR(255) NOT NULL
        ,tran_desc VARCHAR(4000)
        ,debit     NUMERIC(28,2)
        ,credit    NUMERIC(28,2)
        ,balance   NUMERIC(28,2)
    )
    ;

    ALTER TABLE dbo.TestLedger
        ADD CONSTRAINT PK_TestLedger PRIMARY KEY CLUSTERED (sno)
    ;

    And here's the test data generator. It creates a million rows with the expected logical order of transaction dates versus the sno column, like it might be in real life. It takes about 22 seconds to build on my laptop.

    --===== Generate the test data and store it into the test table
    WITH
    cteGenTrans AS
    (
        SELECT TOP (1000000)
                tran_date = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,'2010','2016')+CAST('2010' AS DATETIME)
               ,acct_id   = 'Account'+RIGHT(ABS(CHECKSUM(NEWID()))%200+10001,4)
          FROM sys.all_columns ac1
         CROSS JOIN sys.all_columns ac2
    )
    ,cteGenData AS
    (
        SELECT  row_no    = ROW_NUMBER() OVER (PARTITION BY acct_id ORDER BY tran_date)-1
               ,tran_date
               ,acct_id
               ,debit     = CASE WHEN (ABS(CHECKSUM(NEWID()))%10+1)<10 THEN RAND(CHECKSUM(NEWID()))*200 ELSE NULL END
               ,credit    = CASE WHEN (ABS(CHECKSUM(NEWID()))%15+1)=15 THEN RAND(CHECKSUM(NEWID()))*2000+500 ELSE NULL END
          FROM cteGenTrans
    )
    INSERT INTO dbo.TestLedger
            (sno, row_no, tran_date, acct_id, tran_desc, debit, credit)
    SELECT  sno       = ROW_NUMBER() OVER (ORDER BY tran_date,row_no)
           ,row_no
           ,tran_date
           ,acct_id
           ,tran_desc = CASE WHEN row_no = 0 THEN 'Opening Balance' ELSE 'Whatever' END
           ,debit     = CASE WHEN row_no = 0 THEN NULL ELSE debit END
           ,credit    = CASE WHEN row_no = 0 OR debit IS NULL THEN RAND(CHECKSUM(NEWID()))*1500+500 ELSE credit END
      FROM cteGenData
     ORDER BY sno
    ;

    --===== Let's see what we've got
    SELECT * FROM dbo.TestLedger ORDER BY sno
    ;
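
    If you want a quick sanity check on the generated data, something like this (a minimal sketch) confirms that every account got exactly one 'Opening Balance' row and shows how the transactions spread out per account:

    --===== Optional sanity check: one opening-balance row per account
         -- and a plausible spread of transactions per account.
    SELECT  acct_id
           ,rows_per_acct = COUNT(*)
           ,opening_rows  = SUM(CASE WHEN row_no = 0 THEN 1 ELSE 0 END)
           ,min_tran_date = MIN(tran_date)
           ,max_tran_date = MAX(tran_date)
      FROM dbo.TestLedger
     GROUP BY acct_id
     ORDER BY acct_id
    ;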

    --Jeff Moden



  • Hi Jeff,

    I made a small change to your query: for row_no 0, the debit and credit columns will be NULL.

    This data is similar to the data that will be available.

    --===== Generate the test data and store it into the test table
    WITH
    cteGenTrans AS
    (
        SELECT TOP (1000000)
                tran_date = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,'2010','2016')+CAST('2010' AS DATETIME)
               ,acct_id   = 'Account'+RIGHT(ABS(CHECKSUM(NEWID()))%200+10001,4)
          FROM sys.all_columns ac1
         CROSS JOIN sys.all_columns ac2
    )
    ,cteGenData AS
    (
        SELECT  row_no    = ROW_NUMBER() OVER (PARTITION BY acct_id ORDER BY tran_date)-1
               ,tran_date
               ,acct_id
               ,debit     = CASE WHEN (ABS(CHECKSUM(NEWID()))%10+1)<10 THEN RAND(CHECKSUM(NEWID()))*200 ELSE NULL END
               ,credit    = CASE WHEN (ABS(CHECKSUM(NEWID()))%15+1)=15 THEN RAND(CHECKSUM(NEWID()))*2000+500 ELSE NULL END
          FROM cteGenTrans
    )
    INSERT INTO dbo.TestLedger
            (sno, row_no, tran_date, acct_id, tran_desc, debit, credit, balance)
    SELECT  sno       = ROW_NUMBER() OVER (ORDER BY tran_date,row_no)
           ,row_no
           ,tran_date
           ,acct_id
           ,tran_desc = CASE WHEN row_no = 0 THEN 'Opening Balance' ELSE 'Whatever' END
           ,debit     = CASE WHEN row_no = 0 THEN NULL ELSE debit END
           ,credit    = CASE WHEN row_no = 0 THEN NULL ELSE CASE WHEN debit IS NULL THEN RAND(CHECKSUM(NEWID()))*1500+500 ELSE credit END END
           ,balance   = CASE WHEN row_no = 0 THEN RAND(CHECKSUM(NEWID()))*1500+500 ELSE credit END
      FROM cteGenData
     ORDER BY sno
    ;

    --===== Let's see what we've got
    SELECT * FROM dbo.TestLedger ORDER BY sno
    ;

  • Ok, here we go. I've added the changes you pointed out to the million-row test data generator for the 0-row balance thing, and I made it so that some of the running balances at the beginning of some accounts come up negative, just to prove that it also works with negative numbers. Here's the final test harness I used. I did not add any indexes to the table because we're doing table scans anyway. Adding a clustered index on sno to the ledger table only shaved about a second off the whole shebang.

    --===== Since we're conditionally dropping test tables,
         -- do this in a nice, safe place that everyone has.
    USE tempdb
    ;
    --===== If the test table already exists, drop it
         -- to make reruns in SSMS easier.
    IF OBJECT_ID('tempdb.dbo.TestLedger','U') IS NOT NULL
        DROP TABLE tempdb.dbo.TestLedger
    ;
    --===== Create the test table
    CREATE TABLE dbo.TestLedger
    (
         sno       INT NOT NULL
        ,row_no    INT NOT NULL
        ,tran_date DATETIME NOT NULL
        ,acct_id   VARCHAR(255) NOT NULL
        ,tran_desc VARCHAR(4000)
        ,debit     NUMERIC(28,2)
        ,credit    NUMERIC(28,2)
        ,balance   NUMERIC(28,2)
    )
    ;
    --===== Generate the test data and store it into the test table
    WITH
    cteGenTrans AS
    (
        SELECT TOP (1000000)
                tran_date = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,'2010','2016')+CAST('2010' AS DATETIME)
               ,acct_id   = 'Account'+RIGHT(ABS(CHECKSUM(NEWID()))%200+10001,4)
          FROM sys.all_columns ac1
         CROSS JOIN sys.all_columns ac2
    )
    ,cteGenData AS
    (
        SELECT  row_no    = ROW_NUMBER() OVER (PARTITION BY acct_id ORDER BY tran_date)-1
               ,tran_date
               ,acct_id
               ,debit     = CASE WHEN (ABS(CHECKSUM(NEWID()))%10+1)<10 THEN RAND(CHECKSUM(NEWID()))*200 ELSE NULL END
               ,credit    = CASE WHEN (ABS(CHECKSUM(NEWID()))%60+1)=60 THEN RAND(CHECKSUM(NEWID()))*2000+500 ELSE NULL END
          FROM cteGenTrans
    )
    INSERT INTO dbo.TestLedger
            (sno, row_no, tran_date, acct_id, tran_desc, debit, credit, balance)
    SELECT  sno       = ROW_NUMBER() OVER (ORDER BY tran_date,row_no)
           ,row_no
           ,tran_date
           ,acct_id
           ,tran_desc = CASE WHEN row_no = 0 THEN 'Opening Balance' ELSE 'Whatever' END
           ,debit     = CASE WHEN row_no = 0 THEN NULL ELSE debit END
           ,credit    = CASE WHEN row_no = 0 THEN NULL ELSE CASE WHEN debit IS NULL THEN RAND(CHECKSUM(NEWID()))*1500+500 ELSE credit END END
           ,balance   = CASE WHEN row_no = 0 THEN RAND(CHECKSUM(NEWID()))*1500+500 ELSE NULL END
      FROM cteGenData
     ORDER BY sno --This doesn't actually matter but is here to simulate the real heap.
    ;

    Here's the code that does the running balance. Despite the fact that it's basically copying the important columns of the 1 million row table and then doing a join to update the million rows in the original table, it still only takes 9 or 10 seconds to do the account-grouped running totals on all 1 million rows.

    As is usual with me, the explanation is in the comments in the code.

    --===== If the work table already exists, drop it
         -- to make reruns in SSMS easier.
    IF OBJECT_ID('tempdb..#WorkTable','U') IS NOT NULL
        DROP TABLE #WorkTable
    ;
    WITH
    ctePreCalc AS
    ( --=== This just precalculates the transaction amount on each row so we can sort by the amount.
      --    Banks always regard credits before debits on datetime ties.
        SELECT  sno
               ,row_no
               ,acct_id
               ,tran_date
               ,Amount = ISNULL(credit,0)-ISNULL(debit,0)
               ,Balance
          FROM dbo.TestLedger
    ) --=== This does the actual sort to create an incrementing "safety" number to make sure that
      --    calculations are actually processed in the correct order.
    SELECT  SafetyNum = ISNULL(ROW_NUMBER() OVER (ORDER BY acct_id, tran_date, Amount DESC),0)
           ,sno
           ,row_no
           ,acct_id
           ,tran_date
           ,Amount
           ,Balance
      INTO #WorkTable
      FROM ctePreCalc
    ;
    --===== The addition of this clustered index is critical.
    ALTER TABLE #WorkTable
        ADD PRIMARY KEY CLUSTERED (SafetyNum)
    ;
    --===== Create some obviously named variables
    DECLARE  @SafetyNum INT
            ,@PrevBal   NUMERIC(28,2)
    ;
    --===== Preset those variables.
         -- Yes, I know. Could have done this in the DECLARE but wanted 2005 users
         -- to be able to run this without having to make changes, as well.
    SELECT  @SafetyNum = 1
           ,@PrevBal   = 0
    ;
    --===== This is a "Quirky Update". There are a fair number of rules that we
         -- apply to this to force it to work and report if MS ever does something
         -- to break it. Do NOT remove or change the table hints or the MAXDOP
         -- because we have to force the clustered index to be used and, since
         -- running totals are serial in nature, we have to force serial, single
         -- CPU ops with MAXDOP 1. This works just like any running total in C#
         -- or VB might work except you can't see the loop (Pseudo-Cursor) that
         -- the UPDATE makes in the background.
    UPDATE tgt
       SET  @PrevBal = balance = CASE WHEN SafetyNum = @SafetyNum
                                      THEN CASE
                                               WHEN row_no = 0 THEN balance
                                               ELSE @PrevBal + Amount
                                           END
                                      ELSE 'Out of order condition'
                                 END
           ,@SafetyNum = @SafetyNum + 1
      FROM #WorkTable tgt WITH (TABLOCKX, INDEX(1))
    OPTION (MAXDOP 1)
    ;
    --===== Once we've made it through the running balance calculation without error,
         -- this does the final update to the original Ledger table.
    UPDATE ldgr
       SET balance = wt.balance
      FROM dbo.TestLedger ldgr
      JOIN #WorkTable wt
        ON ldgr.sno = wt.sno
    ;

    It only takes about 85ms to process 9500 rows.

    Shifting gears a bit, I'm not one to trust someone else's row numbers so I didn't order the running total by row_no. A bank would always do it by date and time. You might be able to guarantee the row_no column for each account is temporally correct but I couldn't take that chance for you. To be honest, I suggest you don't take that chance either.
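
    If you happen to be on SQL Server 2012 or later, a windowed running SUM makes a handy independent cross-check of the Quirky Update result (a minimal sketch, assuming the #WorkTable built above still exists and has already been updated):

    --===== Optional cross-check (SQL Server 2012+ only): recompute the running balance
         -- with a windowed SUM and count any rows that disagree with the Quirky Update.
    WITH cteCheck AS
    (
        SELECT  sno
               ,Balance
               ,WindowedBalance = SUM(CASE WHEN row_no = 0 THEN Balance ELSE Amount END)
                                      OVER (PARTITION BY acct_id
                                                ORDER BY SafetyNum
                                                 ROWS UNBOUNDED PRECEDING)
          FROM #WorkTable
    )
    SELECT MismatchCount = COUNT(*)
      FROM cteCheck
     WHERE Balance <> WindowedBalance
    ;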

    --Jeff Moden



  • Hi Jeff,

    Sorry for late reply and thanks a lot for your reply. Your solution worked out perfectly which helped reduce from hours to seconds. Thanks a lot.
