Help with SQL algorithm to process transactions

  • There is absolutely no need to create a Tally table in Master. If you don't want to have that very useful tool in every database, you should make a Utililty database and then create a synonym in the other databases. If you're using an older version of SQL Server, a pass through view will suffice as a synonym.

    I quite agree that creating user tables in any of the system database is not a best practice as someone will then start adding application specific tables and then update these tables with business data. As master is always in simple recovery mode, these updates are not recoverable.

    I make one exception for the Tally table ( sometimes also named Numbers or Sequences,depending on the author ), as it is populated once and is never updated. I have been creating this table in the master database since 1993 under Sybase and since 1998 under SQL Server and have never had any problems.

    Do you know of any side-effect of creating tally in master ?

    SQL = Scarcely Qualifies as a Language

  • Carl Federl (6/22/2009)


    Do you know of any side-effect of creating tally in master ?

    Yep... other people see it and think it's ok. Sometimes, ya just gotta set the example.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Carl, see attached. Didn't change your logic of how to create the FIFO, just the logic of doing the running totals. Basically, it cuts the time in half when displaying to the screen and cuts the time down to 25% when writing the results to a new table.

    I did not verify if your logic is correct or not because I wanted to do this exactly the same way to show how bad the triangular joins comparatively are. The thing you can't see is that as rowcounts increase, the triangular joins grow exponentially worse.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Using my preferred measure of total_worker_time as reported by sys.dm_exec_query_stats:

    Triangular join: 10,218,749µs (~10.2 sec)

    Your method:

    Receipts running total: 41,015µs (~0.04 sec)

    Shipments running total: 240,234µs (~0.24 sec)

    Final SELECT: 4,425,781µs (~4.43 sec)

    Total 4,707,030µs (~4.7 sec)

    If the final SELECT is changed to write the results to a table, that figure drops from 4,425,781µs to 706,054 (~0.7 sec) for a grand total of 0.98 seconds.

    Awesome work.

    Paul

  • First of all, I wanted to thank all of you for the help. I reused the original portion of the script posted by Lutz in my real life project but I could not figure out an "elegant" way to pick from the transactions within a quarter without using the worst structure possible, meaning cursors. I have tested many different scenarios and so far it works. You probably are going to wonder about the script below and have a "WTF" (pardon me for the expression) moment but here it is. Feel free to pick on it 😉

    Again, the comments column is just there to help me debug the code. Thanks again for the help.

    declare @balancequarter table

    (

    quarterid int

    ,amount int

    )

    declare @transaction table

    (

    transactionid int

    ,quarterid int

    ,amount int

    )

    declare @transactionfinal table

    (

    transactionid int

    ,quarterid int

    ,amount int

    ,comments varchar(100)

    )

    insert into @transaction select 1, 1, 5

    insert into @transaction select 2, 1, 3

    insert into @transaction select 3, 1, 7

    insert into @transaction select 4, 1, -5

    insert into @transaction select 5, 1, 2

    insert into @transaction select 100, 2, 10

    insert into @transaction select 101, 2, 16

    insert into @transaction select 102, 2, -6

    insert into @transaction select 103, 2, 1

    insert into @balancequarter select quarterid, sum(amount) from @transaction group by quarterid

    -- debug code

    select * from @balancequarter

    --select * from @transaction

    declare @amounttoload int

    declare @totalbalance int

    set @amounttoload = 12

    set @totalbalance = (select sum(amount) from @balancequarter)

    declare @quarterid int

    declare @balance_currentquarter int

    declare @remainingbalance int

    set @remainingbalance = @amounttoload

    -- outer loop - the quarters

    declare quarter_cursor cursor for

    select quarterid, amount from @balancequarter order by quarterid

    open quarter_cursor

    fetch next from quarter_cursor into @quarterid, @balance_currentquarter

    while @@fetch_status = 0

    begin

    --select 'quarter', @quarterid, @balance_currentquarter

    declare @transactionid int

    declare @transactionamount int

    declare @runningtotal_quarter int

    set @runningtotal_quarter = 0

    declare innerloop_cursor cursor for

    select transactionid, amount from @transaction where quarterid = @quarterid and amount > 0 order by transactionid

    open innerloop_cursor

    fetch next from innerloop_cursor into @transactionid, @transactionamount

    while @@fetch_status = 0

    begin

    --select 'transaction', @transactionid, @transactionamount, @runningtotal_quarter, @runningtotal, @quarterid

    --select @remainingbalance, @transactionamount, @runningtotal_quarter + @transactionamount, @balance_currentquarter

    if(@remainingbalance > 0 and ((@runningtotal_quarter + @transactionamount) >= @balance_currentquarter))

    begin

    if(@remainingbalance > @transactionamount)

    begin

    if((@balance_currentquarter - @runningtotal_quarter) < @transactionamount)

    begin

    insert into @transactionfinal select @transactionid, @quarterid, (@balance_currentquarter - @runningtotal_quarter), 'done with this quarter... full transaction. remaining balance up to the quarter balance only.'

    set @remainingbalance = @remainingbalance - (@balance_currentquarter - @runningtotal_quarter)

    set @runningtotal_quarter = @runningtotal_quarter + (@balance_currentquarter - @runningtotal_quarter)

    end

    else

    begin

    insert into @transactionfinal select @transactionid, @quarterid, @transactionamount, 'done with this quarter... full transaction'

    set @remainingbalance = @remainingbalance - @transactionamount

    set @runningtotal_quarter = @runningtotal_quarter + @transactionamount

    end

    end

    else

    if(@remainingbalance @remainingbalance)

    begin

    insert into @transactionfinal select @transactionid, @quarterid, @remainingbalance , 'done with this quarter... partial transaction. remaining balance only.'

    set @remainingbalance = 0

    set @runningtotal_quarter = @runningtotal_quarter + @remainingbalance

    end

    else

    begin

    insert into @transactionfinal select @transactionid, @quarterid, (@balance_currentquarter - @runningtotal_quarter) , 'done with this quarter... partial transaction. remaining balance up to the quarter balance only.'

    set @remainingbalance = @remainingbalance - (@balance_currentquarter - @runningtotal_quarter)

    set @runningtotal_quarter = @balance_currentquarter

    end

    end

    break

    fetch next from quarter_cursor into @quarterid, @balance_currentquarter

    end

    else

    if(@remainingbalance > 0 and ((@runningtotal_quarter + @transactionamount) @transactionamount)

    begin

    insert into @transactionfinal select @transactionid, @quarterid, @transactionamount, 'keep going...'

    set @remainingbalance = @remainingbalance - @transactionamount

    set @runningtotal_quarter = @runningtotal_quarter + @transactionamount

    end

    else

    if(@remainingbalance <= @transactionamount)

    begin

    insert into @transactionfinal select @transactionid, @quarterid, @remainingbalance, 'done...'

    set @remainingbalance = 0

    set @runningtotal_quarter = @runningtotal_quarter + @remainingbalance

    fetch next from quarter_cursor into @quarterid, @balance_currentquarter

    end

    end

    fetch next from innerloop_cursor into @transactionid, @transactionamount

    end

    close innerloop_cursor

    deallocate innerloop_cursor

    fetch next from quarter_cursor into @quarterid, @balance_currentquarter

    end

    close quarter_cursor

    deallocate quarter_cursor

    select * from @transactionfinal

    "Any fool can write code that a computer can understand. Good programmers write
    code that humans can understand." -Martin Fowler et al, Refactoring: Improving the Design of Existing Code, 1999

  • Eichpeel (6/23/2009)


    First of all, I wanted to thank all of you for the help. I reused the original portion of the script posted by Lutz in my real life project but I could not figure out an "elegant" way to pick from the transactions within a quarter without using the worst structure possible, meaning cursors. I have tested many different scenarios and so far it works. You probably are going to wonder about the script below and have a "WTF" (pardon me for the expression) moment but here it is. Feel free to pick on it 😉

    Again, the comments column is just there to help me debug the code. Thanks again for the help.

    What did you do to the clean some 20 lines code I posted based on Jeff's article???????? :crying:

    Is it really necessary to blow it up to over hundred lines filled up with cursor and IF ELSE stuff???????? :sick:

    You provided almost the same sample data as before. Running your code it will get almost the identical result as the code I provided (except a different wording in the comments column).

    In order to make some progress please answer the following questions:

    1) What is the real issue you're struggling with?

    2) What do you mean by "to pick from the transactions within a quarter"? Example, please.

    3) Is it possible that you cannot change the table structure in your production system?

    4) Or is it possible that the data in your production system are in a different order than you provided with your sample data?

    Please answer the questions above so we're able to help you to get a more efficient code than you have right now.

    Note: I just read your signature below your sample code and started scratching my head... 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Lutz,

    I understand your concerns. I ran the original script you posted with the following set of transactions and found some issues related to transactions showing up in the table even though the max_amount

    insert into #transaction select 1, 1, 5,0

    insert into #transaction select 2, 1, 3,0

    insert into #transaction select 3, 1, 7,0

    insert into #transaction select 4, 1, -5,0

    insert into #transaction select 5, 1, 2,0

    insert into #transaction select 100, 2, 10,0

    insert into #transaction select 101, 2, 16,0

    insert into #transaction select 102, 2, -6,0

    insert into #transaction select 103, 2, 1,0

    it gives the following output.

    transactionidquarteridamountruntotalmax_amountusedtransactionamountcomment

    1155125fully used

    2138123fully used

    31715124partial transaction

    51212122partial transaction -- we have already maxed out the quarter with the previous transaction (12)

    100210102110fully used

    101216262111partial transaction

    1032121211partial transaction -- we have already maxed out the quarter with the previous transaction (21)

    Can you please explain why the 2 highlighted transactions are showing up even though the balance for the quarter has already been reached? Is there any way to exclude them or have a running total of the usedtransactionamount once they go over the balance for the quarter?

    I apologize if I am a slow to catch up.

    "Any fool can write code that a computer can understand. Good programmers write
    code that humans can understand." -Martin Fowler et al, Refactoring: Improving the Design of Existing Code, 1999

  • Sorry about the delay,but the NTFS decided to misbehave on my business laptop and spent yesterday getting windows re-installed.

    First, the term "Triangle Join" is a wonderfull term that easily conveys resources needs.

    The running totals technique you are using was previously used in the early 90s with Sybase, but stopped working when Sybase System X was released in about 1995 and the "@variable = column + @variable" syntax was no longer supported. For this reason, I do not use these types of undocumented features and also try to avoid proprietary features that are not in the ISO SQL standard.

    Below is the SQL using the ISO SQL standard that has been tested with SQLAnywhere 11. Run time was approximately the same as the variable solution. Under SQL Server, you can get an idea of the resource needs by changing "SUM(WharehouseProductReceipt.WharehouseProductReceiptQty)" to "RANK()". I found that the CPU times are about 30ms for a RANK and about 150 ms for the update modified to generate a RANK.

    SELECTWharehouseProductReceipt.WharehouseID

    ,WharehouseProductReceipt.ProductId

    ,WharehouseProductReceipt.WharehouseProductReceiptTs

    ,WharehouseProductReceipt.WharehouseProductReceiptQty

    ,SUM( WharehouseProductReceipt.WharehouseProductReceiptQty)

    OVER (PARTITION BY

    WharehouseProductReceipt.WharehouseID

    ,WharehouseProductReceipt.ProductId

    ORDER BY WharehouseProductReceipt.WharehouseID

    ,WharehouseProductReceipt.ProductId

    ,WharehouseProductReceipt.WharehouseProductReceiptTs

    )AS WharehouseProductReceiptQtyRT

    FROMWharehouseProductReceipt

    SQL = Scarcely Qualifies as a Language

  • Hi Eichpeel,

    the output is caused by the following sequence (example quarterid = 1):

    transactionidquarteridamountruntotalmax_amountusedtransactionamount

    1155125

    2138123

    31715124

    41-51012-5

    51212122

    Transactionid 3 exceeds the limit, and therefore is marked as "partial transaction".

    Note: Due to the WHERE condition all negative amounts are excluded in the script I provided originally, so transactionid (4) isn't displayed.

    Your statement "we have already maxed out the quarter with the previous transaction" for transactionid=5 is not true since the previous transactionid (4) reduces the amount back to 10, which is valid (or at least not marked as invalid in your original requirement).

    Within your original code there was no example on how to handle the situation where the amount is completely used.

    If you want to show 'fully used' when runtotal = max_amount simply change the CASE conditions from

    WHEN runtotal < bq.amount

    to

    WHEN runtotal 0 -- add just positive values

    THEN @PrevGrpBal + Amount

    WHEN quarterid = @PrevQrtID and Amount <=0 -- don't change running total for negative values

    THEN @PrevGrpBal

    ELSE Amount -- Restarts total at "0 + current amount"

    END

    --===== "Anchor" and provides for "account change detection"

    @PrevQrtID = quarterid

    FROM #transaction WITH (INDEX(IX_temptransaction_transactionid),TABLOCKX)

    -- end source

    SELECT t.*,

    bq.amount AS max_amount,

    CASE WHEN runtotal <= bq.amount THEN t.amount ELSE t.amount + bq.amount - runtotal END AS usedtransactionamount,

    CASE

    WHEN runtotal bq.amount AND t.amount + bq.amount > runtotal THEN 'partial transaction' ELSE 'already exceeded'

    END AS comment

    from #balancequarter bq inner join #transaction t

    on t.quarterid = bq.quarterid

    WHERE t.amount >0[/code]

    Including the addtl. condition to set the comment to 'already exceeded' you would get:

    transactionidquarteridamountruntotalmax_amountusedtransactionamountcomment

    1155125fully used

    2138123fully used

    31715124partial transaction

    5121712-3already exceeded

    100210102110fully used

    101216262111partial transaction

    103212721-5already exceeded

    As you can see it's very important to pay close attention when setting up sample data: If those data cannot be used to verify all conditions (and not all conditions are defined) you'll end up with a semi-functional code from your perspective, but with a full functional code from the perspective of the person willing to help you...

    If you need further assistance please let us know.

    Note: I still didn't use any cursor...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Carl Federl (6/23/2009)


    Sorry about the delay,but the NTFS decided to misbehave on my business laptop and spent yesterday getting windows re-installed.

    First, the term "Triangle Join" is a wonderfull term that easily conveys resources needs.

    The running totals technique you are using was previously used in the early 90s with Sybase, but stopped working when Sybase System X was released in about 1995 and the "@variable = column + @variable" syntax was no longer supported. For this reason, I do not use these types of undocumented features and also try to avoid proprietary features that are not in the ISO SQL standard.

    First of all this:

    UPDATE tablename SET

    @variablename = [i]columnname[/i] = somevalue

    is a documented feature of Microsoft SQL Server.

    Second, you may choose not to use proprietary features and only use ISO Standard SQL in you r coding, but that simply limits what you are capable of accomplishing in MS SQL Server.

    I work in a pure MS SQL Server environment, and I will use what ever tricks I can to get the most out of my SQL Server systems. If we find ourselves migrating to a different environment, I'll worry then about porting any database code at that time, but not before.

  • Paul White (6/23/2009)


    Jeff,

    Using my preferred measure of total_worker_time as reported by sys.dm_exec_query_stats:

    Triangular join: 10,218,749µs (~10.2 sec)

    Your method:

    Receipts running total: 41,015µs (~0.04 sec)

    Shipments running total: 240,234µs (~0.24 sec)

    Final SELECT: 4,425,781µs (~4.43 sec)

    Total 4,707,030µs (~4.7 sec)

    If the final SELECT is changed to write the results to a table, that figure drops from 4,425,781µs to 706,054 (~0.7 sec) for a grand total of 0.98 seconds.

    Awesome work.

    Paul

    Whoa! Thanks for the awesome testing and feedback!!!!! 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Lutz,

    This looks perfect! I will make sure in the future to try listing all possible scenarios so that the community can test them all. I really love the fact that there is no cursor left. Thank you very much.

    "Any fool can write code that a computer can understand. Good programmers write
    code that humans can understand." -Martin Fowler et al, Refactoring: Improving the Design of Existing Code, 1999

  • For this reason, I do not use these types of undocumented features

    Look up UPDATE in BOL... the SET @variable = column = expression IS documented.

    and also try to avoid proprietary features that are not in the ISO SQL standard.

    Heh... if you want to believe in the myth of truly portable code, you go right ahead:-P . I'm not going to be one of those folks that won't shift into 6th gear just because most cars only have 4 or 5. 😉

    If you're going to follow the ISO standards, then I recommend you go back to using cursors for running totals because they're faster than a triangular join and a pot wad more scalable.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Lynn Pettis (6/23/2009)


    First of all this:

    UPDATE tablename SET

    @variablename = [i]columnname[/i] = somevalue

    is a documented feature of Microsoft SQL Server. ...

    Hi Lynn,

    would you mind quoting the post you refer to (especially on a thread having two parallel discussions)?

    I've been rechecking my code from my post right above yours in order to figure out what I've done to get a reply like your's until I realized it was the post before mine you referred to... :doze:

    Edit: Thank you, Lynn! Appreciate it. 😀



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Eichpeel (6/23/2009)


    "Any fool can write code that a computer can understand. Good programmers write

    code that humans can understand." -Martin Fowler et al, Refactoring: Improving the Design of Existing Code, 1999

    I just noticed your tagline above... totally awesome and I'm a strong believer in that. Didn't know it was Fowler that said it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 31 total)

You must be logged in to reply to this topic. Login to reply