While Loop Much Slower than LOCAL FAST_FORWARD READ_ONLY Cursor

  • So after spending some time rewriting a routine that used Cursors to use While loops, I am having to go back and change them back to Cursors because the While loops are much, much slower. As an example a report that took 8 minutes to complete now takes 33 minutes. The sproc behind the report went from just under 8 to 30 minutes so it is all in the sproc.

    I can't really post the code because there is a lot going on but basically each row has to be compared to the previous row in order to do some calculations to get a daily "performance" of Assets/Accounts by some grouping. Hence the need to do a row by row solution. In using the while loops I made sure there were appropriate indexes for each of the where clauses used in the looping and the performance calculation code inside the while loops is basically identical to the code from the cursor loops. The result sets that are being looped are pretty large, sometimes a couple hundred thousand rows. I try to avoid cursors at all costs but I seem to have found a case where I can get around using them but at a significant performance cost. Wondering if anyone else has had similar situations.

    Robb

  • Actually replacing a c.u.r.s.o.r. with a loop is like choosing pest over cholera...

    Based on what you describe you don't need a RBAR solution.

    I'm confident Jeff Modens "Running Total" article[/url] will show you an alternative way that will perform waaayyyyyy better than what you currently have.

    If you need some assistance to speed up your code we'll be glad to assist you. But this would require you to provide some sample data, expected result and logic description, at least for the "big picture / basic concept".



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Robb,

    A while loop is no better, and frequently worse, than a c.u.r.s.o.r.

    From what you describe, Jeff's article (that Lutz mentioned above) should handle your situation. I've seen solutions based on > 1 million rows run in seconds.... so it can be done!

    If you need more assistance, just remember that the people that help out here are all un-paid volunteers. Providing the DDL scripts (CREATE TABLE, CREATE INDEX, etc.) for the tables affected, and INSERT statements to put some test data into those tables that shows your problem will go a long way in getting people to look at your issue and help you out. Please include code for what you have already tried. Don't forget to include what your expected results should be, based on the sample data provided. As a bonus to you, you will get tested code back. For more details on how to get all of this into your post, please look at the first link in my signature.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Robb Melancon (5/11/2010)


    basically each row has to be compared to the previous row in order to do some calculations to get a daily "performance" of Assets/Accounts by some grouping

    Unless the data on the previous row involves the data that you just calculated on the previous row, you might just be able to get away with a simple self join. However, there's no way of knowing for us because we have no example to go by. If you could create a simple 10 or 12 row example using "artificial" data with the real formula requirements, you'd probably be amazed at how many people can help you with some very high speed solutions.

    Now, don't just post data like everyone else does... take the time to do it the way the article at the first link in my signature line below shows.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for the replies, I will look into the "quirky update" method as a possible solution. The calculations are more involved than simple summing but I think it may work. I didn't post any pseudo code as it will take some time to strip it down and I wasn't sure how "active" this board was. I haven't really used it as a reference until recently. Now I know.

    As I try this method I will post updates with the results and/or more questions.

    Thanks again for the quick replies and for all of your time, if this works I owe you.

    Robb

  • Robb Melancon (5/11/2010)


    ... The calculations are more involved than simple summing ...

    Robb

    if you can do it to one row, even if you are doing it in multiple steps now, you can do it too lots of rows more efficiently.

    show us some details, I'm sure we can help;

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Robb Melancon (5/11/2010)


    Thanks for the replies, I will look into the "quirky update" method as a possible solution. The calculations are more involved than simple summing but I think it may work. I didn't post any pseudo code as it will take some time to strip it down and I wasn't sure how "active" this board was. I haven't really used it as a reference until recently. Now I know.

    As I try this method I will post updates with the results and/or more questions.

    Thanks again for the quick replies and for all of your time, if this works I owe you.

    Robb

    If you do end up using the "Quirky" Update, make sure that you follow ALL the rules listed near the end of the article. The rules are very simple but very strict.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok so everything is working... almost. The first problem I am running into is that I need to reset some values when an asset changes. Should make more sense when you see the example but basically I have calculations that are being done for assets/stocks over a period of times. So in the example I have 5 days of data for 4 different stocks, when the stock changes I need to reset some numbers back to 0. I am not able to flag when my asset changes. When run, status is always 'Same' when I am expecting 'Changed' when it changes from one asset to another. Here is the code

    create table #DMVRunningPerformance (

    LongPosition_bt bit null,

    AssetID_in int null,

    Symbol_vc varchar(255),

    DayID_in int,

    Status_vc varchar(255) null)

    CREATE CLUSTERED INDEX [12345asdlckj] ON #DMVRunningPerformance

    (

    [LongPosition_bt] ASC,

    [AssetID_in] ASC,

    [DayID_in] DESC

    )

    WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75000,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75001,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75002,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75003,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75004,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75000,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75001,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75002,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75003,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75004,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75000,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75001,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75002,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75003,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75004,NULL)

    declare @AssetID_in int=-1,

    @NDAssetID_in int=-1,

    @LongPosition_bt bit=0,

    @Status_vc varchar(255)=NULL

    update #DMVRunningPerformance

    set

    Status_vc=CASE WHEN @NDAssetID_in<>AssetID_in THEN 'Changed' ELSE 'Same' END,

    @AssetID_in=AssetID_in,

    @NDAssetID_in=AssetID_in

    FROM #DMVRunningPerformance OPTION (MAXDOP 1)

    select * from #DMVRunningPerformance

    drop table #DMVRunningPerformance

  • Robb Melancon (5/13/2010)


    Ok so everything is working... almost. The first problem I am running into is that I need to reset some values when an asset changes. Should make more sense when you see the example but basically I have calculations that are being done for assets/stocks over a period of times. So in the example I have 5 days of data for 4 different stocks, when the stock changes I need to reset some numbers back to 0. I am not able to flag when my asset changes. When run, status is always 'Same' when I am expecting 'Changed' when it changes from one asset to another. Here is the code

    create table #DMVRunningPerformance (

    LongPosition_bt bit null,

    AssetID_in int null,

    Symbol_vc varchar(255),

    DayID_in int,

    Status_vc varchar(255) null)

    CREATE CLUSTERED INDEX [12345asdlckj] ON #DMVRunningPerformance

    (

    [LongPosition_bt] ASC,

    [AssetID_in] ASC,

    [DayID_in] DESC

    )

    WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75000,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75001,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75002,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75003,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75004,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75000,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75001,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75002,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75003,NULL)

    insert into #DMVRunningPerformance

    values (1,101,'MSFT',75004,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75000,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75001,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75002,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75003,NULL)

    insert into #DMVRunningPerformance

    values (1,102,'AAPL',75004,NULL)

    declare @AssetID_in int=-1,

    @NDAssetID_in int=-1,

    @LongPosition_bt bit=0,

    @Status_vc varchar(255)=NULL

    update #DMVRunningPerformance

    set

    Status_vc=CASE WHEN @NDAssetID_in<>AssetID_in THEN 'Changed' ELSE 'Same' END,

    @AssetID_in=AssetID_in,

    @NDAssetID_in=AssetID_in

    FROM #DMVRunningPerformance OPTION (MAXDOP 1)

    select * from #DMVRunningPerformance

    drop table #DMVRunningPerformance

    You did a good job in implementing this.

    Change this one line in your update statement:

    @Status_vc = Status_vc=CASE WHEN @NDAssetID_in <> AssetID_in THEN 'Changed'

    ELSE 'Same' END,

    or, if you want to stick with two-part updates, change it to these two lines:

    @Status_vc = CASE WHEN @NDAssetID_in <> AssetID_in THEN 'Changed'

    ELSE 'Same' END,

    Status_vc = @Status_vc,

    Either one will get the results that you are looking for.

    Note that your update statement is okay for a temp table... BUT if you are going to be running this against a permanent table, then you need to change your from clause to:

    FROM #DMVRunningPerformance WITH (TABLOCKX)

    OPTION (MAXDOP 1)

    I always include that "WITH (TABLOCKX)" - just so that if I do convert it to a permanent table, I don't forget it. As Jeff mentioned, ALL of those rules at the end of the article need to be followed.

    Edit: @jeff: I'm actually learning these rules... first time I didn't have to look up the TABLOCKX!

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Thanks, that did it.

    So in the end the QUpdate is about 75% faster than the Cursor. So far my worst case cursors are only taking a couple minutes (there are other things that happen before the performance calculation that take the bulk of the reporting time) so the real impact is shaving about 1:30 off of report generation. So what other benefits do I get by getting rid of the cursors? I know scaleability concerns but I'm not sure that applies to our infastructure.

    Thanks again for all the help, if nothing else the code is a lot cleaner.

  • Robb Melancon (5/13/2010)


    So in the end the QUpdate is about 75% faster than the Cursor.

    Only 75%? I would have expected much better... closer to 90%. What else is your code doing? I bet we can make it faster!!!

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Here's the actual udpate statement:

    update #DMVRunningPerformance

    set

    @NewGroup_bt=CASE WHEN @AssetID_in<>AssetID_in or @LongPosition_bt<>LongPosition_bt THEN 1 ELSE 0 END,

    @DailyPerformanceNet_dc =CASE WHEN PerformanceMarketValue_fl = 0 or @NewGroup_bt=1 THEN 0

    ELSE (@NDPerformanceMarketValue_fl-@NDCashFlowNet_fl-PerformanceMarketValue_fl)

    /PerformanceMarketValue_fl

    END,

    @DailyPerformanceGross_dc = CASE WHEN PerformanceMarketValue_fl = 0 or @NewGroup_bt=1 THEN 0

    ELSE (@NDPerformanceMarketValue_fl-@NDCashFlowGross_fl-PerformanceMarketValue_fl)

    /PerformanceMarketValue_fl

    END,

    @DailyPerformanceNetGL_dc = CASE WHEN (PerformanceMarketValue_fl-IntradayGainLoss_fl) = 0 or @NewGroup_bt=1 THEN 0

    ELSE (@NDPerformanceMarketValue_fl-@NDIntradayGainLoss_fl-@NDCashFlowNet_fl-(PerformanceMarketValue_fl-IntradayGainLoss_fl))

    /(PerformanceMarketValue_fl-IntradayGainLoss_fl)

    END,

    @DailyPerformanceGrossGL_dc = CASE WHEN (PerformanceMarketValue_fl-IntradayGainLoss_fl) = 0 or @NewGroup_bt=1 THEN 0

    ELSE (@NDPerformanceMarketValue_fl-@NDIntradayGainLoss_fl-@NDCashFlowGross_fl-(PerformanceMarketValue_fl-IntradayGainLoss_fl))

    /(PerformanceMarketValue_fl-IntradayGainLoss_fl)

    END,

    @RunningPerformanceNet_dc = CASE WHEN @NewGroup_bt=1 THEN 0

    ELSE ((1 + @RunningPerformanceNet_dc) * (1 + @DailyPerformanceNet_dc)) - 1 END,

    @RunningPerformanceGross_dc = CASE WHEN @NewGroup_bt=1 THEN 0

    ELSE ((1 + @RunningPerformanceGross_dc) * (1 + @DailyPerformanceGross_dc)) - 1 END,

    @RunningPerformanceNetGL_dc = CASE WHEN @NewGroup_bt=1 THEN 0

    ELSE ((1 + @RunningPerformanceNetGL_dc) * (1 + @DailyPerformanceNetGL_dc)) - 1 END,

    @RunningPerformanceGrossGL_dc = CASE WHEN @NewGroup_bt=1 THEN 0

    ELSE ((1 + @RunningPerformanceGrossGL_dc) * (1 + @DailyPerformanceGrossGL_dc)) - 1 END,

    @NDPerformanceMarketValue_fl=PerformanceMarketValue_fl,

    @NDCashFlowNet_fl=CashFlowNet_fl,

    @NDCashFlowGross_fl=CashFlowGross_fl,

    @NDIntradayGainLoss_fl=CASE WHEN abs(IntradayGainLoss_fl) < 0.01 THEN 0 ELSE IntradayGainLoss_fl END,

    AccruedIncome_fl=CASE WHEN (AccruedIncome_fl>0 and PerformanceMarketValue_fl < AccruedIncome_fl)

    THEN -- rlm - erroneous data condition that causes performance to bomb

    0 ELSE AccruedIncome_fl

    END,

    AccruedIncomeOnCash_fl=CASE WHEN (AccruedIncomeOnCash_fl>0 and PerformanceMarketValue_fl < AccruedIncomeOnCash_fl)

    THEN 0 ELSE AccruedIncomeOnCash_fl

    END,

    DailyPerformanceNet_dc=@DailyPerformanceNet_dc,

    DailyPerformanceGross_dc=@DailyPerformanceGross_dc,

    DailyPerformanceNetGL_dc=@DailyPerformanceNetGL_dc,

    DailyPerformanceGrossGL_dc=@DailyPerformanceGrossGL_dc,

    RunningPerformanceGross_dc=@RunningPerformanceGross_dc,

    RunningPerformanceGrossGL_dc=@RunningPerformanceGrossGL_dc,

    RunningPerformanceNet_dc=@RunningPerformanceNet_dc,

    RunningPerformanceNetGL_dc=@RunningPerformanceNetGL_dc,

    @AssetID_in=AssetID_in,

    @LongPosition_bt=LongPosition_bt

    FROM #DMVRunningPerformance OPTION (MAXDOP 1)

  • So there is one more problem I am having with the update, my performance numbers are a day off. So here is an example of the results:

    HHDayID_inCFQtyCosstMVDailyPerfRunningPerf

    443275694013313.6960201048.898700

    443275693013313.696143.86201048.898700

    443275692013299.8420200841.56580.001032320.00103232

    443275691013299.84224.98200321.70430.0025951330.003630133

    443275690013299.2495.4201120.0467-0.003969482-0.000353759

    443275689013299.1220201779.8805-0.003270068-0.00362267

    443275688013299.1220201651.39170.000637183-0.002987795

    443275687013299.1220201651.39170-0.002987795

    Sorry for the format, wasn't sure the best way to post, but hopefully you can see the issue. The first two rows of daily and running performance are 0. There should only be a 0 in the first row and then numbers, ie DayID 75694 should have 0, 0 (daily, running) then 74693 should have 0.00103232, 0.00103232, 74692 should have 0.002595133, 0.003630133 etc. So the performance numbers need to be shifted up a row. All the over values Market Value (MV) Qty are on the right dayid row. Was thinking of doing a second update to do the shift but I don't think it can be done with the clustered index ordering. Any ideas?

    Thanks in advance,

    Robb

  • So here is code to create the results that I'm referring to:

    create table #DMVRunningPerformance (

    LongPosition_bt bit null,

    AssetID_in int null,

    Symbol_vc varchar(255),

    DayID_in int,

    Status_vc varchar(255) null,

    MV float,

    DailyPerf float null,

    RunningPerf float null)

    CREATE CLUSTERED INDEX [12345asdlckj] ON #DMVRunningPerformance

    (

    [LongPosition_bt] ASC,

    [AssetID_in] ASC,

    [DayID_in] DESC

    )

    WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75007,NULL,201048.8987,NULL,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75006,NULL,201048.8987,NULL,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75005,NULL,200841.5658,NULL,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75004,NULL,200321.7043,NULL,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75003,NULL,201120.0467,NULL,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75002,NULL,201779.8805,NULL,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75001,NULL,201651.3917,NULL,NULL)

    insert into #DMVRunningPerformance

    values (1,100,'IBM',75000,NULL,201651.3917,NULL,NULL)

    declare @AssetID_in int=-1,

    @NDAssetID_in int=-1,

    @LongPosition_bt bit=0,

    @Status_vc varchar(255)=NULL,

    @NewGroup_bt bit=1,

    @DailyPerformance float,

    @RunningPerformance float,

    @NDMarketValue float

    update #DMVRunningPerformance

    set

    @DailyPerformance=CASE WHEN @NewGroup_bt=1 THEN 0 ELSE

    (@NDMarketValue-MV)/MV END,

    @RunningPerformance=CASE WHEN @NewGroup_bt=1 THEN 0 ELSE

    ((1 + @RunningPerformance) * (1 + @DailyPerformance)) - 1 END,

    DailyPerf=@DailyPerformance,

    RunningPerf=@RunningPerformance,

    @NDMarketValue=MV,

    @NewGroup_bt=0,

    @AssetID_in=AssetID_in,

    @NDAssetID_in=AssetID_in

    FROM #DMVRunningPerformance OPTION (MAXDOP 1)

    select * from #DMVRunningPerformance

    drop table #DMVRunningPerformance

    So the issue is that I want to compare day 75007 to 75006 and store the results in 75007, then compare 75006 to 75005 and store in 75006. What is happening is that 75007 is compared to 75006 but stored in 75006 instead of 75007. So the performance results are correct but a day off. Trying to avoid having to roll through the results to shift the performance up a row.

  • So I think I have a solution that involves dropping the CI and creating a new one in DayID ASC order then just copying values from row 1 to 2, 2 to 3 etc. Not sure it's the best way but I can't think of any other way to shift the values up. Anyway, here is some code and it is not doing at all what I am expecting. I've tried to strip it down to a simple test case and I must be missing something fundamental because it doesn't do anything. I'm expecting daily and running values to move up a row.

    create table #DMVRunningPerformance (

    DayID_in int,

    DailyPerf float null,

    RunningPerf float null,

    NewGroup_bt bit)

    CREATE CLUSTERED INDEX [indexdown] ON #DMVRunningPerformance

    (

    [DayID_in] asc

    )

    WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

    insert into #DMVRunningPerformance

    values (75007,0,0,0)

    insert into #DMVRunningPerformance

    values (75006,1,1,0)

    insert into #DMVRunningPerformance

    values (75005,2,2,0)

    insert into #DMVRunningPerformance

    values (75004,3,3,0)

    declare @NewGroup_bt bit,

    @DayID_in int,

    @DailyPerf float,

    @RunningPerf float

    set @NewGroup_bt=1

    set @DailyPerf=0

    set @RunningPerf=0

    update #DMVRunningPerformance

    set

    @DayID_in=DayID_in, -- anchor column

    NewGroup_bt=@NewGroup_bt,

    DailyPerf=CASE WHEN @NewGroup_bt=1 THEN 0 ELSE @DailyPerf END,

    RunningPerf=CASE WHEN @NewGroup_bt=1 THEN RunningPerf ELSE @RunningPerf END,

    @DailyPerf=DailyPerf,

    @RunningPerf=RunningPerf,

    @NewGroup_bt=0

    FROM #DMVRunningPerformance OPTION (MAXDOP 1)

    select * from #DMVRunningPerformance order by DayID_in desc

    drop table #DMVRunningPerformance

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply