There Must Be 15 Ways To Lose Your Cursors... part 1, Introduction

  • Jeff Moden (4/16/2009)

    Here's the code...

    [font="Courier New"]DECLARE @PrevActualCost MONEY,

            @RunningTotal   MONEY,

            @PrevID         INT

     SELECT @PrevActualCost = 0,

            @RunningTotal   = 0

     UPDATE t

        SET @RunningTotal = RunningTotal = @RunningTotal + ActualCost,

            @PrevID       = ID

       FROM #tmp t WITH (INDEX(0),TABLOCKX)

     OPTION (MAXDOP 1)

     SELECT TOP 1000 * FROM #tmp ORDER BY ID[/font]

    Jeff what is the @PrevActualCost variable for, since it doesn't appear to be used?

  • Matt Whitfield (4/14/2009)


    bruce.trimpop (4/14/2009)


    Bruce

    Here is the SQL i used on 2000 to re-create the data and tables - some mods were required, so hope you can check.

    Is it possible that you could post the cursor you use? I am having trouble understanding how some of the data interacts, and seeing the source would really help.

    My initial reaction would be that the table layout would benefit from a bit of a redesign in terms of being somewhere close to 3NF, which would probably have made the set based approach a lot easier in the first place. I presume that you cannot change table structures however, and will respect that in my solution!

    It does (need redesign) and I can't. :crying:

    Posting the cursor code probably won't help much since I stripped down the functionality to a bare minimum.

    The code does a great deal more than what I've posted and would probably confuse the matter even more. I'll try to describe critical part of the process, maybe that will help. :crazy:

    A client (dbo.client) receives services (dbo.service) for which a charge is calculated and stored in service.charge.

    The client is assessed a fee based on ability to pay (dbo.clientfee) that can be a flat dollar amount or a percentage of the charge. The fee can be determined in 1 of 2 ways. Either a flat amount which is stored in dbo.clientfee or a table (dbo.psfeetablemstr) driven amount based on some client data falling into a set of ranges assigned to that table (dbo.psfeetable). All of this is conditional based on the service date falling within effective date ranges for the client fee (dbo.clientfee) or the fee table (dbo.psfeetablemstr). Also conditional on whether the service is covered by the particular fee (dbo.clientfee or dbo.psfeetablemstr) as determined by service components and their possible values as set up in (dbo.limitcodes) Also conditional on whether the particular client fee is capped by number of services and/or dollar totals and whether the cap has been met. The crux of the problem is the fee table (dbo.psfeetablemstr) is user configurable to use any client related table and any column in that table as the value used to compare against the fee table ranges (dbo.psfeetable) to find what slot the client falls in and what fee to use. I could figure out no way of getting those table/column names out of the fee tables and used in a query without building a dynamic query to return the psfeetable

    row that contained the range that that client's data fell into. And given the unlimited number and in any combination of client fees I could figure no way of doing this without using a cursor.

    I guess it boils down to:

    "Is there a way to do a set based select statement that is based on the tables and columns used in the select being stored in a table?"

    example:

    If a psfeetablemstr row contains table1_vc = 'dbo.client', column1_vc = 'Income', table2_vc = 'dbo.sometable', column2_vc = 'somecolumn'

    from that I need a select that is something like:

    SELECT ft.value3

    FROM dbo.client t1

    JOIN dbo.sometable t2 ON t1.clientid = t2.clientid

    JOIN dbo.psfeetable ft ON t1.income >= ft.valuemin1 and t1.income = ft.valuemin2 and t2.somecolumn <= ft.valuemax2

    JOIN dbo.psfeetablemstr ftm ON ft.psfeetablemstrid = ftm.uniqueid

    JOIN dbo.clientfee cf ON t1.clientid = cf.clientid

    JOIN dbo.service s ON s.clientid = t1.clientid

    Keeping in mind that a psfeetablemstr row might contain 1 or 2 table/column values and might be different table/column values from row to row.

    I can't figure out how to do this other than with a cursor and dynamically building the select based on the table/column names stored in the

    psfeetablemstr table for that particular fee table.

    Actually I think I can pare down the example even more to get to the crux of my problem.

    CREATE TABLE [dbo].[client](

    [clientid] [int] not null,

    [income] [numeric] (12,2)

    )

    CREATE TABLE [dbo].[sometable](

    [clientid] [int] not null,

    [dependents] [int]

    )

    CREATE TABLE [dbo].[clientfee](

    [uniqueid] [int] NOT NULL,

    [clientid] [int] not null,

    [psfeetablemstrid] int,

    CONSTRAINT [pk_clientfee] PRIMARY KEY NONCLUSTERED

    ( [uniqueid] ASC )

    )

    CREATE TABLE [dbo].[psfeetablemstr](

    [uniqueid] [int] NOT NULL,

    [table1_vc] [varchar](20) NULL,

    [column1_vc] [varchar](20) NULL,

    [table2_vc] [varchar](20) NULL,

    [column2_vc] [varchar](20) NULL,

    CONSTRAINT [pk_psfeetablemstr] PRIMARY KEY NONCLUSTERED

    ( [uniqueid] ASC )

    )

    CREATE TABLE [dbo].[psfeetable](

    [uniqueid] [int] NOT NULL,

    [value1min_n] [numeric](12, 2) NOT NULL CONSTRAINT [df_psfeetable1] DEFAULT (0),

    [value1max_n] [numeric](12, 2) NOT NULL CONSTRAINT [df_psfeetable2] DEFAULT (0),

    [value2min_n] [numeric](12, 2) NULL CONSTRAINT [df_psfeetable3] DEFAULT (0),

    [value2max_n] [numeric](12, 2) NULL CONSTRAINT [df_psfeetable4] DEFAULT (0),

    [value3_n] [numeric](11, 3) NOT NULL CONSTRAINT [df_psfeetable5] DEFAULT (0),

    [psfeetablemstrid] [int] NOT NULL,

    CONSTRAINT [pk_psfeetable] PRIMARY KEY NONCLUSTERED

    ( [uniqueid] ASC )

    )

    ---Sample data

    ---dbo.client

    INSERT INTO [dbo].[client] ([clientid], [income])

    VALUES (1,25000)

    INSERT INTO [dbo].[client] ([clientid], [income])

    VALUES (2,10000)

    ---dbo.sometable

    INSERT INTO [dbo].[sometable] ([clientid], [dependents])

    VALUES (1,2)

    INSERT INTO [dbo].[sometable] ([clientid], [dependents])

    VALUES (2,1)

    --dbo.clientfee

    INSERT INTO [dbo].[clientfee] ([uniqueid], [clientid], [psfeetablemstrid])

    VALUES (1,1,1)

    INSERT INTO [dbo].[clientfee] ([uniqueid], [clientid], [psfeetablemstrid])

    VALUES (2,1,2)

    INSERT INTO [dbo].[clientfee] ([uniqueid], [clientid], [psfeetablemstrid])

    VALUES (3,2,1)

    INSERT INTO [dbo].[clientfee] ([uniqueid], [clientid], [psfeetablemstrid])

    VALUES (4,2,2)

    ---dbo.psfeetablemstr

    INSERT INTO [dbo].[psfeetablemstr] ([uniqueid], [table1_vc], [column1_vc], [table2_vc], [column2_vc])

    VALUES (1,'dbo.client','income','dbo.sometable','dependents')

    INSERT INTO [dbo].[psfeetablemstr] ([uniqueid], [table1_vc], [column1_vc], [table2_vc], [column2_vc])

    VALUES (2,'dbo.sometable','dependents',NULL,NULL)

    ---dbo.psfeetable

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (1,0,5000,1,1,10,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (2,5001,10000,1,1,20,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (3,10001,20000,1,1,30,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (4,20001,50000,1,1,40,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (5,50001,9999999,1,1,50,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (6,0,5000,2,3,7,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (7,5001,10000,2,3,17,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (8,10001,20000,2,3,27,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (9,20001,50000,2,3,37,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (10,50001,9999999,2,3,47,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (11,0,5000,4,999,5,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (12,5001,10000,4,999,15,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (13,10001,20000,4,999,25,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (14,20001,50000,4,999,35,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (15,50001,9999999,4,999,45,1)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (16,1,1,0,0,10,2)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (17,2,3,0,0,20,2)

    INSERT INTO [dbo].[psfeetable] ([uniqueid], [value1min_n], [value1max_n], [value2min_n], [value2max_n], [value3_n], [psfeetablemstrid])

    VALUES (18,4,999,0,0,30,2)

    From this for clientfee with uniqueid 1 I need psfeetable row with uniqueid 9 returned

    Reason: clientfee row with uniqueid 1 for client 1 uses psfeetablemstr uniqueid 1. psfeetablemstr uniqueid 1 row is based on income and dependents. client 1 income and dependents fall into the min/max ranges of psfeetable row 9 which is linked to psfeetablemstr uniqueid 1

    From this for clientfee with uniqueid 2 I need psfeetable row with uniqueid 17 returned

    Reason: clientfee row with uniqueid 2 for client 1 uses psfeetablemstr uniqueid 2. psfeetablemstr uniqueid 2 row is based on dependents, client 1 dependents fall into the min/max ranges of psfeetable row 17 which is linked to psfeetablemstr uniqueid 2

    From this for clientfee with uniqueid 3 I need psfeetable row with uniqueid 2 returned

    Reason: clientfee row with uniqueid 3 for client 2 uses psfeetablemstr uniqueid 1. psfeetablemstr uniqueid 1 row is based on income and dependents. client 2 income and dependents fall into the min/max ranges of psfeetable row 2 which is linked to psfeetablemstr uniqueid 1

    From this for clientfee with uniqueid 4 I need psfeetable row with uniqueid 16 returned

    Reason: clientfee row with uniqueid 4 for client 2 uses psfeetablemstr uniqueid 2. psfeetablemstr uniqueid 2 row is based on dependents, client 2 dependents fall into the min/max ranges of psfeetable row 16 which is linked to psfeetablemstr uniqueid 2

    All of this given that you only know dbo.client and dbo.sometable exist and the columns income and dependents exist by what is stored in psfeetablemstr

  • RBarryYoung (4/16/2009)


    dejan.kelemen (4/16/2009)


    Tnx, this is a great solution, didn't come to mind when I was writing the code (it was more classic "hurry up" approach 🙂 ), although I did wrote few cte's. I'll see if I can apply to the real code, there's much more messing about with data. Thanks once more.

    Thanks, dejan. Mind if I use your code as the basis for an example?

    Barry, I would be honored if you used my solution in your article ;-),,,,

    Dejan,

    Glad to see you understand the core of it (the row_number() function). RBarry gave his take on it a bit later, putting it in a subquery, so you can see there (http://www.sqlservercentral.com/Forums/FindPost698108.aspx) how it can be done without the CTE.

  • UMG Developer (4/16/2009)


    Jeff Moden (4/16/2009)

    Here's the code...

    [font="Courier New"]DECLARE @PrevActualCost MONEY,

            @RunningTotal   MONEY,

            @PrevID         INT

     SELECT @PrevActualCost = 0,

            @RunningTotal   = 0

     UPDATE t

        SET @RunningTotal = RunningTotal = @RunningTotal + ActualCost,

            @PrevID       = ID

       FROM #tmp t WITH (INDEX(0),TABLOCKX)

     OPTION (MAXDOP 1)

     SELECT TOP 1000 * FROM #tmp ORDER BY ID[/font]

    Jeff what is the @PrevActualCost variable for, since it doesn't appear to be used?

    My apologies and good catch. It is, in fact, not necessary. It's just an artifact from my first blush on the code and I forgot to delete it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • neufeldb4 (4/13/2009)


    Allow me to make some observations.

    ...If you are going to use an example, and you wish to have credibility, make your example a good one...

    To anyone who frequents SSC I highly doubt RBarryYoung's credibility is an issue.

    This is the type of problem I have seen in 100% of the articles like yours that claim the cursor is a thing to be scorned from the archaic past - none of them show a properly built cursor.

    What is a "properly built cursor" to you? For those who understand the SQL query engine and the optimizer, a "properly built cursor" is a cursor that's never built to begin with. AFAIC cursors have their place, and it's a very very tiny place. Cursors allow you to manually override any efficiency you might get out of the optimizer.

    Finally, If you can offer some advice as to how I can improve the quality of my work, I am only too happy to take it. If you insist on trivializing my work and insulting my intelligence, I will take note of that as well.

    You might not have noticed, but this is an introductory article to a series. If your work is revolves around cursors I can understand how it might be upsetting to be informed that there are better ways. Based on the ratings given to this article it would appear that the community at large finds the article entertaining and informative, and that the majority of readers don't seem to take it so personally.

    I think this is a good and entertaining introductory article for new developers who are used to procedural languages and have no concept of the power of set-based processing versus cursors. Now if you'll excuse me I need to go vote for this article.

  • Manie Verster (4/14/2009)


    Example: I have a client that dowloads their bankstatements from the internet and then upload it into our system. We then give them the opportunity to auto-allocate their bankstatement items to various accounts. This means the the stored proc that does that have to loop through every record and basically strip the text in a column to pieces to see where this needs to be allocated to. I done see that done in a set based procedure. Cursors might not be your best option but in some case they are the only way to go.

    Hi Manie,

    Depending on how accurate the description of your processing is, it sounds as if cursors aren't actually necessary in this case. This also sounds like it is something that might be better done outside of SQL (the string splitting, etc). Obviously I only have your description to go by.

  • Hello All,

    I do all of my work set based as I have NO idea how to use a cursor.

    I have some code left to my by my predecessor.

    It looks at a set of orders and checks that the components that are needed to fill the order are in stock, then it depletes the stock and updates the inventory etc.

    This is all done RBAR - I am sure that I could figure out how to update as a batch but then an entire batch of orders may fail even if we are only short but one product for one order.

    I can send the code if anyone wants a crack at it. Or is this one of those times when the cursor is necessary ( version 2000)

    Thanks

  • Actually, we will also need the DDL for the tables, some sample data, expected results based on the sample data, and you existing code.

  • Mike C (4/16/2009)


    neufeldb4 (4/13/2009)


    Allow me to make some observations.

    ...If you are going to use an example, and you wish to have credibility, make your example a good one...

    To anyone who frequents SSC I highly doubt RBarryYoung's credibility is not an issue.

    Mike! Great to hear from you again, 🙂 One question though the "not" above is a typo, right? 😀

    (ps., any chance that you would be available in June for a SIG talk in South Jersey?)

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • john.arnott (4/16/2009)


    RBarryYoung (4/16/2009)


    dejan.kelemen (4/16/2009)


    Tnx, this is a great solution, didn't come to mind when I was writing the code (it was more classic "hurry up" approach 🙂 ), although I did wrote few cte's. I'll see if I can apply to the real code, there's much more messing about with data. Thanks once more.

    Thanks, dejan. Mind if I use your code as the basis for an example?

    Barry, I would be honored if you used my solution in your article ;-),,,,

    Dejan,

    Glad to see you understand the core of it (the row_number() function). RBarry gave his take on it a bit later, putting it in a subquery, so you can see there (http://www.sqlservercentral.com/Forums/FindPost698108.aspx) how it can be done without the CTE.

    Sorry, john, I didn't mean to take credit for your work there. 🙂

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • WayneS (4/16/2009)


    Thomas (4/15/2009)


    Using a cross join is not itself counter-intuitive. Using it in this fashion to generate a bunch of numbers probably is to most developers.

    I think that this is one of the points that Jeff was making. Developers need to learn how to do set-based operations, so that when they see one it will be intuitive.

    [font="Verdana"]Thanks Wayne, you largely made the point I was wanting to make.

    While I get that row-by-row methods (loops in particular, not cursors which are actually quite arcane) are "simpler, more intuitive, easy to understand" for someone who has done much programming in a procedural language, that doesn't mean a set-based method is arcane.

    The whole point of SQL is to work with sets. Developers need to learn that. Complaining that it is "too complex" seems a bit counter-intuitive to me: isn't the point to learn the language? CROSS JOINs and TOP operators aren't particularly complex parts of SQL either.

    So for a developer who finds the set-based method "arcane", I'd regard that as an indication that they need to spend more time learning SQL.

    As an example of the same style of thinking, the equivalent is like saying "oh, I know that C# is object oriented, but all those objects are too arcane, so I only write procedures." At some point you need to learn a language for its strengths, not just use it to emulate another language.

    Development is a skill that has to be constantly honed. And learning the fundamentals of SQL isn't really too much to ask. SQL is inherently set-based. Just ask Joe Celko! 😀

    [/font]

  • RBarryYoung (4/16/2009)


    Sorry, john, I didn't mean to take credit for your work there. 🙂

    I didn't think you were. Was just being (too?) cute with the way I was pointing out to the OP that there were two samples to look at. I'll join others in this thread in thanking you and the other very experienced SQL experts for all you've done on SSC. I couldn't have answered that post with workable code a year ago, but since reading solutions posted in the forums as well as articles such as yours, I've been able to improve much of what our team does with SQL, both in my own work and by sharing techniques learned here with others. The focus of this web-site seems to be primarily DBA's, but as an application developer, I've found it invaluable.

  • Bruce W Cassidy (4/16/2009)


    WayneS (4/16/2009)


    Thomas (4/15/2009)


    Using a cross join is not itself counter-intuitive. Using it in this fashion to generate a bunch of numbers probably is to most developers.

    I think that this is one of the points that Jeff was making. Developers need to learn how to do set-based operations, so that when they see one it will be intuitive.

    [font="Verdana"]Thanks Wayne, you largely made the point I was wanting to make.

    While I get that row-by-row methods (loops in particular, not cursors which are actually quite arcane) are "simpler, more intuitive, easy to understand" for someone who has done much programming in a procedural language, that doesn't mean a set-based method is arcane.

    The whole point of SQL is to work with sets. Developers need to learn that. Complaining that it is "too complex" seems a bit counter-intuitive to me: isn't the point to learn the language? CROSS JOINs and TOP operators aren't particularly complex parts of SQL either.

    So for a developer who finds the set-based method "arcane", I'd regard that as an indication that they need to spend more time learning SQL.

    As an example of the same style of thinking, the equivalent is like saying "oh, I know that C# is object oriented, but all those objects are too arcane, so I only write procedures." At some point you need to learn a language for its stregths, not just use it to emulate another language.

    Development is a skill that has to be constantly honed. And learning the fundamentals of SQL isn't really too much to ask. SQL is inherently set-based. Just ask Joe Celko! 😀

    [/font]

    Spot on.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It is clear that you gentlemen are missing the point. That developers should strive to refine their craft is obvious. No one disputes that. That set-based solutions are preferred in database operations is also obvious. There is no great revelation there. What is being missed is that:

    A: There are times, which are admittedly more rare these days, when a set-based solution performs worse given the current platform and version than a cursor where you have more control over the resources in use. There are host of reasons why this might happen but it does happen.

    B: There are times, which hopefully we all will avoid, when you are required to execute logic on each row in a table (sorry, various columns in given resultset) in such a way that the current set based operations are incapable of handling. That scenario, again in my travels, has been very rare. Mostly, I have avoided such situations by putting less business logic (or none) in the data layer.

    There are many ways of doing things set-based some are more intuitive than others. As I said, cross joining to sys.columns is more than just about thinking set-based. It is about using the system tables in a clever way. If a developer were populating a list of numbers using a cross join of dozens of different production tables would you not find this odd (and worrisome)?

  • RBarryYoung (4/16/2009)


    Mike C (4/16/2009)


    neufeldb4 (4/13/2009)


    Allow me to make some observations.

    ...If you are going to use an example, and you wish to have credibility, make your example a good one...

    To anyone who frequents SSC I highly doubt RBarryYoung's credibility is not an issue.

    Mike! Great to hear from you again, 🙂 One question though the "not" above is a typo, right? 😀

    (ps., any chance that you would be available in June for a SIG talk in South Jersey?)

    Hi Barry!

    Sorry for the double-negative, I already corrected it 🙂 Depends on when in June, but I'm up for it. Message me offline and we can work out the details.

Viewing 15 posts - 226 through 240 (of 380 total)

You must be logged in to reply to this topic. Login to reply