Avoiding cursors

  • Hi,

    I need to create stored procedure which accepts a value. This value is then set as the maximum total value. A number of rows are then selected from a table where the value in column A is totaled up until it is not greater than the input value. How would I do this without using a cursor?

    Many thanks

    Reet

  • Reet,

    You'll undoubtably get several suggestions using a self-joined correlated sub-query which are usually pretty darned slow.  They'll usually take about 29 seconds to do 10,000 records and the duration get's exponentially worse as the number of records increase.  In fact, a cursor or WHILE LOOP is sometimes faster (on larger tables) than these methods.  The code for the correlated sub-queries is, however, very simple and works great for very small and tiny tables.

    Now, if you not an ANSI purist, don't mind using a temp table, and you have a large number of rows in the table (or just want some flat out high speed performance), then try the following...

    First, let's make a test table to simulate your table... we'll make it with 100,000 rows just so it's a good test (I tested with 4 MILLION rows and it [the solution code, not the test table build code] ran in just over a minute)...

    --============================================================================

    -- Create a test table

    --============================================================================

    --===== If the test exists, drop it so we can rerun if we need to

         IF OBJECT_ID('dbo.jbmRunTotTest') IS NOT NULL

            DROP TABLE dbo.jbmRunTotTest

    --===== Suppress auto-display of rowcounts for appearance and speed

        SET NOCOUNT ON

    --===== Create and populate the jbmRunTotTest table on the fly

     SELECT TOP 100000 --< Change this number to as large as 4000000 for load test

            IDENTITY(INT,1,5) AS PK,

            CAST(RAND(CAST(NEWID() AS VARBINARY))*50+50 AS INT) AS ColA

       INTO dbo.jbmRunTotTest

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

      ALTER TABLE dbo.jbmRunTotTest

        ADD CONSTRAINT PK_jbmRunTotTest_ColA PRIMARY KEY CLUSTERED (PK)

    --===== Allow the general public to use it

      GRANT ALL ON dbo.jbmRunTotTest TO PUBLIC

    ...and now, let's solve the problem... the code is fairly well documented so I'll just let you get to it...

    --============================================================================

    -- Return all PK's (other cols returned for proof) that have a running total

    -- less than some predetermined number

    --============================================================================

    --===== If the working temp table already exists, drop it for reruns

         IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

            DROP TABLE #Temp

    --===== Clear any cached data so we can get a true measure of how fast this is

       DBCC DROPCLEANBUFFERS

       DBCC FREEPROCCACHE

    --===== Declare local variables

    DECLARE @StartTime DATETIME --< Just to measure run duration

        SET @StartTime = GETDATE()

    DECLARE @MaxVal INT

        SET @MaxVal = 100000 --< This is the predetermined max running total

    DECLARE @RunTot BIGINT   --< This let's us calc the running total

        SET @RunTot = 0

    --===== Create and populate the temporary working table on the fly

     SELECT IDENTITY(INT,1,1) AS RowNum,

            CAST(PK AS INT) AS PK, --< The Primary Key Col from your table

            ColA, --< The column that contains the value from your table

            CAST(0 AS BIGINT) AS RunTotal --< To be calculated

       INTO #Temp --< The new temporary working table

       FROM jbmRunTotTest t1 --< Our test table (sub your table name here)

      ORDER BY ColA --< Rows with smallest values get lowest RowNums

    --===== Add a new clustered Primary Key to preserve the order during the update

         -- because UPDATE doesn't like an ORDER BY

      ALTER TABLE #Temp

        ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== Update the temporary working table with the running total in RowNum order

         -- using SQL Server's proprietary UPDATE statement.

     UPDATE t1

        SET @RunTot = RunTotal = @RunTot+t1.ColA

       FROM #Temp t1

    --===== Select all the rows where the running total is less than the predetermined max

     SELECT *

       FROM #Temp

      WHERE RunTotal < @MaxVal

    --===== All done, display the duration

      PRINT CONVERT(VARCHAR(12),DATEADD(ms,DATEDIFF(ms,@StartTime,GETDATE()),0),114)

          + ' Run Duration (hh:mm:ss:mil)'

    Of course, if you are using SQL Server 2005, they have some pretty slick functions to do running totals.  I don't have 2005 to test with on my home machine otherwise, I'd show you an example.  Maybe someone who does have access to 2005 could post a solution...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply