Cursors and concurrency

  • Hi guys,

    I have a stored procedure which contains a cursor to loop through a queue (The queue is just a plain table).

    Each item in the queue has a Status and is only picked up by the cursor if the Status is 'PENDING'. For each 'PENDING' row, the cursor will loop through and update the status to 'EXECUTING' and then 'SUCCESSFUL' upon completion.

    Also, each item in the queue has a dependency on the item before it so the execution will first check to see if the preceding item has successfully been executed.

    Here is the cursor section of my stored proc:

    DECLARE @ID int

    DECLARE @status varchar(50)

    DECLARE @CommandText VARCHAR(200)

    DECLARE @command varchar(200)

    DECLARE myCursor CURSOR FAST_FORWARD FOR SELECT A.ID, A.Status, A.CommandText, A.PreviousQueueID

    FROM dbo.Queue A

    WHERE A.Status = 'PENDING'

    ORDER BY A.ID

    OPEN myCursor

    FETCH NEXT FROM myCursor INTO @ID

    ,@Status

    ,@CommandText

    ,@PreviousQueueID

    WHILE @@FETCH_STATUS = 0

    BEGIN

    IF @status = 'PENDING'

    BEGIN

    IF @PreviousQueueId >= 0

    BEGIN

    IF EXISTS( SELECT 1

    FROM dbo.Queue

    WHERE ID = @PreviousQueueId

    AND Status = 'SUCCESSFUL')

    BEGIN

    UPDATE dbo.Queue

    SET Status = 'EXECUTING'

    WHERE ID = @ID

    BEGIN TRY

    SET @command = 'EXEC' + @CommandText

    EXEC SP_EXECUTESQL @command

    UPDATE dbo.Queue

    SET Status = 'SUCCESSFUL'

    WHERE ID = @ID

    END TRY

    BEGIN CATCH

    UPDATE dbo.Queue

    SET Status = 'FAILED'

    WHERE ID = @ID

    END CATCH

    END

    END

    END

    FETCH NEXT FROM myCursor INTO@ID

    ,@Status

    ,@CommandText

    ,@PreviousQueueID

    END

    CLOSE myCursor

    DEALLOCATE myCursor

    My problem is about concurrency and how to manage any race conditions when say there are 2 sessions that are both executing this stored proc. It could end up that the same item will get executed twice, once by each session.

    For example, there are 3 items in the queue. The following scenrio could happen:

    Session1: Cursor picks up all 3 items and update 1st item to EXECUTING

    Session2: Picks up only 2 items since the 1st one is already in EXECUTING status.

    Session1: Completes exectution of 1st item and sets status to SUCCESSFUL. Fetches the 2nd item, tests if 1st item is successful, and starts executing 2nd item.

    Session2: Inspects 2nd item and tests to see if its prior item has been successfully completed. Since 1st item has SUCCESSFUL status, then session2 will also start executing 2nd item.

    As you can see, 2nd item is being executed twice which I don't want.

    How can the above scenario be prevented? I can't figure out if I need to do some sort of locking or not.

    Please help.

    Thanks.

  • First of all you can have a look at your cursor type. FAST_FORWARD means your cursor will fetch all rows in one go, then to feed them one row at a time to your code. You will need some sort of a DYNAMIC cursor so that with each fetch the actual state of the table in the database is re-read.

    But better yet, re-think your process logic: you are not processing a list of jobs in your procedure, instead you are just endlessly processing the first available job from the queue. In other words, don't use a cursor at all. Just code it as a 'while 1 = 1' (or while exists(<some condition>)) that tries to update status on the first available, -non-locked- job from the queue table. Once your procedure manages to update the status, it can start processing that row. I would suggest to use an update statement like this to both fetch the job to process and lock it:

    update top (1) q

    set

    @jobid = q.jobid,

    status = 'processing'

    from dbo.queue q with (readpast)

    where q.status = 'queued'

    This takes (any) job from the queue that has status 'queued', puts an update lock on it to avoid other instances of the same procedure to work on the same job, sets the status to 'processing', all at the same time. One disadvantage is that update does not allow for an 'order by' clause, i.e. that's why I stated any job was selected. You can do some fancy tricks with sub queries though to achieve the ordering. All I wanted to point out here is that you shouldn't use a cursor, but use seperate -exclusively locking- statements for retrieving the queue entry to process.



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • Hi,

    You can avoid concurrency issues by using UPDLOCK and READPAST inside a transaction, and not using a cursor. The following code illustrates the technique:

    USE tempdb

    GO

    CREATE TABLE dbo.QueueStatus

    (

    status_idINTEGER NOT NULL

    CONSTRAINT [PK dbo.QueueStatus status_id]

    PRIMARY KEY,

    status_descVARCHAR(10) COLLATE Latin1_General_Bin2 NOT NULL

    CONSTRAINT [UQ dbo.QueueStatus status_desc]

    UNIQUE

    )

    GO

    INSERT dbo.QueueStatus

    (status_id, status_desc)

    VALUES

    (0, 'Pending'),

    (1, 'Executing'),

    (8, 'Successful'),

    (9, 'Failed')

    CREATE TABLE dbo.Queues

    (

    queue_idINTEGER IDENTITY PRIMARY KEY,

    status_idINTEGER NOT NULL

    DEFAULT (0),

    CONSTRAINT [FK dbo.Queues dbo.QueueStatus status_id]

    FOREIGN KEY (status_id)

    REFERENCES dbo.QueueStatus (status_id),

    commandNVARCHAR(MAX) NOT NULL,

    last_queue_idINTEGER NULL

    CONSTRAINT [FK dbo.Queues queue_id]

    FOREIGN KEY (last_queue_id)

    REFERENCES dbo.Queues (queue_id)

    )

    GO

    CREATE NONCLUSTERED INDEX [IX dbo.Queues status_id = 0]

    ON dbo.Queues (status_id) INCLUDE (last_queue_id, command)

    CREATE PROCEDURE dbo.ProcessQueue

    AS

    SET NOCOUNT, XACT_ABORT ON

    DECLARE @queue_idINTEGER,

    @commandNVARCHAR(MAX)

    WHILE (1 = 1)

    BEGIN

    BEGIN TRANSACTION

    ;

    WITH NextQueueItem AS

    (

    SELECT TOP (1)

    q.queue_id,

    q.status_id,

    q.command

    FROM dbo.Queues AS q WITH (UPDLOCK, READPAST)

    WHERE

    q.status_id = 0 -- Pending

    AND EXISTS

    (

    -- No previous step

    SELECT 1

    WHERE

    q.last_queue_id IS NULL

    UNION ALL

    -- Or previous was successful

    SELECT 1

    FROM dbo.Queues AS prev WITH (READPAST)

    WHERE

    prev.queue_id = q.last_queue_id

    AND prev.status_id = 8-- Successful

    )

    ORDER BY

    q.queue_id

    )

    UPDATE NextQueueItem

    SET status_id = 1,-- Executing

    @queue_id = NextQueueItem.queue_id,

    @command = NextQueueItem.command

    IF@@ROWCOUNT = 0

    BEGIN

    -- Queue was empty, exit the WHILE loop

    COMMIT TRANSACTION

    BREAK

    END

    BEGIN TRY

    RAISERROR ('Processing queue item #%i, command = %s', 0, 1, @queue_id, @command)

    -- Run the queued command

    EXECUTE sys.sp_executesql

    @statement = @command

    -- Success

    UPDATE dbo.Queues

    SET status_id = 8

    WHERE

    queue_id = @queue_id

    COMMIT TRANSACTION

    END TRY

    BEGIN CATCH

    -- Rollback if transaction is doomed

    IFXACT_STATE() = -1

    ROLLBACK TRANSACTION

    DECLARE

    @Message NVARCHAR(2048) = ERROR_MESSAGE(),

    @Number INTEGER = ERROR_NUMBER()

    -- Record failure

    UPDATE dbo.Queues

    SET status_id = 9-- Failed

    WHERE

    queue_id = @queue_id

    -- If the transaction is still open, commit it

    IF XACT_STATE() = 1

    COMMIT TRANSACTION

    RAISERROR('Error %i: %s', 16, 1, @Number, @Message)

    END CATCH

    END-- WHILE 1 = 1

    -- Some test data

    INSERT dbo.Queues

    (command, last_queue_id)

    VALUES

    (N'SELECT table_count = COUNT_BIG(*) FROM sys.tables', DEFAULT),

    (N'SELECT error = 1/0', DEFAULT),

    (N'SELECT procedure_count = COUNT_BIG(*) FROM sys.procedures', 1)

    GO

    -- Show queue entries

    SELECT

    q.queue_id,

    q.status_id,

    qs.status_desc,

    q.command,

    q.last_queue_id

    FROM dbo.Queues AS q

    JOIN dbo.QueueStatus AS qs ON

    qs.status_id = q.status_id

    GO

    -- Process the queue

    EXECUTE dbo.ProcessQueue

    GO

    -- Show queue entries

    SELECT

    q.queue_id,

    q.status_id,

    qs.status_desc,

    q.command,

    q.last_queue_id

    FROM dbo.Queues AS q

    JOIN dbo.QueueStatus AS qs ON

    qs.status_id = q.status_id

    GO

    -- Tidy up

    DROP TABLE

    dbo.Queues,

    dbo.QueueStatus

    DROP PROCEDURE

    dbo.ProcessQueue

    GO

  • Thanks for the suggestions guys,

    but I'm wondering with the WHILE 1 = 1 method, wouldn't this mean that once the stored procedure is called, it would never return? So would the idea behind this be that the stored procedure is called once and once only? (or else you'd get an ever increasing number of infinitely executing stored procs!)

    Also R.P.Rozema, when you say by using DYNAMIC cursor, the 'state' of the table is re-read on each fetch, by 'state' do you mean whether the table is locked?

    Having a quick read through SQL BOL, I'm not sure I quite understand the difference between using the DYNAMIC vs FAST_FORWARD cursor. Can they be used together or only 1 or the other?

  • elogen (9/7/2011)


    but I'm wondering with the WHILE 1 = 1 method, wouldn't this mean that once the stored procedure is called, it would never return? So would the idea behind this be that the stored procedure is called once and once only? (or else you'd get an ever increasing number of infinitely executing stored procs!)

    Have you tried running the code sample I wrote for you?

  • Hi Paul,

    I've just tried your code, and yes it's returning..I missed the BREAK part....

    thanks

  • elogen (9/7/2011)


    Also R.P.Rozema, when you say by using DYNAMIC cursor, the 'state' of the table is re-read on each fetch, by 'state' do you mean whether the table is locked?

    Having a quick read through SQL BOL, I'm not sure I quite understand the difference between using the DYNAMIC vs FAST_FORWARD cursor. Can they be used together or only 1 or the other?

    I am sorry, I shouldn't have mentioned the dynamic cursor: I only intended to steer you towards the solution presented in detail by Paul. i.e. to use a while loop (re-)querying for one row at a time and not use a cursor at all. I highly recommend using the while loop method over using a cursor. However, you definitely should one day re-read BOL on the different types of cursors as they behave significantly different (and you can indeed only use one of the flags on the same cursor). However, none of the available cursor types are suited for what you need now, so you better leave that for later.

    One small improvement on Paul's method is to use the update statement for retrieving the row to process plus setting it's new status in one go. Doing a select with lock hints (readpast, updlock) does not result in the exact same un-divide-able lock pattern and can lead to slow performance.



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • This is not my day :doze:. You can forget about my improvement: Paul excellently implemented that already too...

    I'm not sure though whether the lock hint on the select in the cte is needed, or even whether it actually does anything in there. Does anyone know for sure whether it is needed or not?



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • R.P.Rozema (9/7/2011)


    I'm not sure though whether the lock hint on the select in the cte is needed, or even whether it actually does anything in there. Does anyone know for sure whether it is needed or not?

    It is needed to make it robust; specifically under a heavy concurrent workload: SQL Server contains an optimization to take U locks when reading (locating rows that qualify for the update), promoting them to X locks when the update physically occurs.

    Unfortunately, this only avoids deadlocks when the reading and writing plan operators use the same access method. If the read-cursor side of the plan reads from a non-clustered index (and takes key U locks on that), and the update is performed by a heap update or clustered index update iterator (taking X locks there) the scheme falls apart.

    The way to avoid deadlocks where the access method differs between read-cursor and write-cursor is to take U locks explicitly on the read side of things. It is still possible to deadlock in this situation if the table is a heap and the access method is a heap scan, but that's getting rather too deep into it, and should not be a common scenario in practice (my example provides a non-clustered index for this purpose).

  • Thanks for the explanation, Paul.



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply