April 3, 2010 at 12:43 pm
Comments posted to this topic are about the item Working with Queues in SQL Server
April 4, 2010 at 10:36 pm
FYI - the OUTPUT clause is available in sql2k5 as well.
-a
April 5, 2010 at 2:36 am
Setup:
-- For the demo
USE tempdb;
GO
IF OBJECT_ID(N'dbo.WorkQueue', N'U')
IS NOT NULL
DROP TABLE dbo.WorkQueue;
IF OBJECT_ID(N'dbo.GetWorkItem', N'P')
IS NOT NULL
DROP PROCEDURE dbo.GetWorkItem;
IF OBJECT_ID(N'dbo.SetWorkItemComplete', N'P')
IS NOT NULL
DROP PROCEDURE dbo.SetWorkItemComplete;
-- Create table
CREATE TABLE dbo.WorkQueue
(
work_queue_id INTEGER IDENTITY(1,1) NOT NULL
PRIMARY KEY,
name VARCHAR(255) NOT NULL,
status_id TINYINT NOT NULL
CHECK (status_id IN (0, 1, 2)),
status_desc AS
ISNULL
(
CASE status_id
WHEN 0 THEN 'Unprocessed'
WHEN 1 THEN 'In progress'
WHEN 2 THEN 'Processed'
ELSE NULL
END,
'')
);
-- Add data to table
INSERT dbo.WorkQueue
(name, status_id)
SELECT 'A', 0 UNION ALL
SELECT 'B', 0 UNION ALL
SELECT 'C', 0 UNION ALL
SELECT 'D', 0 UNION ALL
SELECT 'E', 0 UNION ALL
SELECT 'F', 0 UNION ALL
SELECT 'G', 0 UNION ALL
SELECT 'H', 0 UNION ALL
SELECT 'I', 0 UNION ALL
SELECT 'J', 0 UNION ALL
SELECT 'K', 0;
GO
-- Index to help find unprocessed items
CREATE NONCLUSTERED INDEX nc1
ON dbo.WorkQueue (status_id);
Procedures:
CREATE PROCEDURE dbo.GetWorkItem
@WorkQueueID INTEGER OUTPUT
AS
BEGIN
SET NOCOUNT ON;
-- Allocate a work item and set it to in-progress
-- TODO: Add lock timeout and error handler
UPDATE Q
SET status_id = 1,
@WorkQueueID = work_queue_id
FROM (
SELECT TOP (1)
Q2.work_queue_id,
Q2.status_id
FROM dbo.WorkQueue Q2 WITH (UPDLOCK, READPAST, ROWLOCK)
WHERE Q2.status_id = 0
ORDER BY
Q2.work_queue_id ASC
) Q;
RETURN CASE WHEN @WorkQueueID IS NULL THEN 999 ELSE 0 END
END;
GO
CREATE PROCEDURE dbo.SetWorkItemComplete
@WorkQueueID INTEGER
AS
BEGIN
SET NOCOUNT ON;
-- Set the work item's status to complete
-- TODO: Add lock timeout and error handler
UPDATE Q
SET status_id = 2
FROM WorkQueue Q
WHERE Q.work_queue_id = @WorkQueueID
AND Q.status_id = 1;
RETURN CASE WHEN @@ROWCOUNT = 0 THEN 999 ELSE 0 END;
END;
GO
Test:
DECLARE @WorkItemID INTEGER,
@ReturnCode INTEGER;
-- Get a work item
EXECUTE @ReturnCode =
dbo.GetWorkItem @WorkItemID OUTPUT;
-- Show the record
SELECT rc = @ReturnCode,
*
FROM dbo.WorkQueue
WHERE work_queue_id = @WorkItemID;
-- Simulate processing delay
WAITFOR DELAY '00:00:03';
-- Set the work item to complete
EXECUTE @ReturnCode =
dbo.SetWorkItemComplete @WorkItemID;
-- Show the record
SELECT rc = @ReturnCode,
*
FROM dbo.WorkQueue
WHERE work_queue_id = @WorkItemID;
GO
DROP TABLE dbo.WorkQueue;
DROP PROCEDURE dbo.GetWorkItem;
DROP PROCEDURE dbo.SetWorkItemComplete;
Results:
rc work_queue_id name status_id status_desc
0 1 A 1 In progress
...
0 1 A 2 Processed
Allocation routine query plan: (compute scalars removed)
April 5, 2010 at 5:00 am
Much better article at http://rusanu.com/2010/03/26/using-tables-as-queues/
April 5, 2010 at 8:11 am
Hey folks. I recently built a system where users share a pool of work items and I too use a table-based queue. So I feel kind of familiar with some of the issues.
It seems to me that the proposed solution handles the initial concurrency problem well. There's no chance that two people will get the same open item in the queue. So it solves what I call the reservation problem. You want users to be able to reserve or lock an item.
However, another problem is getting disconnected. It's the "orphaned work item" issue. The calling app, if a web application, for instance - may drop the connection. Then a row may stay marked as processed/reserved and stay that way. In order to handle these situations, you either have to ensure that it doesn't happen up-front (sometimes not possible) - or have a way to identify rows not being worked by a user.
I've used the table-based method of reserving rows for a work queue with a web application front-end. There are occasions when a session end or the user gets disconnected. I handle the getting disconnected problem by programming the application "session end" event. In the event, I "un-mark" the row in question using the application. That works most of the time - but there are occasions when that event isn't called - and I then have to un-mark those rows another way.
Now, when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user. Due to this problem, I chose to do something slightly different than adding a flag column and using the query hints.
I created an additional history table for item reservations. I add a row when a work item is reserved by a user. It includes a session id, item id, expiration flag, start date and end date. The bi-temporal approach allows me to automatically "unlock" items within a specified period of time - such as a day.
It's actually nice to keep a history table for item reservations - and in some cases, it can help satisfy "chain of custody" requirements. It is also a way to handle the orphaned work item issue.
I welcome your thoughts and critiques. Thanks.
Bill Nicolich: www.SQLFave.com.
Daily tweet of what's new and interesting: AppendNow
April 5, 2010 at 8:19 am
One more comment: The use of READPAST should be carefully considered for two reasons:
1. If you need to preserve the order you must exclude it
2. If you are using READ_COMMITED_SNAPSHOT ON at the database level and The transaction isolation level of the session is READ COMMITTED.
* Noel
April 5, 2010 at 8:24 am
Bill Nicolich (4/5/2010)
...when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user.
A history table works well. For simple implementations, adding columns to the queue table to store the process/user identifier that owns the resource, and the time it acquired it, can work well (enough) too.
April 5, 2010 at 9:10 am
Paul White NZ (4/5/2010)
Bill Nicolich (4/5/2010)
...when that happens, there's the problem of identifying rows that have been marked as reserved but are not being handled by a current user.A history table works well. For simple implementations, adding columns to the queue table to store the process/user identifier that owns the resource, and the time it acquired it, can work well (enough) too.
When you match up a work item with a user, there are some considerations. What constitutes a user? In my case, a user was a unique session id from the web server. It wasn't a web farm situation. A user could have IE open with one session and FireFox open with another session - and these in my case should be handled separately. So if I just had just a "user id" matching things up in the table, I'd run into problems. So, that's why I added both a session id and a user id in the table.
Bill Nicolich: www.SQLFave.com.
Daily tweet of what's new and interesting: AppendNow
April 5, 2010 at 9:56 am
For the Demo,
If doing this in 2008 and you just use the update "with (READPAST), If you do not move the Delay to before the commit Both sessions will pick up the same record.
Eric.
April 5, 2010 at 9:59 am
when i first looked at the title articles, i was sure it talkes about queues, that work in FIFO - first in -> first out.
but when i saw the row
SELECT TOP(1)
@work_queue_id = work_queue_id, @name = name
FROM
work_queue
WHERE processed_flag = 0
it dosen't ensure me that i will get the first available row from the table!
maybe it will be better to work with the Queue that the sql provides you, rather then using a table, and getting into situation where you find your self with looked rows and so on like Bill said.
April 5, 2010 at 10:32 am
The solution simply illustrates in general what you can do to get around the problem. You can most certainly add ordering, user audits and resolutions to orphaned items. Then again, if you have a necessity to solve these issues in your application you might want to consider going with SQL Server Service Broker 🙂
April 5, 2010 at 11:48 am
Nice Article, and better discussion.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
April 5, 2010 at 3:38 pm
We used to do something similar for sending email here on the site. we had multiple machines that sent, and needed to read a database table to get their list of "things to send".
Our method of handling this was to update the next xx rows with a client machine name. This effectively removed the rows from the Q. Once the update was complete, the client could query for rows that matched it's name, but had not been processed.
Upon the completion of processing, the client updated each row to show it as processed and then started over with another update of new rows in the Q
April 6, 2010 at 12:49 am
Steve Jones - Editor (4/5/2010)
Our method of handling this was to update the next xx rows with a client machine name. This effectively removed the rows from the Q. Once the update was complete, the client could query for rows that matched it's name, but had not been processed.
This is exactly the pattern used at two places I have worked, and the basis for the code posted earlier in the thread. Of course all that was before the days of Service Broker, but with careful coding and consideration for the concurrency and locking issues involved, it can be made to work very well.
April 6, 2010 at 10:00 am
In my view the code fragment from Ahmad is missing a number of considerations that should be taken into account for a practical SQL table-based queueing systems:
* Polling SQL table based queues, as opposed to Service Broker, scales poorly. If you have multiple queues, or you need to keep the queue latency down, then Service Broker (or probably MSMQ) is by far your better answer. Assume 20 queues, and you want to keep the average enqueue time to process time below 50 milliseconds, you may be issuing 200 individual SQL requests a second, incurring a substantial constant overhead
* In a practical environment you should probably only COMMIT the change to the queue entry once you have fully processed the entry. Your SQL 2008 code specifically precludes doing this. If you COMMIT before processing the queue entry you need some way of tracking orphaned entries, which would include updating the queue entry with process identification, date/time, etc before COMMIT and a separately executing process to detect of orphaned entries and requeue them. This also raises the whole question of how long you can afford an orphaned entry to be orphaned for before it is detected, potential duplicate processing reversed, and finally processed
* Probably the most vexed issue in queue processing is poison messages; messages that, for whatever reason, your code cannot handle. You cannot realistically keep the status of the queue entry in the queue entry itself because you may ROLLBACK rather than COMMIT and process the entry repeatedlly. In my view the simplest answer is two connections; one is transacted and handles the queue, one is untransacted and keeps track of the status of the queue entry (number of retries, etc). In retrieving a queue entry you should retrieve both the queue entry and its associated status entry, and only retrieve entries which have not been re-processed excessively
* The READPAST locking hint may skip messages not only if the row is locked (busy being INSERTed) but also if the index is busy. The nett effect is that you may retrieve no messages even though there are messages in the queue, or you may process messages out-of-sequence. An associated issue is that performance may degrade and your apparent queue become less stable as the number of handled queue messages increases. A better approach may be to DELETE the queue entry on processing completion rather than mark it as processed, and keep a separate history log
So, in summary. Simple, yes. Realistic, not close
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply