Best way to handle a table-based queue?

  • NOTE: This is NOT a Service Broker question. 

    The table defined below is accessed by concurrent jobs that select the next available item and process it. The processing logic may be simple or very complex, depending on the item type, but let's ignore that for the purpose of this question. Item availability is determined by checking for a NULL SessionKey.

    CREATE TABLE ItemQueue(
      ItemQueueId INT IDENTITY PRIMARY KEY,
       ItemId INT NOT NULL,
       ItemType TINYINT NOT NULL,
       SessionKey UNIQUEIDENTIFIER NULL
    );
    CREATE INDEX IX_QueueSessionKey ON ItemQueue(SessionKey);

    Any solution has to avoid deadlocks and ensure that no two jobs ever select the same item. I'm thinking of two methods, but you're welcome to suggest alternatives as long as it's all plain SQL (i.e. no SSB, SSIS, CLR, etc.).

    Method 1: Select the first row you can get a lock on, process the selected item and update the SessionKey, all in one transaction.
     
    DECLARE @itemQueueId INT;

    BEGIN TRANSACTION;

    SELECT TOP 1 @itemQueueId = ItemQueueId
    FROM  ItemQueue WITH ( UPDLOCK, READPAST )
    WHERE SessionKey IS NULL
    ORDER BY ItemQueueId;

    UPDATE ItemQueue SET SessionKey = NEWID() WHERE ItemQueueId = @itemQueueId;

    COMMIT TRANSACTION;
    /*
    .
    process selected item
    on separate transaction 
    .
    */

    Method 2: Update one row with a locally generated SessionKey, get the ItemId for the updated row and process that item. (an OUTPUT clause could be used too, same idea). 
     
    DECLARE @sessionKey UNIQUEIDENTIFIER = NEWID();
    DECLARE @itemQueueId INT;

    BEGIN TRANSACTION;

    UPDATE TOP(1) ItemQueue
    SET SessionKey = @sessionKey
    WHERE SessionKey IS NULL;

    SELECT @itemQueueId = ItemQueueId 
    FROM ItemQueue
    WHERE SessionKey = @sessionKey;

    COMMIT TRANSACTION;
    /*
    .
    process selected item
    on separate transaction 
    .
    */
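    The OUTPUT-clause variant of Method 2 could be sketched like this; it claims a row and returns its key in a single atomic statement, so no explicit transaction is needed (the @claimed table variable is my own name):

    ```sql
    -- Sketch of the OUTPUT-clause variant of Method 2.
    -- The UPDATE claims one available row and emits its key atomically.
    DECLARE @claimed TABLE (ItemQueueId INT);

    UPDATE TOP (1) ItemQueue
       SET SessionKey = NEWID()
    OUTPUT inserted.ItemQueueId INTO @claimed
     WHERE SessionKey IS NULL;

    -- Zero rows here means the queue was empty.
    SELECT ItemQueueId FROM @claimed;
    ```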

    QUESTION: Which method is better? (you decide what "better" means)

  • In either case, unless "process selected item" is for all practical purposes trivial and not at all time-consuming, I'd keep the updates as separate transactions, and I'd prefer the method that updates the new session key value before anything else happens.   That assumes you also commit that transaction before doing any other work, however trivial (or not).   It would be "better" in terms of the level of concurrency you can obtain, which may or may not be important in your situation.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Good point, I updated my post so the processing is done on a separate transaction in both cases. That way I think the complexity of the work to be done becomes irrelevant.

  • Sam-263310 - Tuesday, June 6, 2017 9:53 AM

    QUESTION: Which method is better? (you decide what "better" means)

    Have you considered using the CREATE SEQUENCE statement? It would let you number the positions in your queue, guarantee that they're always unique, and make the code for changing the status of a queue item pretty easy.
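    A minimal sketch of the CREATE SEQUENCE idea (the object name is hypothetical); a sequence hands out guaranteed-unique, ever-increasing numbers that could serve as queue positions:

    ```sql
    -- Hypothetical sequence for numbering queue positions.
    CREATE SEQUENCE dbo.QueuePosition AS INT START WITH 1 INCREMENT BY 1;

    -- Each caller gets a distinct position, even under concurrency:
    DECLARE @pos INT = NEXT VALUE FOR dbo.QueuePosition;
    ```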

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Sam-263310 - Tuesday, June 6, 2017 9:53 AM

    NOTE: This is NOT a Service Broker question. 

    I have a table (ItemQueue) with items to be processed by one or more jobs. The table has a column (SessionKey, uniqueidentifier) that's initially null. I'm thinking of two methods for selecting items for processing so that no two jobs ever select the same item:

    Method 1: Select the first row you can get a lock on, process the selected item and update the SessionKey, all in one transaction.
     
    DECLARE @itemId INT;

    BEGIN TRANSACTION;

    SELECT TOP 1 @itemId = ItemId
    FROM  ItemQueue WITH ( UPDLOCK, READPAST )
    WHERE SessionKey IS NULL
    ORDER BY ItemId;

    UPDATE ItemQueue SET SessionKey = NEWID() WHERE ItemId = @itemId;

    COMMIT TRANSACTION;
    /*
    .
    process selected item
    on separate transaction 
    .
    */

    Method 2: Update one row with a locally generated SessionKey, get the ItemId for the updated row and process that item. (an OUTPUT clause could be used too, same idea). 
     
    DECLARE @sessionKey UNIQUEIDENTIFIER = NEWID();

    BEGIN TRANSACTION;

    UPDATE TOP(1) ItemQueue
    SET SessionKey = @sessionKey
    WHERE SessionKey IS NULL;

    SELECT @itemId = ItemId 
    FROM ItemQueue
    WHERE SessionKey = @sessionKey;

    COMMIT TRANSACTION;
    /*
    .
    process selected item
    on separate transaction 
    .
    */

    QUESTION: Which method is better? (you decide what "better" means)

    The answer is "NEITHER".  They're both a classic recipe for deadlocks and I'm speaking from some pretty serious experience here.

    Can you post the CREATE TABLE along with its indexes and constraints, please?  I need to know several things about that table and that will be a whole lot easier than asking the proverbial "20 questions".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jcelko212 32090 - Tuesday, June 6, 2017 3:21 PM

    Have you considered using the CREATE SEQUENCE statement? It would let you number the positions in your queue, guarantee that they're always unique, and make the code for changing the status of a queue item pretty easy.

    I apologize for the lack of DDL in my original posting, I've edited the whole thing to make it more clear. I don't quite follow the logic in your answer but then again it's all my fault for not being more explicit. The main problem I'm facing is the concurrency of the jobs accessing the table.

  • Jeff Moden - Tuesday, June 6, 2017 9:27 PM

    Can you post the CREATE TABLE along with its indexes and constraints, please?  I need to know several things about that table and that will be a whole lot easier than asking the proverbial "20 questions".

    Sorry about that, I edited my original posting to include the DDL. There's a PK and an index on SessionKey. I followed sgmunson's advice about separating the selection from the actual processing by putting them on different transactions, so I don't have to worry about the complexity of the processing logic (which doesn't access the ItemQueue table at all). I do worry about potential deadlocks due to locks acquired on the table and the index, but I don't know enough about that (hence my cry for help). If needed I could get rid of the index on SessionKey, the queue can be purged periodically to keep it small.

  • Sam-263310 - Tuesday, June 6, 2017 9:53 AM

    NOTE: This is NOT a Service Broker question. 

    The table defined below is accessed by concurrent jobs that select the next available item and process it. The processing logic may be simple or very complex, depending on the item type, but let's ignore that for the purpose of this question. Item availability is determined by checking for a NULL SessionKey.

    CREATE TABLE ItemQueue(
      ItemQueueId INT IDENTITY PRIMARY KEY,
       ItemId INT NOT NULL,
       ItemType TINYINT NOT NULL,
       SessionKey UNIQUEIDENTIFIER NULL
    )

    Any solution has to avoid deadlocks and ensure that no two jobs ever select the same item. I'm thinking of two methods, but you're welcome to suggest alternatives as long as it's all plain SQL (i.e. no SSB, SSIS, CLR, etc.).

    Method 1: Select the first row you can get a lock on, process the selected item and update the SessionKey, all in one transaction.
     
    DECLARE @itemId INT;

    BEGIN TRANSACTION;

    SELECT TOP 1 @itemId = ItemId
    FROM  ItemQueue WITH ( UPDLOCK, READPAST )
    WHERE SessionKey IS NULL
    ORDER BY ItemId;

    UPDATE ItemQueue SET SessionKey = NEWID() WHERE ItemId = @itemId;

    COMMIT TRANSACTION;
    /*
    .
    process selected item
    on separate transaction 
    .
    */

    Method 2: Update one row with a locally generated SessionKey, get the ItemId for the updated row and process that item. (an OUTPUT clause could be used too, same idea). 
     
    DECLARE @sessionKey UNIQUEIDENTIFIER = NEWID();

    BEGIN TRANSACTION;

    UPDATE TOP(1) ItemQueue
    SET SessionKey = @sessionKey
    WHERE SessionKey IS NULL;

    SELECT @itemId = ItemId 
    FROM ItemQueue
    WHERE SessionKey = @sessionKey;

    COMMIT TRANSACTION;
    /*
    .
    process selected item
    on separate transaction 
    .
    */

    QUESTION: Which method is better? (you decide what "better" means)

    You have a PK on the table.
    So, that's what you should be using to identify a row in it.

    DECLARE @ItemQueueId INT, @rc INT = 0;

    WHILE @rc = 0
    BEGIN
        SELECT TOP 1 @ItemQueueId = ItemQueueId
        FROM ItemQueue
        WHERE SessionKey IS NULL;

        UPDATE ItemQueue
        SET SessionKey = NEWID()
        WHERE ItemQueueId = @ItemQueueId
        AND SessionKey IS NULL;

        SELECT @rc = @@ROWCOUNT;
    END

    _____________
    Code for TallyGenerator

  • Sergiy - Wednesday, June 7, 2017 7:36 AM

    You have a PK on the table.
    So, that's what you should be using to identify a row in it.

    DECLARE @ItemQueueId INT, @rc INT = 0;

    WHILE @rc = 0
    BEGIN
        SELECT TOP 1 @ItemQueueId = ItemQueueId
        FROM ItemQueue
        WHERE SessionKey IS NULL;

        UPDATE ItemQueue
        SET SessionKey = NEWID()
        WHERE ItemQueueId = @ItemQueueId
        AND SessionKey IS NULL;

        SELECT @rc = @@ROWCOUNT;
    END

    Good point there, using the PK to select the item is the right thing to do, I updated the original posting. But the logic you suggest has a little problem: if there are no items to process it will stay in the loop until a new item arrives, and that's a waste of CPU time.

  • Yes, it's not a complete solution, of course.

    You still need an endless loop, since you want to pick up a new item in the queue as soon as it arrives, but you may wish to WAITFOR DELAY when there are no items with a NULL SessionKey.
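    A sketch of that polling loop, with a back-off when the queue is empty (the 5-second delay is my own choice):

    ```sql
    -- Endless worker loop: claim an item if one exists, otherwise sleep.
    DECLARE @ItemQueueId INT;

    WHILE 1 = 1
    BEGIN
        SET @ItemQueueId = NULL;

        -- Claim one available row and capture its key in the same statement.
        UPDATE TOP (1) ItemQueue
           SET SessionKey = NEWID(),
               @ItemQueueId = ItemQueueId
         WHERE SessionKey IS NULL;

        IF @ItemQueueId IS NULL
            WAITFOR DELAY '00:00:05';  -- queue empty: back off for 5 seconds
        ELSE
            PRINT @ItemQueueId;        -- process the claimed item here
    END;
    ```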

    Another thing you need to take care of - abandoned sessions.

    If for some reason processing was not finished for an item (an error, a dropped connection, or something else), the item will stay stamped with the SessionKey of a session that is long gone.

    You need to be able to verify if the session which started processing any particular item is still alive and running.

    For that you're gonna need a real SessionKey, not a useless GUID generated by NEWID().
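    For illustration, here's a sketch of that idea using @@SPID as the session key; it assumes a hypothetical SessionId INT column in place of the SessionKey GUID:

    ```sql
    -- Claim an item stamped with this worker's own session id (@@SPID).
    UPDATE TOP (1) ItemQueue
       SET SessionId = @@SPID  -- hypothetical INT column replacing SessionKey
     WHERE SessionId IS NULL;

    -- Requeue items whose claiming session no longer exists.
    UPDATE ItemQueue
       SET SessionId = NULL
     WHERE SessionId IS NOT NULL
       AND SessionId NOT IN (SELECT session_id FROM sys.dm_exec_sessions);
    ```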


  • I feel compelled to ask the question:  why is it NOT a service broker question?

    You're essentially trying to recreate what SB does. Just curious why we are looking to avoid SB in this case.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (4) - Wednesday, June 7, 2017 8:53 AM

    I feel compelled to ask the question:  why is it NOT a service broker question?

    You're essentially trying to recreate what SB does. Just curious why we are looking to avoid SB in this case.

    Service Broker is not allowed where I work. I tried to argue about it a few years ago, you can tell how successful I was. Sigh.....

  • Sam-263310 - Wednesday, June 7, 2017 9:28 AM

    Matt Miller (4) - Wednesday, June 7, 2017 8:53 AM

    I feel compelled to ask the question:  why is it NOT a service broker question?

    You're essentially trying to recreate what SB does. Just curious why we are looking to avoid SB in this case.

    Service Broker is not allowed where I work. I tried to argue about it a few years ago, you can tell how successful I was. Sigh.....

    You're not alone there.  I have the same restriction placed on me.

    --Jeff Moden



  • Ok... first, there is no need for an explicit transaction.  Having an explicit transaction for the two examples is a classic setup on a table for a pot wad of deadlocks.

    Second, there's no need for a session key (GUID, sequence, or whatever) if you simply use the current identity column in the table as the session id.

    Third, you need to be able to detect failed or long running sessions.  The only way to do that is with a start and end date.

    Last but not least, you want this to be as fast as possible so both indexes are ever-increasing and narrow.

    Here's your table modified to accommodate all of the above and populate it with some fictitious example data. 


    --===== If the test table already exists, drop it to make reruns in SSMS easier.
         IF OBJECT_ID('tempdb..#ItemQueue','U') IS NOT NULL
       DROP TABLE #ItemQueue
    ;
    GO
    --===== Create the test table and the given indexes
     CREATE TABLE #ItemQueue
            (
            ItemQueueId INT                 IDENTITY(1,1) PRIMARY KEY,
            ItemId      INT                 NOT NULL,
            ItemType    TINYINT             NOT NULL,
            StartDT     DATETIME            NULL,
            EndDT       DATETIME            NULL
            )
    ;
    CREATE INDEX IX_QueueWIP ON #ItemQueue(StartDT,ItemQueueId)
    ;
    --===== Do the initial population with sufficiently random test data
     INSERT INTO #ItemQueue
            (ItemId,ItemType)
     SELECT TOP 10000
             ItemID   = ABS(CHECKSUM(NEWID())%500)+1000
            ,ItemType = ABS(CHECKSUM(NEWID())%10)+1
       FROM         sys.all_columns ac1
      CROSS APPLY   sys.all_columns ac2
    ;

    Here's a runnable example (set up 3 or 4 windows with the same code and a whole lot more rows, and run them all at the same time in a loop to see it in real action) that will seriously help you avoid deadlocks on the queue table.  I kept it simple just to demonstrate the principle.  Modify to your heart's content.  Just don't ever use an explicit transaction (or a multi-part transaction, if your server is set up that way) on the queue table.  If you end up with something that has a start date without an end date after a certain amount of time, then you have a way to go back to failed runs and re-enter them into the queue just by changing the StartDT to NULL.  Or keep the old row so that you keep track of the failures, and enter a new row for the same thing.


    --===== Local Variables
    DECLARE  @ItemQueueId   INT
            ,@StartDT       DATETIME --not necessary but gives nice warm fuzzies that it worked.
    ;
    --===== Get an assignment from the queue.  This does an update for the StartDT
         -- and simultaneously assigns the ItemQueueId to our working variable.
       WITH cteAssign AS
    (
     SELECT TOP (1)
             StartDT
            ,ItemQueueId
       FROM #ItemQueue WITH (READPAST) --Increases concurrency a bit
      WHERE StartDT IS NULL
      ORDER BY ItemQueueId --Forces FIFO
    )
     UPDATE tgt
        SET  @StartDT       = tgt.StartDT       = GETDATE()
            ,@ItemQueueId   = tgt.ItemQueueId
       FROM cteAssign tgt
    ;
    --===== Do some task here. In this case, we'll just display the assignment variables.
     SELECT  ItemQueueId = @ItemQueueId
            ,StartDT     = @StartDT
    ;
    --===== Once done with the task, mark the end date
     UPDATE tgt
        SET EndDt = GETDATE()
       FROM #ItemQueue tgt
      WHERE tgt.ItemQueueId = @ItemQueueId
    ;
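    The failed-run cleanup described above could be sketched like this (the one-hour threshold is an assumption):

    ```sql
    -- Requeue items that started over an hour ago but never finished:
    -- nulling StartDT puts them back in line for the next worker.
    UPDATE #ItemQueue
       SET StartDT = NULL
     WHERE StartDT < DATEADD(HOUR, -1, GETDATE())
       AND EndDT IS NULL;
    ```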

    --Jeff Moden



  • Sam-263310 - Wednesday, June 7, 2017 9:28 AM

    Matt Miller (4) - Wednesday, June 7, 2017 8:53 AM

    I feel compelled to ask the question:  why is it NOT a service broker question?

    You're essentially trying to recreate what SB does. Just curious why we are looking to avoid SB in this case.

    Service Broker is not allowed where I work. I tried to argue about it a few years ago, you can tell how successful I was. Sigh.....

    Most MS-shipped tools carry the same birthmarks: inadequate design and careless coding.
    Therefore they all:
    - do not scale;
    - take an enormous amount of resources;
    - fail to perform in a timely manner.

    They're all good for "home office" kinds of environments, but as soon as you get to hundreds of anything (jobs, services, packages, tasks, orchestrations, etc.) those tools tend to commit suicide and take the whole server down with them.

    So don't push for Service Broker too hard. You might regret it if you succeed.

