May 22, 2020 at 1:48 am
Hello Gurus
I have a query that looks something like this
delete a
from table a
inner join (select top (100) <identityfield> from table b) x
on a.<identityfield> = x.<identityfield>
Before I get judged for something like this: I have to have this process delete in small chunks so that the application can continue to function without excessively long blocking calls. And yes, the table above is inner joined to itself.
My DBA is telling me that the parallelism in the query plan is showing 4 and that this query is "blocking" itself. Does that make sense? She wants me to change the code for this query to OPTION (MAXDOP 1).
Good advice? Or is there a better way?
Thanks in advance for your time.
Steve
May 22, 2020 at 9:11 am
Assuming <identityfield> has a primary key or unique constraint on it, I think this will do the same:
DELETE TOP (100)
FROM table;
On the wider question of parallelism, how many CPUs do you have in total on the server? It may be wise to set server-wide max degree of parallelism to half the number of processors you have, so that no one query can take up all processors at once. Mileage will vary: that's not a hard-and-fast rule. As for whether to stop this particular query going parallel, if all you're doing is deleting the first 100 rows you find, that shouldn't go parallel at all. If it does, check your cost threshold for parallelism. The default is 5 (at least it used to be for a long time; I'm not sure whether it still is) and that's far too low for most modern workloads. Try increasing it to something between 30 and 50 and see how that goes.
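For reference, both of those instance-wide settings are changed with sp_configure. A sketch only; the numbers are purely illustrative for an 8-core box, not a recommendation for your hardware:

```sql
-- Illustrative values only: pick numbers that suit your own server.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Cap any single query at half the processors on an 8-core box.
EXEC sp_configure 'max degree of parallelism', 4;

-- Raise the very low default cost threshold of 5.
EXEC sp_configure 'cost threshold for parallelism', 40;
RECONFIGURE;

-- Per-query alternative to the server-wide MAXDOP setting:
-- DELETE TOP (100) FROM dbo.SomeTable OPTION (MAXDOP 1);
```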
John
May 22, 2020 at 2:51 pm
I guess the more specific question is: if I run this delete in a loop, is it possible that successive delete statements in the same session could block themselves because the query engine is showing a parallelism of 4?
Facts to consider.
It would seem that a MAXDOP 1 hint for the query would solve this, but something I read on Brent Ozar's site, where he says that you can "tie the query's hands behind its back" with the OPTION (MAXDOP 1) setting, has given me pause.
I guess my quandary is that I assumed blocking or deadlocks had to come from disparate sessions. I never expected that a single session, running a loop that issues delete statements, could block itself.
Steve
May 22, 2020 at 2:55 pm
Are you running the loop within a transaction? Have you tried substituting your DELETE statement with mine?
John
May 22, 2020 at 2:58 pm
I am not using a transaction, as I do not need the whole operation to be atomic.
The question I have about yours is that there is no guaranteed ordering in SQL Server without an ORDER BY clause.
So would you suggest that this would be a reasonable solution?
DELETE TOP (100)
FROM table
ORDER BY <identityfield> ASC;
Steve
May 22, 2020 at 3:11 pm
Steve
Indeed, without an ORDER BY clause, you don't know which rows are going to be deleted. Your code also had no ORDER BY; I wrote mine to be equivalent to yours. Your amended code captures the intent, although T-SQL doesn't actually allow an ORDER BY clause directly on a DELETE, so you'd need to route it through a CTE or derived table. If you just need to run the loop until all the rows are gone, an ORDER BY isn't necessary anyway. Have you considered TRUNCATE TABLE?
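One wrinkle with that amended version: T-SQL won't accept an ORDER BY directly on a DELETE statement, so when the deletion order matters, the usual workaround is to put the TOP ... ORDER BY in a CTE and delete through it. The table and column names below are placeholders:

```sql
-- Delete the 100 rows with the lowest key values.
-- dbo.SomeTable and SomeId are placeholder names.
WITH FirstBatch AS
(
    SELECT TOP (100) *
    FROM dbo.SomeTable
    ORDER BY SomeId ASC
)
DELETE FROM FirstBatch;
```

Deleting through the CTE removes the underlying rows from the base table, and @@ROWCOUNT still reports how many were deleted.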
I don't know why your code is blocking itself. I'd need to see the whole lot, including the loop, and the evidence of your blocking chain, to comment further.
John
May 22, 2020 at 3:20 pm
Thank you John
You have been very helpful. I think I have some ideas based on your suggestions that will help me to proceed.
I did run some comparisons in execution plans based on your suggestions. As you probably guessed the execution plans are completely different and your way is much better. Thank you very much.
Steve
May 22, 2020 at 4:44 pm
Here's some code... it obviously won't run the way it is but it will give you an idea of how to do this and make your DBA happy.
--===== Local Variables
DECLARE @BatchSize INT = 100 --Change this to whatever works.
;
--===== Basically, start deleting and continue until we have a batch
-- that deletes less than @BatchSize rows because that's all that's left.
-- DO SEE THE WARNING ABOUT THE MISSING WHERE CLAUSE BECAUSE YOUR
-- ORIGINAL CODE DIDN'T HAVE ONE AND WOULD HAVE LED TO DELETING
-- EVERYTHING IN THE TABLE!!!! IF THAT'S WHAT YOU REALLY WANT TO
-- DO, THEN USE TRUNCATE ON THE TABLE INSTEAD OF THIS CODE.
WHILE 1=1
BEGIN
--===== DELETE a batch of rows. TOP on a DELETE can't take an ORDER BY
-- directly, so the TOP ... ORDER BY lives in a CTE and we delete through it.
WITH cteBatch AS
(
SELECT TOP (@BatchSize) *
FROM dbo.SomeTable
WHERE ****** LOOK!!!! YOUR ORIGINAL CODE WAS MISSING A WHERE CLAUSE!!!!! ******
ORDER BY <identityfield>
)
DELETE FROM cteBatch
;
--===== If we just deleted something less than the batch size,
-- then we're also out of things to delete.
-- Break out of this loop because we're all done.
IF @@ROWCOUNT < @BatchSize
BREAK
;
--===== Let other people have a chance at the table if any blocking occurs.
-- You might be able to reduce this to just a second or two.
WAITFOR DELAY '00:00:05';
END
;
--Jeff Moden
Change is inevitable... Change for the better is not.
May 22, 2020 at 4:55 pm
I think the reason your version is blocking itself is the self-join - and the derived table. Locks have to be taken out on each 'copy' of the table and those are going to conflict on parallel threads.
Actually - I am surprised that you did not run into a deadlock scenario from the parallel threads.
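If you want to confirm that's what's happening, while the delete is running you can look for the session apparently blocking itself on parallelism waits (CXPACKET, or CXCONSUMER on newer versions). A diagnostic sketch:

```sql
-- A parallel query's worker threads waiting on each other show up in
-- sys.dm_os_waiting_tasks with blocking_session_id equal to the session's
-- own session_id, i.e. the session "blocking itself".
SELECT session_id,
       blocking_session_id,
       wait_type,          -- e.g. CXPACKET / CXCONSUMER for parallelism
       wait_duration_ms
FROM sys.dm_os_waiting_tasks
WHERE session_id = blocking_session_id;
```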
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs