delete as head blocker

Question

Post reply

delete as head blocker

vsamantha35

SSChampion

Points: 11551

June 15, 2023 at 6:59 am

Go to Answer

#4211709

Hi All,

Need some help in finding out of what is the problem in the query and how can we make the query faster.

On one of the production environment, a DELETE statement was running over 50 mins and blocked more than 100 sessions.

I wasn't able to get the actual plan. However, including estimated plan in the attached zip file.

Below is the SQL statement.

DELETE FROM c_repos_applied_lock
WHERE rowid_lock IN (SELECT c_repos_applied_lock.rowid_lock
FROM c_repos_applied_lock,
c_repos_table
WHERE ( c_repos_applied_lock.job_type_str = 'A' )
AND ( c_repos_table.rowid_table =
c_repos_applied_lock.rowid_table )
AND ( (
Datediff (second, c_repos_applied_lock.create_date,
CURRENT_TIMESTAMP) ) > (
c_repos_table.batch_lock_wait_seconds * 3 ) ))

ran sp_whoisactive during the time of issue and collected some info.

spid 233

<--sql text -->

<?query --
DELETE FROM C_REPOS_APPLIED_LOCK WHERE ROWID_LOCK IN (SELECT C_REPOS_APPLIED_LOCK.ROWID_LOCK FROM C_REPOS_APPLIED_LOCK, C_REPOS_TABLE WHERE (C_REPOS_APPLIED_LOCK.JOB_TYPE_STR = 'A') AND (C_REPOS_TABLE.ROWID_TABLE = C_REPOS_APPLIED_LOCK.ROWID_TABLE) AND ((DATEDIFF (second, C_REPOS_APPLIED_LOCK.CREATE_DATE, CURRENT_TIMESTAMP )) > (C_REPOS_TABLE.BATCH_LOCK_WAIT_SECONDS * 3)))
--?>

<--lock info>

<Database name="CMX_ORS">
<Locks>
<Lock request_mode="S" request_status="GRANT" request_count="1" />
</Locks>
<Objects>
<Object name="C_REPOS_APPLIED_LOCK" schema_name="dbo">
<Locks>
<Lock resource_type="KEY" index_name="NI_APPLIED_LOCK_EWGPS3" request_mode="U" request_status="GRANT" request_count="56" />
<Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="NI_APPLIED_LOCK_EWGPS3" request_mode="IU" request_status="GRANT" request_count="3" />
</Locks>
</Object>
<Object name="C_REPOS_TABLE" schema_name="dbo">
<Locks>
<Lock resource_type="OBJECT" request_mode="Sch-S" request_status="GRANT" request_count="1" />
</Locks>
</Object>
</Objects>
</Database>

spid = 657 update (victim)

sql text :

<?query --
UPDATE C_REPOS_APPLIED_LOCK SET LOCK_QUERY_SQL=@P0, LOCK_EXCLUSIVE_IND=@P1, JOB_TYPE_STR=@P2, MODULE_NAME=@P3, INTERACTION_ID=@P4, LAST_UPDATE_DATE=@P5, UPDATED_BY=@P6 WHERE ROWID_TABLE = @P7 AND LOCK_GROUP_STR=@P8
--?>

<--lock info>

<Database name="CMX_ORS">
<Locks>
<Lock request_mode="S" request_status="GRANT" request_count="1" />
</Locks>
<Objects>
<Object name="(null)">
<Locks>
<Lock resource_type="METADATA.DATABASE_PRINCIPAL" principal_name="dbo" request_mode="Sch-S" request_status="GRANT" request_count="1" />
</Locks>
</Object>
<Object name="C_REPOS_APPLIED_LOCK" schema_name="dbo">
<Locks>
<Lock resource_type="KEY" index_name="NI_APPLIED_LOCK_EWGPS3" request_mode="U" request_status="WAIT" request_count="1" />
<Lock resource_type="OBJECT" request_mode="IX" request_status="GRANT" request_count="1" />
<Lock resource_type="PAGE" page_type="*" index_name="NI_APPLIED_LOCK_EWGPS3" request_mode="IU" request_status="GRANT" request_count="1" />
</Locks>
</Object>
</Objects>
</Database>

Questions:

1. Why DELETE is slow? It took more than 50 mins and as per app team confirmation, we had to KILL the spid eventually. was it is reads more data? usually, we don't see more than 10-15 rows in this table at any moment of time. I don't know why I see a lot of reads at that time frame. please refer above screenshot.

2. From sp_wia output I was expecting 'X' locks on the table as it is a DELETE operation, but I see UPDATE locks. Any reason why update locks for delete operation?

3. Any ways we can re-write of optimize this query. this query is there been for a while.. I don't know why they are doing co-related query/sub query instead of direct where condition. not sure why?

Is there away to decrease the execution time of this query? can it be done in smaller chunks for the locks to held for shorter duration? I don't know, I am just doing my brain dump of what to ask.

4. The applied lock is kind of a table which used to hold some lock info. this is very dynamic table. A lots of inserts and deletes happen before acquiring and release this lock. Was the load on the table was more I don't know,

how to build a narrative at that time frame?

Please provide some guidance and reasoning on how can we make it faster. if there any specific questions to be asked to the app team, or if more information is needed pls let me know.

Attachment contains below info.

-Table structure

-sql stmt

-Table structure and index definitions

-Estimated plan

-few blocking screenshots

Thanks,

Sam

Attachments:

You must be logged in to view attached files.

Johan Bijnens

SSC Guru

Points: 135375
More actions
June 15, 2023 at 7:36 am
Answer
#4211722
First of all, don't use this construct of you can directly perform the delete from the "in-list-query"
```
DELETE RAL
FROM c_repos_applied_lock RAL
INNER JOIN c_repos_table RT
     ON RAL.job_type_str = 'A'
        AND RT.rowid_table = RAL.rowid_table
        AND DATEDIFF(second, RAL.create_date, CURRENT_TIMESTAMP) > RT.batch_lock_wait_seconds * 3;
```
this part of your query is non-SARGable ! " AND DATEDIFF(second, RAL.create_date, CURRENT_TIMESTAMP) > RT.batch_lock_wait_seconds * 3;" ( functions in where-clause or on-clause )
Meaning this can only be processed after the full set has been assembled ! ( and probably pumped to tempdb )
I think your c_repos_table must be smaller than your c_repos_applied_lock, so give this version a swirl:
```
DELETE RAL
FROM c_repos_applied_lock RAL
INNER JOIN c_repos_table RT
     ON RAL.job_type_str = 'A'
        AND RT.rowid_table = RAL.rowid_table
        AND  RAL.create_date < dateadd(ss, RT.batch_lock_wait_seconds * -3 , CURRENT_TIMESTAMP);
```
I don't think your "job_type_str = 'A'" will filter massively, so locking ( and lock escallation ) can be massive.
Chances are splitting and using a temptb may be a solution here:
```
Select RT.rowid_table 
        , dateadd(ss, RT.batch_lock_wait_seconds * -3 , CURRENT_TIMESTAMP) TsRef
INTO #TMP
FROM c_repos_table RT

DELETE RAL
FROM c_repos_applied_lock RAL
INNER JOIN #TMP RT
     ON RAL.job_type_str = 'A'
        AND RT.rowid_table = RAL.rowid_table
        AND  RAL.create_date < RT.TsRef;
```
TEST IT - test it - TEST IT !!!!

edited: You may also want to narrow the scope using "Delete Top ( nnnn ) ..." to avoid lock escallation.
Johan

Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Jeffrey Williams SSC Guru Points: 90304 More actions · Answer 1

Although it probably won't make a difference - I would recommend moving filtering from the JOIN to a WHERE:

DELETE RAL
FROM c_repos_applied_lock RAL
INNER JOIN c_repos_table   RT ON RT.rowid_table = RAL.rowid_table
WHERE RAL.job_type_str = 'A'
  AND RAL.create_date < dateadd(ss, RT.batch_lock_wait_seconds * -3 , CURRENT_TIMESTAMP);

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

vsamantha35 SSChampion Points: 11551 More actions · Answer 2

Thanks a lot Johan and Jeffrey.

One thing, I wanted to ask. Is there a way to run deletes in small batches to minimize blocking especially when child tables are involved i.e. CASCADING deletes are involved ensuring child deletes are also taken care in small batches.

Also, any share thoughts on Question 1 and Question 2.

Johan Bijnens SSC Guru Points: 135375 More actions · Answer 3

And here's the bite in the back by cascading deletes !

The reason why we only use foreign keys "NO ACTION" is because we want to take full control over the cleanup process.

Keep in mind cascading deletes may end up processing a whole tree structure of multiple parent - child - grandchild -... IN A SINGLE TRANSACTION !!

Taking control of the cleanup process ( " bottom to top" ) means you can also take control over the transaction size and scope.

Downside (but also a win ) is you must build the whole selection process and determine per table how many (bottom) rows you'll handle in a batch.

Looping a "delete top( @n) from bottomtable inner join #tempbottomkeys ..." until @@rowcount < @n gives you the control over the size of your transactions, so you avoid massive (b)locking and lock escalation.

Of course, you must support all foreign keys with the appropriate indexes on all tables involved.

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 4

Remember that deletes affect all indexes (except maybe filtered indexes) and causes a check for every FK pointing at the table. There's some silly wide tables at one of the companies I do work for and it took almost 6 minutes to delete just one row. The fix was to index the FK related columns on all of those other referencing tables.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Johan Bijnens SSC Guru Points: 135375 More actions · Answer 5

My guideline for FK indexes is: put indexes in place that match the FK definition 100%, unless it is proven for that case it hurts regular operations.

And even then, just disable those indexes and rebuild them during a cleanup cycle and disable afterward again.

This way they stay documented, so you remember why they have been disabled

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

Meet George Jetson SSChampion Points: 11638 More actions · Answer 6

Question to some of the panel, would setting the DEADLOCK_PRIORITY be an option?

Chris Powell

George: You're kidding.
Elroy: Nope.
George: Then lie to me and say you're kidding.

Johan Bijnens SSC Guru Points: 135375 More actions · Answer 7

It is an option, but you may end up being deadlocked most of the time, depending the amount of blocking and lock escallation

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me