April 25, 2020 at 2:45 pm
I'm currently investigating a deadlock issue on a heavily used database but still wasn't able to reproduce it even using concurrent ostress sessions or WHILE 1=1 EXEC StoredProcs. I'd be really glad anyone could shed some light on ways to reproduce it and also help me understand it's behavior.
There are two tables involved and they're partitioned by a hash in a computed column based on the original PK. They were partitioned due to page latch issues on INSERTs in the past.
Deadlock graph:
deadlock-list
deadlock victim=process48094508
process-list
process id=process48094508 taskpriority=0 logused=428 waitresource=KEY: 10:72057666620227584 (d79f02e56828) waittime=2832 ownerId=28655562221 transactionname=implicit_transaction lasttranstarted=2020-04-19T09:37:13.823 XDES=0x7d33ef970 lockMode=X schedulerid=15 kpid=22212 status=suspended spid=2684 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.857 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655562221 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128058
executionStack
frame procname=MYDB.dbo.pInsVarConn line=14 stmtstart=470 stmtend=690 sqlhandle=0x03000b00a3f2cc6277eb2bc0b8a900000100000000000000
INSERT INTO tVarConn (nConn, iVarConn, rVarConn)
VALUES (@nConn, @iVarConn, @rVarConn)
frame procname=MYDB.dbo.pPrcInsVarConnao01 line=66 stmtstart=4890 stmtend=5016 sqlhandle=0x03000a004c202f3094703d009e1b00000100000000000000
EXEC dbo.pInsVarConn @nConn, @iVarConn, @rVarConn
frame procname=adhoc line=1 stmtstart=96 sqlhandle=0x01000a00b3117f0a40e2b01b0f0000000000000000000000
EXEC pPrcInsVarConnao01 @P0,@P1,@P2
inputbuf
(@P0 bigint,@P1 varchar(8000),@P2 varchar(8000))EXEC pPrcInsVarConnao01 @P0,@P1,@P2
process id=process48153948 taskpriority=0 logused=11224 waitresource=KEY: 10:72057666620227584 (0a7b9247f732) waittime=2832 ownerId=28655563535 transactionname=DELETE lasttranstarted=2020-04-19T09:37:13.857 XDES=0x2760ee83b0 lockMode=X schedulerid=34 kpid=31228 status=suspended spid=2553 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.263 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655563535 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056
executionStack
frame procname=MYDB.dbo.pPrcEndConnao02 line=54 stmtstart=2480 stmtend=2666 sqlhandle=0x03000a00d95fa90184eb2b00b8aa00000100000000000000
DELETE FROM dbo.tVarConn WHERE nConn = @IdConnao AND HashID = @HashID
frame procname=adhoc line=1 stmtstart=132 sqlhandle=0x01000a00a7b5550840229a10010d00000000000000000000
exec pPrcEndConnao02 @P0,@P1,@P2,@P3
frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000
unknown
inputbuf
(@P0 bigint,@P1 varchar(8000),@P2 varchar(8000),@P3 varchar(8000))exec pPrcEndConnao02 @P0,@P1,@P2,@P3
resource-list
keylock hobtid=72057666620227584 dbid=10 objectname=MYDB.dbo.tVarConn indexname=PKtVarConn id=lock495af880 mode=X associatedObjectId=72057666620227584
owner-list
owner id=process48153948 mode=X
waiter-list
waiter id=process48094508 mode=X requestType=wait
keylock hobtid=72057666620227584 dbid=10 objectname=MYDB.dbo.tVarConn indexname=PKtVarConn id=lock1d7ff99b80 mode=X associatedObjectId=72057666620227584
owner-list
owner id=process48094508 mode=X
waiter-list
waiter id=process48153948 mode=X requestType=wait
Thanks to this nice article http://www.sqlnotes.info/2012/06/26/locking-behavior-fks/(Found it to be easier to do this way when doing several times - don't have to issue SELECTs and DBCC PAGE), while trying to reproduce the scenario, I found out that the INSERT statement acquires the following locks:
And the DELETE's:
Plans:
What am I missing? How can these two get involved in the deadlock above? How can I reproduce it so I can test possible solutions?
Sorry if this got too long, I wanted to be throught..
<!--more-->
Parent table:
Name Owner Type
--------- -------- -------------
tConn dbo user table
Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation
-------------- ----------- ----------- --------- ----- ----- ----------- ---------------------- ------------------------- ----------------------------
nConn int no 4 10 0 no (n/a) (n/a) NULL
Col1 smallint no 2 5 0 no (n/a) (n/a) NULL
Col2 char no 23 no no no SQL_Latin1_General_CP1_CI_AI
Col3 datetime no 8 no (n/a) (n/a) NULL
Col4 char no 20 no no no SQL_Latin1_General_CP1_CI_AI
Col5 datetime no 8 no (n/a) (n/a) NULL
Col6 varchar no 50 yes no yes SQL_Latin1_General_CP1_CI_AI
Col7 char no 1 no no no SQL_Latin1_General_CP1_CI_AI
Col8 smallint no 2 5 0 no (n/a) (n/a) NULL
Col9 smallint no 2 5 0 no (n/a) (n/a) NULL
Col10 smallint no 2 5 0 no (n/a) (n/a) NULL
Col11 decimal no 9 10 0 no (n/a) (n/a) NULL
Col12 decimal no 5 5 0 yes (n/a) (n/a) NULL
HashID tinyint yes 1 3 0 no (n/a) (n/a) NULL
Identity Seed Increment Not For Replication
----------- ------ ----------- -------------------
nConn 1 1 0
Data_located_on_filegroup
--------------------------
schPartConn
index_name index_description index_keys
------------- ------------------------------------------------------ -----------------
Idx1tConn nonclustered located on schPartConn Col2
Idx2tConn nonclustered located on schPartConn Col1, Col4, Col7
PKtConn clustered, unique, primary key located on schPartConn HashID, nConn
constraint_type constraint_name delete_action update_action status_enabled status_for_replication constraint_keys
------------------------- ----------------- ------------- ------------- -------------- ---------------------- ----------------------------------
FOREIGN KEY FktAplic01 No Action No Action Enabled Is_For_Replication Col1
REFERENCES MYDB.dbo.tAplic (Col1)
PRIMARY KEY (clustered) PKtConn (n/a) (n/a) (n/a) (n/a) HashID, nConn
Table is referenced by foreign key
-----------------------------------
MYDB.dbo.tVarConn: FktConn01
Child table:
Name Owner Type
---------- ------- -----------
tVarConn dbo user table
Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation
-------------- ---------- ----------- -------- ----- ----- ----------- --------------------- ---------------------- -----------------------------
nConn int no 4 10 0 no (n/a) (n/a) NULL
iVarConn char no 20 no no no SQL_Latin1_General_CP1_CI_AI
rVarConn varchar no 8000 no no no SQL_Latin1_General_CP1_CI_AI
HashID tinyint yes 1 3 0 no (n/a) (n/a) NULL
Identity Seed Increment Not For Replication
----------------------------- ------ ----------- -------------------
No identity column defined. NULL NULL NULL
Data_located_on_filegroup
---------------------------
schPartConn
index_name index_description index_keys
--------------- -------------------------------------------------------- ------------------------
PKtVarConn clustered, unique, primary key located on schPartConn HashID, nConn, iVarConn
constraint_type constraint_name delete_action update_action status_enabled status_for_replication constraint_keys
------------------------ ------------------ ------------- ------------- -------------- ---------------------- ----------------
FOREIGN KEY FktConn01 Cascade No Action Enabled Is_For_Replication HashID, nConn
REFERENCES MYDB.dbo.tConn (HashID, nConn)
PRIMARY KEY (clustered) PKtVarConn (n/a) (n/a) (n/a) (n/a) HashID, nConn, iVarConn
No foreign keys reference table 'tVarConn', or you do not have permissions on referencing tables.
Stored Procs:
CREATE PROCEDURE [dbo].[pInsVarConn] (
@nConn int,
@iVarConn char(20),
@rVarConn varchar(8000)
)
AS
BEGIN
SET NOCOUNT ON
INSERT INTO tVarConn (nConn, iVarConn, rVarConn)
VALUES (@nConn, @iVarConn, @rVarConn)
END
go
CREATE PROCEDURE [dbo].[pPrcInsVarConnao01] (
@nConn int,
@iVarConn char(20),
@rVarConn varchar(8000)
)
AS
BEGIN
SET NOCOUNT ON
DECLARE @HashID tinyint
SET @HashID =CONVERT([tinyint],abs(@nConn%(72)),(0))
IF EXISTS(SELECT TOP 1 1 FROM tVarConn WITH(NOLOCK) WHERE nConn = @nConn AND HashID = @HashID AND iVarConn = @iVarConn)
BEGIN
exec [pUpdtVarConn] @nConn, @iVarConn, @rVarConn
END
ELSE
BEGIN
exec [pInsVarConn] @nConn, @iVarConn, @rVarConn
END
END
go
CREATE PROCEDURE [dbo].[pPrcEndConnao02](
@IdConnao int,
@SeqGen char(23),
@CurrEnd char(50),
@Valid char(1) = 'N'
)
AS
DECLARE @HashID tinyint
DECLARE @EndUs char(50)
DECLARE @StConnao char(1)
DECLARE @RC smallint
BEGIN
SET NOCOUNT ON
SET @HashID =CONVERT([tinyint],abs(@IdConnao%(72)),(0))
SELECT
@EndUs = Col6,
@StConnao = Col7
FROM
dbo.tConn WITH(NOLOCK)
WHERE
nConn = @IdConnao AND
HashID = @HashID AND
Col2 = @SeqGen
IF(@@ROWCOUNT = 0)
BEGIN
SET @RC = -5014
END
ELSE
BEGIN
IF(@stConnao = 'F')
BEGIN
SET @RC = -5012
END
ELSE
BEGIN
UPDATE dbo.tConn
SET Col7 = 'F'
WHERE nConn = @IdConnao
AND HashID = @HashID
AND Col2 = @SeqGen
DELETE FROM dbo.tVarConn WHERE nConn = @IdConnao AND HashID = @HashID
SET @RC = 0
END
END
SELECT @RC AS Ret
END
Thanks a bunch!
FT
April 25, 2020 at 3:16 pm
I didn't study this in detail, but one thing I have noticed that people miss in deadlock analysis is that the processes may be involved in a multi-statement transaction. Look here:
process id=process48094508 taskpriority=0 logused=428 waitresource=KEY: 10:72057666620227584 (d79f02e56828) waittime=2832 ownerId=28655562221 transactionname=implicit_transaction lasttranstarted=2020-04-19T09:37:13.823 XDES=0x7d33ef970 lockMode=X schedulerid=15 kpid=22212 status=suspended spid=2684 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.857 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655562221 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128058
So this process is running a transaction, which started in a previous request from the client. Thus, it may be blocks from a previous operation that is blocking the other process. (That process appears to be running a single-statement transaction.)
The isolation level Repeatable Read means that locks are held longer, although it may not matter here since the collision is on two X-locks.
[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]
April 25, 2020 at 3:28 pm
-- Edit: Forgot to quote a reply
April 25, 2020 at 3:45 pm
I didn't study this in detail, but one thing I have noticed that people miss in deadlock analysis is that the processes may be involved in a multi-statement transaction. Look here:
process id=process48094508 taskpriority=0 logused=428 waitresource=KEY: 10:72057666620227584 (d79f02e56828) waittime=2832 ownerId=28655562221 transactionname=implicit_transaction lasttranstarted=2020-04-19T09:37:13.823 XDES=0x7d33ef970 lockMode=X schedulerid=15 kpid=22212 status=suspended spid=2684 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.857 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655562221 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128058
Thanks for the reply Erland. I know the post ended up being a massive amount of info... Sorry!
Anyways, the implicit transaction is executing pPrcInsVarConnao01, which doesn't open explicit transactions and exec pInsVarConn. This other proc runs a plain INSERT INTO tVarConn (nConn, iVarConn, rVarConn) VALUES (@nConn, @iVarConn, @rVarConn).
Could this implicit_transaction hide a distributed tran/transaction open before the stored proc call?
April 25, 2020 at 4:02 pm
The DELETE statement is deleting multiple rows with the same (nConn, HashId). I would guess the other process is trying to insert rows with the same key values. It seems that is now trying to insert an rVarConn the DELETE has deleted. The DELETE in its turn is trying to delete a row which the other process inserted earlier in the same transaction.
I can't say that those NOLOCK that appears in the code gives me any confidence.
[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]
May 8, 2020 at 12:21 pm
I didn't study this in detail, but one thing I have noticed that people miss in deadlock analysis is that the processes may be involved in a multi-statement transaction. Look here:
process id=process48094508 taskpriority=0 logused=428 waitresource=KEY: 10:72057666620227584 (d79f02e56828) waittime=2832 ownerId=28655562221 transactionname=implicit_transaction lasttranstarted=2020-04-19T09:37:13.823 XDES=0x7d33ef970 lockMode=X schedulerid=15 kpid=22212 status=suspended spid=2684 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.857 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655562221 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128058
So this process is running a transaction, which started in a previous request from the client. Thus, it may be blocks from a previous operation that is blocking the other process. (That process appears to be running a single-statement transaction.)
The isolation level Repeatable Read means that locks are held longer, although it may not matter here since the collision is on two X-locks.
You were absolutely right.
Found out that there were resources being locked in a outer transaction opened by the java app which was disabling the autocommit setting to run several stored procedures and then commit.
Thanks again, FT
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply