September 26, 2019 at 12:46 pm
There is a stored procedure that updates pre_plan and pre_type for all associates. And it is deadlocking.
DDL of the visit and associate tables:
CREATE TABLE [dbo].[Associate](
[pre_plan_id] [smallint] NULL,
[pre_type_id] [smallint] NULL,
[associate_id] [smallint] NOT NULL,
[deleted] [bit] NOT NULL
)
INSERT INTO Associate
VALUES
(NULL, NULL, -32768, 0),
(NULL, NULL, 2, 1),
(NULL, NULL, 3, 0),
(NULL, NULL, 6, 1),
(NULL, NULL, 3097, 1),
(NULL, NULL, 3109, 0),
(NULL, NULL, 3265, 1),
(NULL, NULL, 3313, 0),
(NULL, NULL, 3318, 1),
(NULL, NULL, 3329, 0)
CREATE TABLE [dbo].[Visit](
[type_id] [smallint] NOT NULL,
[plan_id] [smallint] NOT NULL,
[associate_id] [smallint] NOT NULL,
[time_in] [smalldatetime] NOT NULL
)
INSERT INTO Visit
VALUES
(390, 31, 3109, '2009-09-02'),
(304, 32, 3109, '2010-02-05'),
(388, 31, 3109, '2010-09-24'),
(388, 31, 3109, '2010-09-27'),
(388, 31, 3109, '2010-09-27'),
(388, 31, 3109, '2010-09-28'),
(388, 31, 3109, '2010-10-01'),
(333, 28, 3109, '2011-01-11'),
(338, 30, 3109, '2011-01-18'),
(388, 31, 3109, '2011-01-27')
The stored procedure
CREATE PROCEDURE [dbo].[update_pre__]
AS
UPDATE Associate SET pre_plan_id =
(SELECT TOP 1 plan_id
FROM Visit
WHERE associate_id = Associate.associate_id
AND time_in > 90
GROUP BY plan_id
ORDER BY Count(*) DESC)
WHERE deleted = 0
UPDATE Associate SET pre_type_id =
(SELECT TOP 1 [type_id]
FROM Visit
WHERE associate_id = Associate.associate_id
AND time_in > 90
GROUP BY [type_id]
ORDER BY Count(*) DESC)
WHERE deleted = 0
I was thinking of separating the transactions by adding BEGIN TRANSACTION and COMMIT TRANSACTION around each of the UPDATE statements. Will that help to avoid the deadlock? Can anyone suggest the most efficient way of avoiding the deadlock?
September 26, 2019 at 1:37 pm
You should be able to do this update in a single statement.
Without table DDL, sample data as INSERT statements and desired results, it's not easy to provide a working and tested solution.
But if you write a subquery which returns (associate_id, (top) plan_id, (top) type_id), you can join back to associate and perform the UPDATE.
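In outline, that shape might look something like the following. This is a sketch only, written before seeing the DDL and untested against desired results; the time_in filter is carried over from the original procedure, and the INNER JOINs skip associates with no qualifying visits, whereas the original UPDATE would set those to NULL:

```sql
-- Sketch only: schema assumed from the original procedure, not tested.
WITH plans AS (
    SELECT associate_id, plan_id,
           ROW_NUMBER() OVER (PARTITION BY associate_id
                              ORDER BY COUNT(*) DESC) AS rn
    FROM dbo.Visit
    WHERE time_in > 90
    GROUP BY associate_id, plan_id
),
types AS (
    SELECT associate_id, [type_id],
           ROW_NUMBER() OVER (PARTITION BY associate_id
                              ORDER BY COUNT(*) DESC) AS rn
    FROM dbo.Visit
    WHERE time_in > 90
    GROUP BY associate_id, [type_id]
)
UPDATE A
SET A.pre_plan_id = p.plan_id,
    A.pre_type_id = t.[type_id]
FROM dbo.Associate A
INNER JOIN plans p ON p.associate_id = A.associate_id AND p.rn = 1
INNER JOIN types t ON t.associate_id = A.associate_id AND t.rn = 1
WHERE A.deleted = 0;
```

One statement means one pass over Associate, one transaction, and no chance of the two UPDATEs deadlocking against each other.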
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 26, 2019 at 1:43 pm
@Phil, I've provided the DDL and sample data. How can I do it in a single statement? Won't separating the transactions help? Thanks.
September 26, 2019 at 1:47 pm
While separating the transactions might help, it's not as good a solution as doing it in a single hit. I'll have a look, as, I'm sure, will others.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 27, 2019 at 9:59 am
I was thinking of separating the transactions by adding BEGIN TRANSACTION and COMMIT TRANSACTION around each of the UPDATE statements. Will that help to avoid the deadlock? Can anyone suggest the most efficient way of avoiding the deadlock?
It is (almost) impossible to avoid all deadlocks, but in almost all cases it is possible to reduce the risk of deadlocks.
Deadlocks also have the nasty habit of being difficult to reproduce, because they depend on timing and a number of other factors.
What does help to avoid deadlocks:
In your scenario I suspect that the 'subquery' is taking some time. So what you can do is first determine which rows have to be updated, using a similar query, and then update those rows, so that less time is spent in the actual update. If the update is an update after the fact **), you should not have a problem. If the update depends on some 'aspects', those aspects can perhaps be checked again in the update.
Doing this might make the actual update faster and may involve fewer locks, reducing the chance of a deadlock.
Ben
**)
Sometimes an update depends 'only' on outside information and not on inside info. But sometimes an update depends on the state of the database, and that state can be altered between finding the rows to be updated and the update itself.
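A minimal sketch of that pattern, assuming the poster's schema (the staging-table name is hypothetical, and the WHERE clause of the final UPDATE re-checks the row's current state):

```sql
-- Stage the slow aggregation first (hypothetical #pre_plan table).
SELECT associate_id, plan_id
INTO #pre_plan
FROM (SELECT associate_id, plan_id,
             ROW_NUMBER() OVER (PARTITION BY associate_id
                                ORDER BY COUNT(*) DESC) AS rn
      FROM dbo.Visit
      WHERE time_in > 90
      GROUP BY associate_id, plan_id) AS d
WHERE rn = 1;

-- The update itself is now short-lived; the WHERE clause re-checks
-- the 'aspect' (here, the deleted flag) at update time, in case it
-- changed between the SELECT and the UPDATE.
UPDATE A
SET A.pre_plan_id = P.plan_id
FROM dbo.Associate A
INNER JOIN #pre_plan P ON P.associate_id = A.associate_id
WHERE A.deleted = 0;
```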
September 27, 2019 at 12:24 pm
Additional:
A long time ago we did intensive research into one specific deadlock. Online and at conferences I was often told that this could not occur within SQL Server. But we analysed the deadlock to the point that we could actually reproduce it at will, on the existing dataset, within SSMS. ***)
One of the statements ****) which caused the deadlock was :
SELECT * FROM Customers WHERE Some_ID = '1234567' ;
My point:
If such a 'simple' statement can be involved in a deadlock, it is (almost) impossible to prevent deadlocks. One of the reasons being that, when writing such a statement, you cannot be expected to foresee that it will cause a deadlock.
As said, this was a long time ago, and even then plenty of knowledgeable people declared that it was not possible. But it was, back then (2002). Now plenty of knowledgeable people will probably declare that this is not possible (anymore). As said, I cannot reproduce it from scratch (not then, not now). But I can't prove that it is impossible.
Ben
***)
We could only reproduce it on some existing datasets; we were never able to build the situation from scratch.
****)
Just this statement in one of the processes/connections: no transaction, no other statements. The other party consisted of 2 or 3 updates of the same row within a transaction (fairly simple updates).
September 27, 2019 at 3:42 pm
The two UPDATEs are already executed in separate transactions. Since you don't have an explicit BEGIN TRANSACTION, every statement will be treated as a separate transaction by default.
Since the subquery could cause lengthy processing, depending in part on how the optimizer chooses to implement it on each run, this might be a case where you'd be better off creating a temp table first and using that to do the UPDATE.
For example:
IF OBJECT_ID('tempdb.dbo.#visit_plan_ids') IS NOT NULL
DROP TABLE #visit_plan_ids;
CREATE TABLE #visit_plan_ids (
associate_id int NOT NULL PRIMARY KEY,
plan_id int NULL
);
--TRUNCATE TABLE #visit_plan_ids;
INSERT INTO #visit_plan_ids
SELECT associate_id, plan_id
FROM (
SELECT associate_id, plan_id, --COUNT(*) AS plan_count,
ROW_NUMBER() OVER(PARTITION BY associate_id ORDER BY COUNT(*) DESC, plan_id) AS row_num
FROM dbo.Visit
GROUP BY associate_id, plan_id
) AS derived
WHERE row_num = 1
--SELECT * FROM #visit_plan_ids
UPDATE A
SET pre_plan_id = vpi.plan_id
FROM dbo.Associate A
INNER JOIN #visit_plan_ids vpi ON vpi.associate_id = A.associate_id
WHERE A.deleted = 0
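Presumably the same pattern extends to pre_type_id with a second temp table. A sketch (my assumption, not tested; note that the original procedure also filtered on time_in > 90, which is omitted here as above, so add it back if it's needed):

```sql
-- Assumed analogue for pre_type_id, following the same pattern.
IF OBJECT_ID('tempdb.dbo.#visit_type_ids') IS NOT NULL
    DROP TABLE #visit_type_ids;
CREATE TABLE #visit_type_ids (
    associate_id int NOT NULL PRIMARY KEY,
    [type_id] int NULL
);
INSERT INTO #visit_type_ids
SELECT associate_id, [type_id]
FROM (
    SELECT associate_id, [type_id],
        ROW_NUMBER() OVER(PARTITION BY associate_id
                          ORDER BY COUNT(*) DESC, [type_id]) AS row_num
    FROM dbo.Visit
    GROUP BY associate_id, [type_id]
) AS derived
WHERE row_num = 1;

UPDATE A
SET pre_type_id = vti.[type_id]
FROM dbo.Associate A
INNER JOIN #visit_type_ids vti ON vti.associate_id = A.associate_id
WHERE A.deleted = 0;
```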
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 30, 2019 at 2:54 pm
Thank you Ben and ScottPletcher! To tune the query, I was considering changing the subquery to a join. In addition to the one Scott suggested, I was also thinking of the version below, but I can't decide between them on performance grounds. Or even a CTE. Any thoughts on this?
UPDATE a
SET a.pre_plan_id = z.plan_id
FROM
associate a
INNER JOIN
(
SELECT associate_id, plan_id, ROW_NUMBER() OVER(PARTITION BY associate_id ORDER BY COUNT(*) DESC) rown
FROM visit
WHERE time_in > 90
GROUP BY associate_id, plan_id
) z
ON z.associate_id = a.associate_id AND z.rown = 1
WHERE a.deleted = 0
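The CTE variant I had in mind is just a syntactic rewrite of the derived table above, so I'd expect the optimizer to generate the same plan for both:

```sql
WITH z AS (
    SELECT associate_id, plan_id,
           ROW_NUMBER() OVER (PARTITION BY associate_id
                              ORDER BY COUNT(*) DESC) AS rown
    FROM dbo.Visit
    WHERE time_in > 90
    GROUP BY associate_id, plan_id
)
UPDATE a
SET a.pre_plan_id = z.plan_id
FROM dbo.Associate a
INNER JOIN z ON z.associate_id = a.associate_id AND z.rown = 1
WHERE a.deleted = 0;
```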