Best method for Update then insert with single transaction

Question

Best method for Update then insert with single transaction

fergfamster

SSC Eights!

Points: 841
More actions
July 25, 2016 at 11:55 am

#400464

Guys-
I have a table with 23 billion rows so Merge is out, as it scans the entire thing and has become a hog. So I'm going back in time to 2005 and before for this example.
The question that I have is, as the statement loops over the update, and the @@rowcount is hit, how can I use those values it tried updating with the insert statement section without rewriting a secondary query? In other words, I am performing an update first and if that is not performed I want to then insert that same row. The below is what seems to be an accepted answer from before Merge. The temp table can have close to 23 million records and the primary keys are SDP_ID and Meter_Read_Dttm. Any help or pointers on how to do this efficiently are appreciated.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
UPDATE Primary_Table SET SDP_EffBegDate=temp.SDP_EffBegDate,
Meter_Read_Status_Id=temp.Meter_Read_Status_Id,
Src_Del_Flag=temp.Src_Del_Flag,
ETL_Batch_Id=temp.ETL_Batch_Id,
Meter_Read=temp.Meter_Read
FROM Temp_Table temp, Primary_Table AS Hourly
WHERE hourly.SDP_Id = temp.SDP_Id and Hourly.Meter_Read_Dttm = Temp.Meter_Read_Dttm and Hourly.Meter_Read_Dttm >= '2016-01-01 00:00:00';
IF @@ROWCOUNT = 0
BEGIN
INSERT Meter_Read_KWH_Hourly (
SDP_Id,
SDP_EffBegDate,
Meter_Read_Status_Id,
ETL_Batch_Id,
Src_Del_Flag,
Meter_Read_Dttm,
Meter_Read) (Select clause or variables or?);
END
COMMIT TRANSACTION;

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 1

fergfamster (7/25/2016)
Guys-
I have a table with 23 billion rows so Merge is out, as it scans the entire thing and has become a hog. So I'm going back in time to 2005 and before for this example.
The question that I have is, as the statement loops over the update, and the @@rowcount is hit, how can I use those values it tried updating with the insert statement section without rewriting a secondary query? In other words, I am performing an update first and if that is not performed I want to then insert that same row. The below is what seems to be an accepted answer from before Merge. The temp table can have close to 23 million records and the primary keys are SDP_ID and Meter_Read_Dttm. Any help or pointers on how to do this efficiently are appreciated.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
UPDATE Primary_Table SET SDP_EffBegDate=temp.SDP_EffBegDate,
Meter_Read_Status_Id=temp.Meter_Read_Status_Id,
Src_Del_Flag=temp.Src_Del_Flag,
ETL_Batch_Id=temp.ETL_Batch_Id,
Meter_Read=temp.Meter_Read
FROM Temp_Table temp, Primary_Table AS Hourly
WHERE hourly.SDP_Id = temp.SDP_Id and Hourly.Meter_Read_Dttm = Temp.Meter_Read_Dttm and Hourly.Meter_Read_Dttm >= '2016-01-01 00:00:00';
IF @@ROWCOUNT = 0
BEGIN
INSERT Meter_Read_KWH_Hourly (
SDP_Id,
SDP_EffBegDate,
Meter_Read_Status_Id,
ETL_Batch_Id,
Src_Del_Flag,
Meter_Read_Dttm,
Meter_Read) (Select clause or variables or?);
END
COMMIT TRANSACTION;

This is basically what I would do, ~~but there appears to be some issues with your UPDATE~~.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

UPDATE Primary_Table SET

SDP_EffBegDate = temp.SDP_EffBegDate,

Meter_Read_Status_Id = temp.Meter_Read_Status_Id,

Src_Del_Flag = temp.Src_Del_Flag,

ETL_Batch_Id = temp.ETL_Batch_Id,

Meter_Read = temp.Meter_Read

FROM

Temp_Table temp

INNER JOIN Primary_Table AS Hourly

on (hourly.SDP_Id = temp.SDP_Id and

Hourly.Meter_Read_Dttm = Temp.Meter_Read_Dttm and

Hourly.Meter_Read_Dttm >= '2016-01-01 00:00:00');

--IF @@ROWCOUNT = 0 -- What if there are values that need to be updated as well as inserted from Temp_Table?

INSERT Meter_Read_KWH_Hourly (

SDP_Id,

SDP_EffBegDate,

Meter_Read_Status_Id,

ETL_Batch_Id,

Src_Del_Flag,

Meter_Read_Dttm,

Meter_Read)

Select

temp.SDP_Id,

temp.SDP_EffBegDate,

temp.Meter_Read_Status_Id,

temp.ETL_Batch_Id,

temp.Src_Del_Flag,

temp.Meter_Read_Dttm,

temp.Meter_Read

FROM

Temp_Table temp

WHERE

NOT EXISTS(SELECT 1 FROM Primary_Table Hourly

WHERE

hourly.SDP_Id = temp.SDP_Id and

Hourly.Meter_Read_Dttm = Temp.Meter_Read_Dttm);

COMMIT TRANSACTION;

Figured out my issue with your UPDATE statement, I couldn't see the equals. Please use white space, the computer may not care but it makes things easier to read by humans.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 2

http://source.entelect.co.za/why-is-this-upsert-code-broken

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

fergfamster SSC Eights! Points: 841 More actions · Answer 3

Thanks Lynn, Ill try harder to use whitespace the next time I post.... You dont think using the @@rowcount is useful? Better to just do both statements?

fergfamster SSC Eights! Points: 841 More actions · Answer 4

fergfamster

SSC Eights!

Points: 841

July 25, 2016 at 1:04 pm

#1891897

Thanks Gail, useful article!

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

fergfamster (7/25/2016)
You dont think using the @@rowcount is useful? Better to just do both statements?

No, because what if there's 6 rows that need inserting and 4 updating? To use rowcount, you have to loop over those 10 rows. With Lynn's way you just run both statements, the update affects the 4 existing rows and the insert affects the 6 rows that weren't in the table at the start.

The exists needs lock hints though (as in my article), otherwise there's still a chance of ~~duplicates~~ deadlocks if there are multiple sessions running this concurrently.

Edit: corrected last line

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

fergfamster SSC Eights! Points: 841 More actions · Answer 6

fergfamster

SSC Eights!

Points: 841

July 25, 2016 at 1:16 pm

#1891907

Thanks Guys and Gals! Appreciate the help!

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 7

GilaMonster (7/25/2016)
fergfamster (7/25/2016)
You dont think using the @@rowcount is useful? Better to just do both statements?
No, because what if there's 6 rows that need inserting and 4 updating? To use rowcount, you have to loop over those 10 rows. With Lynn's way you just run both statements, the update affects the 4 existing rows and the insert affects the 6 rows that weren't in the table at the start.
The exists needs lock hints though (as in my article), otherwise there's still a chance of duplicates if there are multiple sessions running this concurrently.

Even with this? Just want to be sure.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

Lynn Pettis (7/25/2016)
GilaMonster (7/25/2016)
fergfamster (7/25/2016)
You dont think using the @@rowcount is useful? Better to just do both statements?
No, because what if there's 6 rows that need inserting and 4 updating? To use rowcount, you have to loop over those 10 rows. With Lynn's way you just run both statements, the update affects the 4 existing rows and the insert affects the 6 rows that weren't in the table at the start.
The exists needs lock hints though (as in my article), otherwise there's still a chance of duplicates if there are multiple sessions running this concurrently.
Even with this? Just want to be sure.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Sorry, not duplicates. Deadlocks.

Multiple sessions take a shared range lock, multiple sessions go to convert the lock to X for the insert, can't be converted due to the shared range locks, instant deadlock.

I'd recommend use the locking hints and not the elevated isolation level, and it only needs the transaction if it's there for rollbacks on errors (and then with a try catch block).

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 9

GilaMonster (7/25/2016)
Lynn Pettis (7/25/2016)
GilaMonster (7/25/2016)
fergfamster (7/25/2016)
You dont think using the @@rowcount is useful? Better to just do both statements?
No, because what if there's 6 rows that need inserting and 4 updating? To use rowcount, you have to loop over those 10 rows. With Lynn's way you just run both statements, the update affects the 4 existing rows and the insert affects the 6 rows that weren't in the table at the start.
The exists needs lock hints though (as in my article), otherwise there's still a chance of duplicates if there are multiple sessions running this concurrently.
Even with this? Just want to be sure.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Sorry, not duplicates. Deadlocks.
Multiple sessions take a shared range lock, multiple sessions go to convert the lock to X for the insert, can't be converted due to the shared range locks, instant deadlock.
I'd recommend use the locking hints and not the elevated isolation level, and it only needs the transaction if it's there for rollbacks on errors (and then with a try catch block).

I have started code for deadlocks in some of the code I have written lately. Unfortunately, when it comes to some of the developers code that is embedded in the code, not much luck as I don't get to see it before it goes to production. If there are problems in production I usually can't see it there either as the production software runs on secured networks.