Very basic question - Insert or Update?

Question

Very basic question - Insert or Update?

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 1

Well, personally, I feel that every suggestion always should be both 'reliable' and 'atomic' - what good would it be otherwise?

Assuming that what we díscuss here is the single instance of 'update or inert', you do not need to enclose the process in an explicit transaction. In either case it's only one statement that will be executed, so it's atomic anyway. Should you have more than one table as part of the 'transaction', then it's time to do explicit trans. (but that's another story)

> Next question: how is "update followed by a row check" better or worse than an "if exists followed by an update"? In either case we will subsequently do an insert 50% of the time.

Yes, and that's the point. We save that 'overhead' so to speak. It's 'better' to go for the update right away - 50% of the time, the work is then done. This is 'cheaper' than always do a) check, then b) insert or update.

For the 'high concurrency' scenario, if you're worried that two simultaneous inserts might contend about the same new row, and you really want to try and squeeze every inch by avoiding 'unnecessary' queries - one should actually do the insert, while the other should do an update of that row, you could do it like this: (at the same time avoiding negative (ie not exists) searches)

insert myTab (id, val)

select @id, @val

from myTab t1

left join myTab t2

on t1.id = t2.id

where t1.id = @id

and t2.id is null

select @err = @@error, @rc = @@rowcount

if ( @err <> 0 ) goto errhandler

if ( @rc = 0 ) -- row (id) already exists, to avoid PK violation, we need an update instead

begin

update myTab

set val = @val

where id = @id

end

Admittedly, the above is a bit harder to understand, but essentially what we're doing is that we go for the INSERT first - and to avoid an eventual PK violation, the insert is based on a left self join, testing for the existence of the PK. If it exists, the insert itself will return 0 rows, so we then know by looking at @@ROWCOUNT that if we should do an update or not. Shold we enter the described scenario that two simultaneous connections try to insert the same PK, (because both have by some method retrieved the same PK value, no matter how), only one at a time can actually attempt the insert operation - SQL Server itself will see to that. Now, I haven't tested this down to the milliseconds (considering rowlocks and all might actually allow to exact simultaneous inserts anyway), though that is probably unlikely.. You need that to happen, and also the situation of more than one actually trying to add a new row with the same new PK for it to break...

The critical part here is what goes into the ON and WHERE clauses of the left join. If it's not right, it won't work as expected.

And, ultimately - it all depends

If the difference is measurable, or even worth the effort will of course depend. I just wanted to show another 'angle' to the 'problem'.

/Kenneth

SQL-DBA Hall of Fame Points: 3051 More actions · Answer 2

This doesn't include the Else...Insert statement, and this is in reference to summarizing about 50,000 rows every night, but here is my question:

If I choose the IF EXISTS method (then update), am I not evaluating the WHERE clause in the Update statement anyway?

-- Create temp table to hold views data

If Object_Id('tempdb..#tmpFactViewsData') is Not Null

DROP TABLE [dbo].[#tmpFactViewsData]

CREATE TABLE #tmpFactViewsData

(

RptYear smallint,

RptMonth smallint,

LanguageCode nvarchar(2),

CountryCode nvarchar(2),

ApplicationID int,

ProductRangeID int,

OSID int,

ProductID int,

UserGroupID int,

RptCount int

&nbsp

INSERT INTO #tmpFactViewsData

SELECT Year(FCD.ViewDate) AS RptYear, Month(FCD.ViewDate) AS RptMonth, FCD.LanguageCode, FCD.CountryCode, FCD.ApplicationID, FCD.ProductRangeID, FCD.OSID, FCD.ProductID, UserGroupID, COUNT(*) AS RptCount

FROM FactCommonDaily FCD

WHERE FCD.Treated = 1

GROUP BY Year(FCD.ViewDate), Month(FCD.ViewDate), FCD.LanguageCode, FCD.CountryCode, FCD.ApplicationID, FCD.ProductRangeID, FCD.OSID, FCD.ProductID, UserGroupID

IF EXISTS(SELECT FVD.RptYear, FVD.RptMonth, FVD.LanguageCode, FVD.CountryCode, FVD.ApplicationID, FVD.ProductRangeID, FVD.OSID, FVD.ProductID, FVD.UserGroupID

FROM FactViewsData FVD, #tmpFcatViewsData tFVD

WHERE FVD.RptYear = tFVD.RptYear

AND FVD.RptMonth = tFVD.RptMonth

AND FVD.LanguageCode = tFVD.LanguageCode

AND FVD.CountryCode = tFVD.CountryCode

AND FVD.ApplicationID = tFVD.ApplicationID

AND FVD.ProductRangeID = tFVD.ProductRangeID

AND FVD.OSID = tFVD.OSID

AND FVD.ProductID = tFVD.ProductID

AND FVD.UserGroupID = tFVD.UserGroupID)

-- If EXISTS then update table and add to the RptCount

UPDATE FactViewsData FVD

SET FVD.RptCount = FVD.RptCount+#tmpFactViewsData.RptCount

WHERE FVD.RptYear = tFVD.RptYear

AND FVD.RptMonth = tFVD.RptMonth

AND FVD.LanguageCode = tFVD.LanguageCode

AND FVD.CountryCode = tFVD.CountryCode

AND FVD.ApplicationID = tFVD.ApplicationID

AND FVD.ProductRangeID = tFVD.ProductRangeID

AND FVD.OSID = tFVD.OSID

AND FVD.ProductID = tFVD.ProductID

AND FVD.UserGroupID = tFVD.UserGroupID

I'm not seeing the value of the IF EXISTS statement....what am I missing???? Thanks.

SQL-DBA Hall of Fame Points: 3051 More actions · Answer 3

Annoying that a closing parenthesis by itself turns into a winking smiley...

RptCount int

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 4

No, it's quite unnecessary.

If the update was attemped straight off, you could always check @@rowcount for how many rows was affected by the update. There is no error if rowcount is zero, so the EXISTS check doesn't add any value here.

/Kenneth