January 25, 2005 at 2:51 am
Well, personally, I feel that every suggestion always should be both 'reliable' and 'atomic' - what good would it be otherwise?
Assuming that what we díscuss here is the single instance of 'update or inert', you do not need to enclose the process in an explicit transaction. In either case it's only one statement that will be executed, so it's atomic anyway. Should you have more than one table as part of the 'transaction', then it's time to do explicit trans. (but that's another story)
> Next question: how is "update followed by a row check" better or worse than an "if exists followed by an update"? In either case we will subsequently do an insert 50% of the time.
Yes, and that's the point. We save that 'overhead' so to speak. It's 'better' to go for the update right away - 50% of the time, the work is then done. This is 'cheaper' than always do a) check, then b) insert or update.
For the 'high concurrency' scenario, if you're worried that two simultaneous inserts might contend about the same new row, and you really want to try and squeeze every inch by avoiding 'unnecessary' queries - one should actually do the insert, while the other should do an update of that row, you could do it like this: (at the same time avoiding negative (ie not exists) searches)
insert myTab (id, val)
select @id, @val
from myTab t1
left join myTab t2
on t1.id = t2.id
where t1.id = @id
and t2.id is null
select @err = @@error, @rc = @@rowcount
if ( @err <> 0 ) goto errhandler
if ( @rc = 0 ) -- row (id) already exists, to avoid PK violation, we need an update instead
begin
update myTab
set val = @val
where id = @id
end
Admittedly, the above is a bit harder to understand, but essentially what we're doing is that we go for the INSERT first - and to avoid an eventual PK violation, the insert is based on a left self join, testing for the existence of the PK. If it exists, the insert itself will return 0 rows, so we then know by looking at @@ROWCOUNT that if we should do an update or not. Shold we enter the described scenario that two simultaneous connections try to insert the same PK, (because both have by some method retrieved the same PK value, no matter how), only one at a time can actually attempt the insert operation - SQL Server itself will see to that. Now, I haven't tested this down to the milliseconds (considering rowlocks and all might actually allow to exact simultaneous inserts anyway), though that is probably unlikely.. You need that to happen, and also the situation of more than one actually trying to add a new row with the same new PK for it to break...
The critical part here is what goes into the ON and WHERE clauses of the left join. If it's not right, it won't work as expected.
And, ultimately - it all depends
If the difference is measurable, or even worth the effort will of course depend. I just wanted to show another 'angle' to the 'problem'.
/Kenneth
February 9, 2006 at 7:11 am
This doesn't include the Else...Insert statement, and this is in reference to summarizing about 50,000 rows every night, but here is my question:
If I choose the IF EXISTS method (then update), am I not evaluating the WHERE clause in the Update statement anyway?
-- Create temp table to hold views data
If Object_Id('tempdb..#tmpFactViewsData') is Not Null
DROP TABLE [dbo].[#tmpFactViewsData]
CREATE TABLE #tmpFactViewsData
(
RptYear smallint,
RptMonth smallint,
LanguageCode nvarchar(2),
CountryCode nvarchar(2),
ApplicationID int,
ProductRangeID int,
OSID int,
ProductID int,
UserGroupID int,
RptCount int
 
INSERT INTO #tmpFactViewsData
SELECT Year(FCD.ViewDate) AS RptYear, Month(FCD.ViewDate) AS RptMonth, FCD.LanguageCode, FCD.CountryCode, FCD.ApplicationID, FCD.ProductRangeID, FCD.OSID, FCD.ProductID, UserGroupID, COUNT(*) AS RptCount
FROM FactCommonDaily FCD
WHERE FCD.Treated = 1
GROUP BY Year(FCD.ViewDate), Month(FCD.ViewDate), FCD.LanguageCode, FCD.CountryCode, FCD.ApplicationID, FCD.ProductRangeID, FCD.OSID, FCD.ProductID, UserGroupID
IF EXISTS(SELECT FVD.RptYear, FVD.RptMonth, FVD.LanguageCode, FVD.CountryCode, FVD.ApplicationID, FVD.ProductRangeID, FVD.OSID, FVD.ProductID, FVD.UserGroupID
FROM FactViewsData FVD, #tmpFcatViewsData tFVD
WHERE FVD.RptYear = tFVD.RptYear
AND FVD.RptMonth = tFVD.RptMonth
AND FVD.LanguageCode = tFVD.LanguageCode
AND FVD.CountryCode = tFVD.CountryCode
AND FVD.ApplicationID = tFVD.ApplicationID
AND FVD.ProductRangeID = tFVD.ProductRangeID
AND FVD.OSID = tFVD.OSID
AND FVD.ProductID = tFVD.ProductID
AND FVD.UserGroupID = tFVD.UserGroupID)
-- If EXISTS then update table and add to the RptCount
UPDATE FactViewsData FVD
SET FVD.RptCount = FVD.RptCount+#tmpFactViewsData.RptCount
WHERE FVD.RptYear = tFVD.RptYear
AND FVD.RptMonth = tFVD.RptMonth
AND FVD.LanguageCode = tFVD.LanguageCode
AND FVD.CountryCode = tFVD.CountryCode
AND FVD.ApplicationID = tFVD.ApplicationID
AND FVD.ProductRangeID = tFVD.ProductRangeID
AND FVD.OSID = tFVD.OSID
AND FVD.ProductID = tFVD.ProductID
AND FVD.UserGroupID = tFVD.UserGroupID
I'm not seeing the value of the IF EXISTS statement....what am I missing???? Thanks.
February 9, 2006 at 7:14 am
Annoying that a closing parenthesis by itself turns into a winking smiley...
RptCount int
February 10, 2006 at 3:06 am
No, it's quite unnecessary.
If the update was attemped straight off, you could always check @@rowcount for how many rows was affected by the update. There is no error if rowcount is zero, so the EXISTS check doesn't add any value here.
/Kenneth
Viewing 4 posts - 16 through 18 (of 18 total)
You must be logged in to reply to this topic. Login to reply