July 1, 2004 at 3:05 pm
I have this update that hits about 50,000 rows. However, it takes about 5-7 minutes, which is too long for our OLTP application. This process is only run once a month and hits a table that is used often by other applications.
Is there any way to break down the update so it will only update 1000 rows at a time and not cause any blocking? I was thinking of either a temp table or table variable, but am concerned about using resources from tempdb.
UPDATE email_status_master
SET global_opt_out_ind = 1
FROM email_log l(nolock) left join email_status_master m (nolock)
on l.consumer_id = m.consumer_id
WHERE l.email_type_cd = 1
and email_sent_date = '2004-04-05'
and email_sent_status = 2 --unsubscribed
and m.consumer_id is not null
GO
Please advise
Thanks
July 1, 2004 at 4:37 pm
You could do something like so.
DECLARE @rowcnt int
SET @rowcnt = 1
SET ROWCOUNT 1000
WHILE @rowcnt > 0
BEGIN
--UPDATE STATEMENT HERE
SET @rowcnt = @@ROWCOUNT
END
This should do the changes in batches of 1000 rows per loop. You might even add a WAITFOR { DELAY 'time' | TIME 'time' } just before the END to pause a bit between each batch.
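Something like this, for example (the UPDATE body is still just a placeholder, and the 5 second delay is only an example value):
DECLARE @rowcnt int
SET @rowcnt = 1
SET ROWCOUNT 1000 -- limits every statement that follows to 1000 rows
WHILE @rowcnt > 0
BEGIN
--UPDATE STATEMENT HERE
SET @rowcnt = @@ROWCOUNT
WAITFOR DELAY '00:00:05' -- give other sessions a chance between batches
END
SET ROWCOUNT 0 -- reset the limit when done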
July 1, 2004 at 5:50 pm
Thanks for the help. I thought of that, but couldn't that possibly update the same data?
How do I ensure all the rows that need it are being updated?
July 1, 2004 at 6:15 pm
Your UPDATE statement needs to contain a WHERE clause which specifically excludes all records which have already been updated - and this will, of course, depend on your update logic.
Note also that BOL suggests that: "It is recommended that DELETE, INSERT, and UPDATE statements currently using SET ROWCOUNT be rewritten to use the TOP syntax" – I just tried doing this and it works, though the query is less elegant-looking to my eyes.
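For what it's worth, a rough sketch of the TOP-based form against your tables (this assumes email_sent_date and email_sent_status live on email_log, and that global_opt_out_ind <> 1 is a safe way to exclude rows that have already been done):
DECLARE @rowcnt int
SET @rowcnt = 1
WHILE @rowcnt > 0
BEGIN
UPDATE TOP (1000) m
SET global_opt_out_ind = 1
FROM email_status_master m
JOIN email_log l ON l.consumer_id = m.consumer_id
WHERE l.email_type_cd = 1
and l.email_sent_date = '2004-04-05'
and l.email_sent_status = 2 --unsubscribed
and ISNULL(m.global_opt_out_ind, 0) <> 1 -- exclude already-updated rows so the loop can finish
SET @rowcnt = @@ROWCOUNT
END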
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
July 2, 2004 at 1:10 am
Wow, that must be some critical application if it can't be down for 7 minutes in 30 days... even with those 7 minutes of downtime you would still have 99.983% availability.
Bas Schouten
CBIS BV
July 2, 2004 at 3:23 am
Why a left join???
I think that is where the performance hit is.
You could rewrite the SQL like this:
UPDATE email_status_master
SET global_opt_out_ind = 1
FROM email_log l(nolock)
WHERE l.consumer_id = email_status_master.consumer_id
and l.email_type_cd = 1
and l.email_sent_date = '2004-04-05'
and l.email_sent_status = 2 --unsubscribed
and email_status_master.global_opt_out_ind <> 1 -- add this so that you don't hit already-updated records
Or if you prefer:
UPDATE email_status_master
SET global_opt_out_ind = 1
FROM email_log l(nolock) join email_status_master m (nolock)
on l.consumer_id = m.consumer_id
WHERE l.email_type_cd = 1
and email_sent_date = '2004-04-05'
and email_sent_status = 2 --unsubscribed
and global_opt_out_ind <> 1 -- add this so that you don't hit already-updated records
-- and m.consumer_id is not null -- not necessary
You can verify that execution plans are the same for both queries.
I think that the first way is clearer to read.
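To compare the plans without actually running the updates, you can do something like this (the estimated plans come back as result sets):
SET SHOWPLAN_ALL ON
GO
-- put the first UPDATE here
-- put the second UPDATE here
GO
SET SHOWPLAN_ALL OFF
GO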
/rockmoose
You must unlearn what You have learnt
July 7, 2004 at 11:25 am
In case anyone cares... here's what I ended up using:
CREATE proc UpdateEmailStatusMaster
@process_date datetime
as
declare @email_sent_date datetime
Declare @updateemailsm table (pk int identity(1,1), consumer_id Int, email_sent_date datetime, email_type_cd int, email varchar(60), email_sent_status int, global_opt_out_ind int, status int, last_update_ts datetime)
declare @rows Int
set rowcount 100000
set @email_sent_date = @process_date
insert into @updateemailsm
select l.consumer_id, l.email_sent_date, l.email_type_cd, l.email, l.email_sent_status, m.global_opt_out_ind, m.status, m.last_update_ts
FROM consumer.dbo.email_log l(nolock) left join consumer.dbo.email_status_master m (nolock)
on l.consumer_id = m.consumer_id
WHERE l.email_type_cd = 1
and email_sent_date = @email_sent_date
and email_sent_status = 2 --unsubscribed
and m.consumer_id is not null
Set @rows = 1000
while exists (select top 1 'true' from @updateemailsm)
Begin
begin transaction emails
UPDATE consumerrepair.dbo.email_status_master
SET global_opt_out_ind = 1
--select u.consumer_id
from consumer..email_status_master m left join @updateemailsm u
on m.consumer_id = u.consumer_id
WHERE
u.pk <=@rows
Delete from @updateemailsm where pk <= @rows
commit transaction emails
Set @rows = @rows + 1000
End
GO
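It then just gets run with the month's process date, e.g. (using the date from my original post):
EXEC UpdateEmailStatusMaster @process_date = '2004-04-05'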
July 7, 2004 at 12:07 pm
In this last piece of code I see a few places you could speed things up. First, since you moved the data into the temp table to drive the update, make sure the only data in the temp table is the PK value you want to match the record on, plus an IDENTITY column. Then join the temp table to the consumerrepair table on the PK values using an INNER JOIN. This keeps the temp table smaller, as you don't need the extra columns. Also, since you did the outer join when you populated the temp table, you already have the PK values you want to update, so there is no reason to do another outer join in the UPDATE. In fact, the outer join will cause you problems, as you will update more data than you want.
UPDATE m
SET global_opt_out_ind = 1
FROM consumerrepair.dbo.email_status_master m
INNER JOIN @updateemailsm u ON m.consumer_id = u.consumer_id
WHERE u.pk <=@rows
Also note that it may be faster to use an actual temp table with an index rather than a table variable. You would need to test this out to make sure though.
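Something along these lines (just a sketch: #updateemailsm is simply a name to mirror your table variable, and this assumes it sits inside the proc where @email_sent_date is declared; the clustered primary key on pk gives you the index):
CREATE TABLE #updateemailsm
(pk int identity(1,1) primary key clustered,
consumer_id int not null)

INSERT INTO #updateemailsm (consumer_id)
SELECT l.consumer_id
FROM consumer.dbo.email_log l (nolock)
JOIN consumer.dbo.email_status_master m (nolock)
ON l.consumer_id = m.consumer_id
WHERE l.email_type_cd = 1
and l.email_sent_date = @email_sent_date
and l.email_sent_status = 2 --unsubscribed

-- then run the batched UPDATE above against #updateemailsm instead of the table variable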
Gary Johnson
Microsoft Natural Language Group
DBA, Sr. DB Engineer
This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.
July 7, 2004 at 2:13 pm
Thanks for your info.
I went ahead and changed the update statement and will test the timings between the temp table and table variable.
Thanks again
Susan