October 6, 2010 at 3:29 am
Hi all,
Please look at the following scenario:
Table name: trans_header
Total records: 29,169,120
1)
UPDATE trans_header
SET CARD_NO = left(card_no,6)+'*****'+right(card_no,4)
WHERE country_code='128'
and len(card_no)=15
GO
Status:
(366,699 row(s) affected)
Time taken: 40 minutes
2)
UPDATE trans_header
SET CARD_NO = left(card_no,6)+'******'+right(card_no,4)
WHERE country_code='128'
and len(card_no)=16
GO
Status:
After 17 hours, still running
Question:
Both queries read the same table, so why is the second update still running? We notice its status keeps flipping between runnable and sleeping, which worries us. Does anyone have any idea what is going on here?
We hope to hear back from you all; your advice is very important to us.
Thanks a million!
October 6, 2010 at 3:43 am
Have you confirmed that the update is not being blocked?
I would imagine both of those queries will force a table scan anyway.
Are there any triggers or referential integrity constraints being checked?
If you run a SELECT with the same conditions as the second update, how long does it take to execute?
Don't discount the fact that you will be updating many, many more rows: most credit cards are 16 digits, so the second statement touches far more of the table.
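For the blocking question, something like this is a quick check (just a sketch, and it assumes SQL Server 2005 or later so the DMVs are available):
-- any row returned here means another session is blocking the UPDATE
SELECT session_id, blocking_session_id, wait_type, wait_time, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0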
October 6, 2010 at 4:08 am
Have you confirmed that the update is not being blocked?
1) The update is not being blocked.
2) The size of the .LDF file is increasing, but only slowly, e.g.:
5.15 GB at 4.50pm
5.30 GB at 5.02pm
5.60 GB at 5.52pm
The reads/writes make this file grow, so the update is still running.
3) Yes, it is a huge number of records that need to be updated.
Questions:
1) Could this be a CPU or memory usage problem?
2) We are worried that it will end in an I/O error or some other error, since it has already been running for 17 hours.
October 6, 2010 at 4:35 am
It sounds as if the bottleneck is writing to the log file.
What are your autogrowth settings? If the increment is too small, continually extending the file is an overhead.
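If the increment does turn out to be tiny, something along these lines would check it and bump it while the update keeps running (only a sketch; TransDB and TransDB_log are placeholder names, so look up the real logical file names first):
-- run in the context of the database in question to see the current growth settings
SELECT name, size, growth, is_percent_growth
FROM sys.database_files

-- raise the log file's autogrowth increment (placeholder database / file names)
ALTER DATABASE TransDB
MODIFY FILE (NAME = TransDB_log, FILEGROWTH = 1GB)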
Have you considered doing this in smaller batches?
WHILE (0 = 0)
BEGIN
    -- mask at most 100,000 rows per pass so each transaction (and its log usage) stays small
    UPDATE TOP (100000) trans_header
    SET CARD_NO = LEFT(card_no, 6) + '******' + RIGHT(card_no, 4)
    WHERE country_code = '128'
      AND LEN(card_no) = 16
      AND card_no NOT LIKE '%****%'   -- skip rows that have already been masked

    -- fewer than 100,000 rows updated means the last batch has just been done
    IF (@@ROWCOUNT <> 100000)
    BEGIN
        BREAK
    END
END
October 6, 2010 at 4:35 am
With this number of rows to update, batching becomes attractive. Here's one way to do it; there are many.
-- Assumes trans_headerID is pk for table
UPDATE t SET CARD_NO = left(card_no,6)+'******'+right(card_no,4)
FROM trans_header t
INNER JOIN (
SELECT TOP 100000 trans_headerID
FROM trans_header
WHERE country_code = '128'
AND LEN(card_no) = 16
AND NOT card_no LIKE '%******%'
) d ON d.trans_headerID = t.trans_headerID
Edit: Oops sorry Dave, treading on your feet. Your autogrowth suggestion seems most likely.
October 6, 2010 at 4:52 am
Chris Morris-439714 (10/6/2010)
Edit: Oops sorry Dave, treading on your feet. Your autogrowth suggestion seems most likely.
🙂
No worries, but I did make an error in my update statement: I didn't put in the NOT LIKE '%***%' condition.
October 6, 2010 at 5:56 am
1) The autogrowth setting is 1MB. Can we change it while the update is still running? What is the best size for this autogrowth increment?
2) Running it in batches, as in the sample you gave, is a good idea, but the update is running now; if we kill it, the rollback will start and we will have to wait for that.
Thanks!
October 6, 2010 at 6:37 am
1) "Best" is a tricky one , it depends. BUT to force through this query ,set it to 1GB. You should be able to change that on the fly without any issues.
2) Yes it will rollback the changes already made. Using a 'Select with nolock ' you should be able to tell how far through it is , this may not work with various isolation levels.
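For option 2, a rough progress check could look like this (a sketch only; it assumes the masked values contain '******', and a NOLOCK count is approximate at best):
-- rough count of rows the running UPDATE has already masked
SELECT COUNT(*) AS rows_already_masked
FROM trans_header WITH (NOLOCK)
WHERE country_code = '128'
  AND LEN(card_no) = 16
  AND card_no LIKE '%******%'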
Try option 1 first.
October 6, 2010 at 7:38 pm
-- Assumes trans_headerID is pk for table
UPDATE t SET CARD_NO = left(card_no,6)+'******'+right(card_no,4)
FROM trans_header t
INNER JOIN (
SELECT TOP 100000 trans_headerID
FROM trans_header
WHERE country_code = '128'
AND LEN(card_no) = 16
AND NOT card_no LIKE '%******%'
) d ON d.trans_headerID = t.trans_headerID
Thanks for this query.
Questions:
If we select the top 100,000 records to update, will it continue on to update the next 100,000? And if the last batch only contains 400 records, will this query still process that last batch?
Thank you very much!
October 6, 2010 at 7:43 pm
1) I have tried option 1 and set it to 1000MB; it still looks the same.
2) I have found a wait type of CXPACKET (0422). May I know what this relates to? Does it mean something is still running?
Thank you!
October 7, 2010 at 2:59 am
christine.koh (10/6/2010)
If we select the top 100,000 records to update, will it continue on to update the next 100,000?
No, you would have to run it again. That's the point.
christine.koh (10/6/2010)
And if the last batch only contains 400 records, will this query still process that last batch?
Yes. If matching rows are returned by the subquery
(SELECT TOP 100000 trans_headerID
FROM trans_header
WHERE country_code = '128'
AND LEN(card_no) = 16
AND NOT card_no LIKE '%******%')
they will be processed, even if the final batch contains only 400 of them.
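If you would rather not kick each batch off by hand, the same statement can sit inside the WHILE loop Dave posted earlier. A sketch, still assuming trans_headerID is the primary key and that masked card numbers contain '******':
WHILE (0 = 0)
BEGIN
    UPDATE t
    SET CARD_NO = LEFT(card_no, 6) + '******' + RIGHT(card_no, 4)
    FROM trans_header t
    INNER JOIN (
        SELECT TOP 100000 trans_headerID
        FROM trans_header
        WHERE country_code = '128'
          AND LEN(card_no) = 16
          AND card_no NOT LIKE '%******%'
    ) d ON d.trans_headerID = t.trans_headerID

    -- fewer than 100,000 rows touched means the final (possibly small) batch has just run
    IF (@@ROWCOUNT < 100000)
        BREAK
END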