May 16, 2013 at 2:25 pm
Hello - I have a table with 6 million rows that I need to update.
These rows will be updated based on a join to a temp table that has the needed update values.
This will be a "real-time" update so the record updates need to be done in chunks to avoid locking.
Someone I know with a deeper SQL background than myself suggested the following structure for performing the update:
WHILE @@ROWCOUNT > 0
BEGIN
update TOP(5000) ceb
set ceb.new_WCDSCreatedOn = i.CreateDate
OUTPUT inserted.New_IndividualId
INTO @Updates
from contactextensionbase ceb
join #tmpIndividualCreateDates i on ceb.New_IndividualId = i.IndividualID
WHERE NOT EXISTS (
SELECT *
FROM @Updates AS Updates
WHERE Updates.New_IndividualId = ceb.New_IndividualId
)
END
I think this SQL is fairly elegant but it doesn't seem to work the way we expect. For example, I executed the SQL and it ran for 2.5 hours without completing. It probably shouldn't take that long to update 6M rows.
So have you executed an update statement like this before? Does this structure look correct or can you identify any flaws that seem to stand out?
May 16, 2013 at 2:27 pm
Probably the table variable. Up to 6 million rows in an unindexed table variable without stats = bad idea. Try a temp table or permanent table with a useful index.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 16, 2013 at 3:52 pm
First, you need an index on ceb.New_IndividualId: I'll assume you have that already.
Second, I'd cluster the #tmpIndividualCreateDates table by i.IndividualID: I'll assume you've done that already.
I guess the OUTPUT clause to INSERT the rows UPDATEd into a new table and then do a NOT EXISTS against them is to prevent re-work ... but in this case, I suspect that's more overhead than the re-work itself would be anyway. I think we need to cut all that overhead.
Besides, if we process only forward thru the temp table, we shouldn't have that issue, barring errors / restarts.
At any rate, I suggest something like this:
DECLARE @IndividualID_start int
DECLARE @IndividualID_end int
DECLARE @update_count int
SELECT @IndividualID_start = -1
SET @update_count = 1
WHILE @update_count > 0
BEGIN
-- find the 5000th IndividualID value past the last one processed
SELECT TOP (5000) @IndividualID_end = IndividualID
FROM #tmpIndividualCreateDates
WHERE
IndividualID > @IndividualID_start
ORDER BY
IndividualID
UPDATE ceb
SET
ceb.new_WCDSCreatedOn = i.CreateDate
FROM dbo.contactextensionbase ceb
INNER JOIN #tmpIndividualCreateDates i ON
ceb.New_IndividualId = i.IndividualID AND
ceb.New_IndividualId > @IndividualID_start AND
ceb.New_IndividualId <= @IndividualID_end AND
i.IndividualID > @IndividualID_start AND
i.IndividualID <= @IndividualID_end
SET @update_count = @@ROWCOUNT
SET @IndividualID_start = @IndividualID_end
END
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 16, 2013 at 11:34 pm
DECLARE @Blocksize AS BIGINT = 5000,
@pagenum AS BIGINT = 0,
@ROWCNT AS BIGINT = 1;
While @ROWCNT > 0
BEGIN
Update ceb set CEB.new_WCDSCreatedOn = D.CreateDate
from contactextensionbase ceb join
(SELECT IndividualID,CreateDate
from #tmpIndividualCreateDates
order by IndividualID
OFFSET @Blocksize * @pagenum ROWS FETCH next @Blocksize ROWS ONLY) d
on ceb.New_IndividualId = d.IndividualID
set @ROWCNT = @@ROWCOUNT
SET @pagenum = @pagenum + 1
END
May 17, 2013 at 2:58 am
lgegerton (5/16/2013)
DECLARE @Blocksize AS BIGINT = 5000,@pagenum AS BIGINT = 0,
@ROWCNT AS BIGINT = 1;
While @ROWCNT > 0
BEGIN
Update ceb set CEB.new_WCDSCreatedOn = D.CreateDate
from contactextensionbase ceb join
(SELECT IndividualID,CreateDate
from #tmpIndividualCreateDates
order by IndividualID
OFFSET @Blocksize * @pagenum ROWS FETCH next @Blocksize ROWS ONLY) d
on ceb.New_IndividualId = d.IndividualID
set @ROWCNT = @@ROWCOUNT
SET @pagenum = @pagenum + 1
END
This won't work with SQL Server 2008.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply