June 7, 2013 at 5:58 am
Hi Guys,
Any Ideas what the best way would be to perform this update?
UPDATE
D
SETTrack_ID= P.Track_ID,
Territory_ID= P.Territory_ID,
Major= P.Major
FROMStaging.Track_MappingP
INNER JOINStaging.Track_PlayD
ONP.ISRC= D.ISRC_Code
ANDP.Event_Date= D.Event_Date
Both tables have around 63 million rows and have numerous indexes.
Any ideas appreciated
Thanks,
Rich
June 7, 2013 at 6:06 am
breaking it into small batches may help you.
June 7, 2013 at 6:36 am
Do you really want to update all the 63 million rows?
Check if you can put some filter to reduce the number of rows that will be affected.
How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/
June 7, 2013 at 7:25 am
Something like this can help, and it's easy to check;
UPDATE D SET
Track_ID = P.Track_ID,
Territory_ID = P.Territory_ID,
Major = P.Major
FROM Staging.Track_Play D
INNER JOIN Staging.Track_Mapping P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
SELECT
Track_ID = MAX(P.Track_ID),
Territory_ID = MAX(P.Territory_ID),
Major = MAX(P.Major),
D.ISRC_Code,
D.Event_Date
INTO #Updater
FROM Staging.Track_Play D
INNER JOIN Staging.Track_Mapping P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
GROUP BY D.ISRC_Code, D.Event_Date
CREATE UNIQUE CLUSTERED INDEX cx_ISRC_Code_Event_Date ON #Updater (ISRC_Code,Event_Date)
UPDATE D SET
Track_ID = P.Track_ID,
Territory_ID = P.Territory_ID,
Major = P.Major
FROM Staging.Track_Play D
INNER JOIN #Updater P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 7, 2013 at 2:55 pm
ChrisM@Work (6/7/2013)
Something like this can help, and it's easy to check;
UPDATE D SET
Track_ID = P.Track_ID,
Territory_ID = P.Territory_ID,
Major = P.Major
FROM Staging.Track_Play D
INNER JOIN Staging.Track_Mapping P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
SELECT
Track_ID = MAX(P.Track_ID),
Territory_ID = MAX(P.Territory_ID),
Major = MAX(P.Major),
D.ISRC_Code,
D.Event_Date
INTO #Updater
FROM Staging.Track_Play D
INNER JOIN Staging.Track_Mapping P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
GROUP BY D.ISRC_Code, D.Event_Date
CREATE UNIQUE CLUSTERED INDEX cx_ISRC_Code_Event_Date ON #Updater (ISRC_Code,Event_Date)
UPDATE D SET
Track_ID = P.Track_ID,
Territory_ID = P.Territory_ID,
Major = P.Major
FROM Staging.Track_Play D
INNER JOIN #Updater P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
I see a lot of extra and expensive operations... But where is a profit?
June 10, 2013 at 1:20 am
Alexander Suprun (6/7/2013)
ChrisM@Work (6/7/2013)
Something like this can help, and it's easy to check;
UPDATE D SET
Track_ID = P.Track_ID,
Territory_ID = P.Territory_ID,
Major = P.Major
FROM Staging.Track_Play D
INNER JOIN Staging.Track_Mapping P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
SELECT
Track_ID = MAX(P.Track_ID),
Territory_ID = MAX(P.Territory_ID),
Major = MAX(P.Major),
D.ISRC_Code,
D.Event_Date
INTO #Updater
FROM Staging.Track_Play D
INNER JOIN Staging.Track_Mapping P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
GROUP BY D.ISRC_Code, D.Event_Date
CREATE UNIQUE CLUSTERED INDEX cx_ISRC_Code_Event_Date ON #Updater (ISRC_Code,Event_Date)
UPDATE D SET
Track_ID = P.Track_ID,
Territory_ID = P.Territory_ID,
Major = P.Major
FROM Staging.Track_Play D
INNER JOIN #Updater P
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
I see a lot of extra and expensive operations... But where is a profit?
Without an execution plan to examine I can't tell if there are any extra and expensive operations, neither can I determine if the mod I've suggested may be of any benefit. Sometimes it is.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 10, 2013 at 2:25 am
Kingston Dhasian (6/7/2013)
Do you really want to update all the 63 million rows?Check if you can put some filter to reduce the number of rows that will be affected.
I agree with Kingston here, try and minimise the data set that you are updating so that you only update those rows that have changed, something like
UPDATE
D
SET
Track_ID = P.Track_ID,
Territory_ID = P.Territory_ID,
Major = P.Major
FROM
Staging.Track_Mapping P
INNER JOIN Staging.Track_Play D
ON P.ISRC = D.ISRC_Code
AND P.Event_Date = D.Event_Date
WHERE
D.Track_ID <> P.Track_ID
OR D.Territory_ID <> P.Territory_ID
OR D.Major <> P.Major
If you have to update all 63 million rows on the table, is there a reason why you cant populate the fields when you are loading the data into the table in the first place, a simple SSIS task with a lookup should allow you to add the data in or a statement with the INNER JOIN between the SOURCE feed and the Track_Mapping table should work.
_________________________________________________________________________
SSC Guide to Posting and Best Practices
June 10, 2013 at 5:54 am
Thanks for all the help everyone. I have now found out what the user was actually trying to do and this update was to every row in the table, so I'm no longer doing and Update and using a insert into a separate table then truncating and inserting into the existing table instead.
Thanks Again.
June 11, 2013 at 7:35 am
Richie T (6/7/2013)
Hi Guys,Any Ideas what the best way would be to perform this update?
UPDATE
D
SETTrack_ID= P.Track_ID,
Territory_ID= P.Territory_ID,
Major= P.Major
FROMStaging.Track_MappingP
INNER JOINStaging.Track_PlayD
ONP.ISRC= D.ISRC_Code
ANDP.Event_Date= D.Event_Date
Both tables have around 63 million rows and have numerous indexes.
Any ideas appreciated
Thanks,
Rich
Numerous indexes are counter productive when it comes to performance of UPDATEs, because the more indexes there are, then the more data pages have to be updated and transaction logged. To facilitate this specific UPDATE, what you need is an index on ISRC_Code and Event_Date.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply