June 15, 2014 at 8:14 pm
I am implementing Type 1 SCD - Update changed attributes and disregard older attributes.
Load new data to DB-->Populate the STG table(Derive a flag based on data) --> update the dbo.customerdetails table if the values are different for an existing customer
The update is taking for around 150 minutes to process 300,000 records to dbo.customerdetails table. i did tried disabling the indexes but it didnt help. Also i have attached the execution plan..
FYI. The dbo.customerdetails tables has around 3 million records and the processing does happen in batches.
UPDATE cd
SET cd.DetailID = T1.DetailID ,
cd.DetailValue = T1.DetailValue ,
cd.DetailTag = T1.DetailTag ,
cd.Batch = T1.Batch ,
cd.updateddate = 'Jun 16 2014 10:26AM'
FROM STG.CustomerDetails T1
INNER JOIN dbo.Customer C ON T1.RowID = c.RowID
AND T1.TypeID = c.TypeID
INNER JOIN dbo.CustomerDetails CD ON c.CustomerID = cd.CustomerID
AND ( cd.DetailID = T1.DetailID )
AND cd.DateTo IS NULL
WHERE T1.Batch = 11
AND ( ISNULL(cd.DetailID, 0) != ISNULL(T1.DetailID, 0)
OR ISNULL(cd.DetailValue, 0) != ISNULL(T1.DetailValue, 0)
OR ISNULL(cd.DetailTag, '') != ISNULL(T1.DetailTag, '')
)
Thanks,
Chinna
Its the Journey which gives you Happiness not the Destination-- Dan Millman
June 16, 2014 at 12:28 am
Are you updating a view?
There is a stream aggregate which suggests some grouping going on somewhere. You'll need to attach the actual plan as a *.sqlplan to properly interpret it.
June 16, 2014 at 12:33 am
There is a view which is built on dbo.customerdetail table.. Also attached the actual plan.
Thanks,
Chinna
Its the Journey which gives you Happiness not the Destination-- Dan Millman
June 16, 2014 at 3:07 am
Your query has an inner join between tables STG.personattribute (stg) and GDW.personattribute (tar)
which includes this predicate: Tar.AttributeID = stg.AttributeID.
The WHERE clause includes this predicate: ISNULL(Tar.AttributeID, 0) != ISNULL(stg.AttributeID, 0),
which fortunately is ORred or you wouldn't get any results at all. However, it's sufficient "red flag dodgy coding"
to put the query back into the test stream rather than the optimisation stream.
Other points to note:
The estimated plan is of limited use for optimisation studies.
Using functions - including ISNULL - on columns in joins or the WHERE clause is likely to impact performance.
Table GDW.Person is joined to both tar and stg. Without the rowcounts available from an "actual" execution plan
it's difficult to tell, but this could be a significant bottleneck.
The plan includes a key lookup to fetch AttributeValue and AttributeTag from tar. Consider adding these two columns
to the INCLUDE column clause of index UQ__PersonAttribute__4F1CD4FC.
The table which is the update target should ideally be the first table in the FROM list, like this:
UPDATE tar
SET AttributeID = stg.AttributeID ,
AttributeValue = stg.AttributeValue ,
AttributeTag = stg.AttributeTag ,
BatchFileID = stg.BatchFileID ,
DateUpdated = 'Jun 16 2014 10:26AM'
FROM GDW.personattribute tar
INNER JOIN STG.personattribute stg
ON tar.AttributeID = stg.AttributeID
AND tar.DateTo IS NULL
INNER JOIN GDW.Person p
ON p.SourceRowID = stg.SourceRowID
AND p.SourceID = stg.SourceID
and p.PersonID = tar.PersonID
WHERE stg.BatchFileID = 5502
AND ( ISNULL(Tar.AttributeID, 0) != ISNULL(stg.AttributeID, 0)
OR ISNULL(Tar.AttributeValue, 0) != ISNULL(stg.AttributeValue, 0)
OR ISNULL(Tar.AttributeTag, '') != ISNULL(stg.AttributeTag, '')
)
Try optimising this query in two phases, beginning with the SELECT equivalent of the UPDATE:
SELECT
tar.AttributeID, stg.AttributeID ,
tar.AttributeValue, stg.AttributeValue ,
tar.AttributeTag, stg.AttributeTag ,
tar.BatchFileID, stg.BatchFileID ,
tar.DateUpdated
FROM GDW.personattribute tar
INNER JOIN STG.personattribute stg
ON tar.AttributeID = stg.AttributeID
AND tar.DateTo IS NULL
INNER JOIN GDW.Person p
ON p.SourceRowID = stg.SourceRowID
AND p.SourceID = stg.SourceID
and p.PersonID = tar.PersonID
WHERE stg.BatchFileID = 5502
AND (
ISNULL(Tar.AttributeID, 0) != ISNULL(stg.AttributeID, 0) OR
ISNULL(Tar.AttributeValue, 0) != ISNULL(stg.AttributeValue, 0) OR
ISNULL(Tar.AttributeTag, '') != ISNULL(stg.AttributeTag, '')
)
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply