October 18, 2018 at 9:06 pm
allan.ford17 - Thursday, October 18, 2018 6:11 PMI wanted a process that does absolute minimal DB transactions ... i.e. there may be a table row level trigger than logs transactions for other integration purposes. There may also be child records in child and grandchildren tables ...
Then this should work for you.
1. Delete obsolete records:DELETE T2
FROM dbo.Table_2 T2
WHERE NOT EXISTS (select * from dbo.Table_1 T1
where T2.ID_COL = T1.ID_COL)
2. Update records with ID's present in both tables and having some differences:UPDATE T2
SET NAME_VAL1 = T2.NAME_VAL1,
DATE_VAL1 = T2.DATE_VAL1,
NUMBER_VAL1 = T2.NUMBER_VAL1
FROM dbo.Table_2 T2
INNER JOIN dbo.Table_1 T1 ON T2.ID_COL = T1.ID_COL
WHERE not (
(T1.NAME_VAL1 = T2.NAME_VAL1 or (T1.NAME_VAL1 is null and T2.NAME_VAL1 is null ))
and (T1.DATE_VAL1 = T2.DATE_VAL1 or (T1.DATE_VAL1 is null and T2.DATE_VAL1 is null ))
and (T1.NUMBER_VAL1 = T2.NUMBER_VAL1 or (T1.NUMBER_VAL1 is null and T2.NUMBER_VAL1 is null ))
)
This script will cause clustered index scan, so it's not too effective.
But it cannot be improved with existing indexing.
With additional indexing in place some more careful strategies may be chosen.
But again - it pays only on big data sets.
3. Add new records:Insert dbo.Table_2
(ID_COL,NAME_VAL1,DATE_VAL1,NUMBER_VAL1)
SELECT ID_COL,NAME_VAL1,DATE_VAL1,NUMBER_VAL1
FROM dbo.Table_1 T1
WHERE NOT EXISTS (select * from dbo.Table_2 T2
where T2.ID_COL = T1.ID_COL)
When I measured performance with SET STATISTICS TIME ON the speed difference was > 100 times.
But it's not an accurate measure, because statistics collected on every statement in every loop take time for themselves.
And on such a small data set my 3 statements take < 1 mcs to complete
You must have bigger sets - try to execute it like this:declare @TimeStamp datetime2
select * from dbo.Table_1;
select * from dbo.Table_2;
select @TimeStamp = SYSDATETIME()
DECLARE @return_value int
EXEC @return_value = [dbo].[TablesSyncByCompare]
select [Execution time - mcs] = datediff(mcs, @TimeStamp, sysdatetime() )
SELECT 'Return Value' = @return_value
and compare to this:
declare @TimeStamp datetime2
select * from dbo.Table_1;
select * from dbo.Table_2;
select @TimeStamp = SYSDATETIME()
DELETE T2
...
UPDATE T2
....
Insert dbo.Table_2
....
select [Execution time - mcs] = datediff(mcs, @TimeStamp, sysdatetime() )
_____________
Code for TallyGenerator
October 19, 2018 at 1:22 am
Thanks Sergiy !!
I will give this a try too !
Good approach .. Could adapt for multi column primary key also .. (e.g. concat the key values into a single value perhaps ..)
One reason for wanting to process RBAR (Row by agonising row) might be to handle / capture errors that occur.
e.g. bad data ... data that doesn't meet foreign key constraints or other constraints ..
May want to capture the bad data but process as much good data as available.
In some cases it is OK to assume all the data is good and go with a process that does not have error handling ..
cheers, thanks,
Allan.
October 19, 2018 at 6:22 am
allan.ford17 - Friday, October 19, 2018 1:22 AMThanks Sergiy !!One reason for wanting to process RBAR (Row by agonising row) might be to handle / capture errors that occur.
e.g. bad data ... data that doesn't meet foreign key constraints or other constraints ..
May want to capture the bad data but process as much good data as available.
In some cases it is OK to assume all the data is good and go with a process that does not have error handling ..
Programming by exception is not the greatest idea for database environments.
There is no user to action on an exception, so you must pre-program all the actions in all possible scenarios.
And if you know all the error scenarios and all corresponding actions - there is no need to wait for exceptions.
Check for bad data before uploading and exclude it from the set (reporting all the violation reasons on the way).
OUTPUT clause makes the task really easy.
DELETE T2
OUTPUT {list of columns}, 'Violation of FK to [Look-up Table]' Reason
INTO [Audit Table]
FROM dbo.Table_2 T2
WHERE NOT EXISTS (select * from [Look-up table] LT
where Lt.PK_Col = T2.FK_Col
)
Proceed with upload only after the source data have been cleaned up.
_____________
Code for TallyGenerator
Viewing 3 posts - 16 through 17 (of 17 total)
You must be logged in to reply to this topic. Login to reply