June 23, 2016 at 4:09 pm
Hello,
I am trying to merge a temp table (@TempTable) into a table(@MasterTable). I have 6 fields I need to compare if any of them have changed I need to update the table(@MasterTable).
Fields:
ID (I have a 1 to 1 database, so this ID will only exist once)
Event_Name
[EVENT]
Client_Name
Servicer_Code
Servicer_Name
MERGE INTO MasterTable as Target
USING (SELECT ID, Event_Name, [EVENT], Client_Name, Servicer_Code, Servicer_Name FROM @tempTable) as Source
ON (Target.ID = Source.ID)
WHEN Matched AND Target.Event_Name <> Source.Event_name OR Target.[EVENT] <> Source.[EVENT]
OR Target.Client_Name <> Source.Client_Name OR Target.Servicer_Code <> Source.Servicer_Code
OR Target.Servicer_Name <> Source.Servicer_Name
THEN
UPDATE SET Target.Event_Name = Source.Event_Name and Target.[EVENT] = Source.[EVENT]
AND Target.Client_Name = Source.Client_Name AND Target.Servicer_Code = Source.Servicer_Code
AND Target.Servicer_Name = Source.Servicer_Name
WHEN NOT MATCHED BY TARGET THEN
INSERT (ID, Event_Name, [EVENT],Client_Name, Servicer_Code, Servicer_Name)
values (Source.ID, Source.Event_Name, Source.[EVENT],Source.Client_Name, Source.Servicer_Code, Source.Servicer_Name)
If the ID matches then I need it to compare the other 5 fields, if any of them have changed, update the record. If ID does not exist, insert.
David92595
June 23, 2016 at 9:52 pm
Why do you need MERGE?
update T
set Event_Name = S.Event_Name ,
Client_Name = S.Client_Name,
[EVENT] = S.[EVENT],
Client_Name = S.Client_Name,
Servicer_Code = S.Servicer_Code,
Servicer_Name = S.Servicer_Name
FROM MasterTable T
INNER JOIN @tempTable S ON T.ID = S.ID
WHERE Target.Event_Name <> Source.Event_name
OR Target.[EVENT] <> Source.[EVENT]
OR Target.Client_Name <> Source.Client_Name
OR Target.Servicer_Code <> Source.Servicer_Code
OR Target.Servicer_Name <> Source.Servicer_Name
INSERT INTO MasterTable (Event_Name, [EVENT], Client_Name, Servicer_Code, Servicer_Name)
SELECT Event_Name, [EVENT], Client_Name, Servicer_Code, Servicer_Name
FROM @tempTable S
WHERE NOT EXISTS (select * from MasterTable T
WHERE T.ID = S.ID )
_____________
Code for TallyGenerator
June 24, 2016 at 2:24 am
Wow, way over thought that one...
Thank you!
June 24, 2016 at 4:18 am
Sergiy (6/23/2016)
Why do you need MERGE?
Don't know, maybe because it is easier to read (once you've gotten used to the merge syntax), does the same thing in a single statement and performs better? 😉
WITH cteSource as (
SELECT ID, Event_Name, [EVENT], Client_Name, Servicer_Code, Servicer_Name
FROM @tempTable
),
cteTarget as (
SELECT ID, Event_Name, [EVENT], Client_Name, Servicer_Code, Servicer_Name
FROM MasterTable mt
WHERE EXISTS (
SELECT *
FROM @temptable t
WHERE t.ID = mt.ID
)
)
MERGE INTO cteTarget t
USING cteSource s
ON (s.ID = t.ID)
WHEN Matched AND (
t.Event_Name <> s.Event_name
OR t.[EVENT] <> s.[EVENT]
OR t.Client_Name <> s.Client_Name
OR t.Servicer_Code <> s.Servicer_Code
OR t.Servicer_Name <> s.Servicer_Name
)
THEN
UPDATE
SET
t.Event_Name = s.Event_Name,
t.[EVENT] = s.[EVENT],
t.Client_Name = s.Client_Name,
t.Servicer_Code = s.Servicer_Code,
t.Servicer_Name = s.Servicer_Name
WHEN NOT MATCHED BY TARGET
THEN
INSERT (ID, Event_Name, [EVENT],Client_Name, Servicer_Code, Servicer_Name)
VALUES (s.ID, s.Event_Name, s.[EVENT],s.Client_Name, s.Servicer_Code, s.Servicer_Name);
June 24, 2016 at 6:26 am
UPDATE is probably better than MERGE.
If you are a fan of MERGE, please take a read of this[/url] and decide whether, perhaps, you should reconsider.
And rather than using this type of construction:
UPDATE
...
WHERE
a.1 <> b.1 AND
a.2 <> b.2
I would suggest using
UPDATE
WHERE NOT EXISTS (
SELECT a.1, a.2
INTERSECT
SELECT b.1, b.2
)
because the latter version handles NULLs.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
June 24, 2016 at 8:10 am
Phil Parkin (6/24/2016)
UPDATE is probably better than MERGE.If you are a fan of MERGE, please take a read of this[/url] and decide whether, perhaps, you should reconsider.
And rather than using this type of construction:
UPDATE
...
WHERE
a.1 <> b.1 AND
a.2 <> b.2
I would suggest using
UPDATE
WHERE NOT EXISTS (
SELECT a.1, a.2
INTERSECT
SELECT b.1, b.2
)
because the latter version handles NULLs.
I prefer using EXCEPT, simply because it closer to the way I, and probably most other people, think about this.
;
WITH Source AS (
SELECT ID, Event_Name, [EVENT], Client_Name, Servicer_Code, Servicer_Name
FROM @tempTable
EXCEPT
SELECT ID, Event_Name, [EVENT], Client_Name, Servicer_Code, Servicer_Name
FROM MasterTable
)
<your merge or update statement here>
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
June 24, 2016 at 8:25 am
Paul White suggests that the NOT EXISTS ... INTERSECT form produces better query plans than the EXCEPT form, and that's why I use it. See here (in the comments).
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
June 24, 2016 at 10:18 am
Thanks guys
As a work around I was comparing one field at a time. This knocked 200 lines of code out of my report and cut execution time in half!
June 24, 2016 at 12:18 pm
Phil Parkin (6/24/2016)
Paul White suggests that the NOT EXISTS ... INTERSECT form produces better query plans than the EXCEPT form, and that's why I use it. See here (in the comments).
Further down in those same comments he says that you can use EXCEPT when you are not interested in both sides, which is the case here. We don't care about records that exist in the target, but not in the source in this particular case. The NOT EXISTS ... INTERSECT version will produce those records (depending on the initial join type), whereas the EXCEPT version will not.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
June 26, 2016 at 9:41 pm
Phil Parkin (6/24/2016)
Paul White suggests that the NOT EXISTS ... INTERSECT form produces better query plans than the EXCEPT form, and that's why I use it. See here (in the comments).
No, it does not.
Before repeating after Paul White try to reproduce it yourself.
And don't be afraid to deviate a bit from the scripts provided by Paul.
The problem with Paul's suggestions is that he does not stop before skewing his test scripts to make the output fitting his agenda.
In the blog post you quoted he's provided all sorts of "alternative to INTERSECT" forms of queries, except one.
The one which is most obvious and is most commonly used for this kind of checks:
SELECT
*
FROM @Set1 AS t
JOIN @Set2 AS s ON
t.pk = s.pk
WHERE
NOT EXISTS
(
SELECT 1
WHERE
t.pk = s.pk
AND (t.ival = s.ival OR (t.ival IS NULL AND s.ival IS NULL))
AND (t.cval = s.cval OR (t.cval IS NULL AND s.cval IS NULL))
AND (t.mval = s.mval OR (t.mval IS NULL AND s.mval IS NULL))
)
Can you guess why?
Can you suspect he does not (did not at the time) know about this form of query?
Oh, no, you would not discredit him as an SQL guru. He certainly knows that.
But with this query in the set the whole point of using INTERSECT would disappear: it produces exactly the same plan, exactly the same output and, of course, exactly the same execution statistics.
So, NOT EXISTS ... INTERSECT is not any better than NOT EXISTS (... Equal... )
Apart from probably the shorter code.
But even the code would be of the same length if we'd set ANSI NULLS OFF:
SET ANSI_NULLS OFF
SELECT
*
FROM @Set1 AS t
JOIN @Set2 AS s ON
t.pk = s.pk
WHERE
NOT EXISTS
(
SELECT 1
WHERE
t.pk = s.pk
AND (t.ival = s.ival)
AND (t.cval = s.cval)
AND (t.mval = s.mval)
)
Which we cannot do because this setting is being deprecated and to be strictly blocked in newer versions of SQL Server.
So, the only thing which is different about INTERSECT that in an environment where ANSI NULLS standard is enforced it provides a "back door" for those developers who strongly believe that NULL=NULL.
Well, some would consider it as an advantage, I would not do it.
That's about "JOIN" kind of queries Paul was using in his test scripts.
But it applies strictly to "update" queries.
_____________
Code for TallyGenerator
June 26, 2016 at 9:58 pm
But what if we need to identify possibly new records added to the set?
Then we need to use a bit different form of the queries:
SET STATISTICS IO, TIME, PROFILE ON
SELECT *
FROM @Set1 AS t
WHERE
NOT EXISTS (SELECT s.* FROM @Set2 AS s INTERSECT SELECT t.*)
SELECT *
FROM @Set1 AS t
WHERE
NOT EXISTS
(
SELECT 1
FROM @Set2 s
WHERE
t.pk = s.pk
AND (t.ival = s.ival OR (t.ival IS NULL AND s.ival IS NULL))
AND (t.cval = s.cval OR (t.cval IS NULL AND s.cval IS NULL))
AND (t.mval = s.mval OR (t.mval IS NULL AND s.mval IS NULL))
)
SELECT *
FROM @Set1 AS t
EXCEPT
SELECT *
FROM @Set2 s
SET STATISTICS IO, TIME, PROFILE OFF
If you run it you'll see:
- all 3 queries generate exactly the same output;
- all 3 queries result in exactly the same IO stats;
- time stats are different from time to time, and there is no consistent winner or loser amongst these 3 queries;
- execution plans are almost identical for all 3 queries, except INTERSECT query has an extra TOP 1 operation in it. It may be still below 1% overhead, but anyway - it's there. An overhead.
RowsExecutesStmtTextStmtIdNodeIdParentPhysicalOpLogicalOpArgumentDefinedValuesEstimateRowsEstimateIOEstimateCPUAvgRowSizeTotalSubtreeCostOutputListWarningsTypeParallelEstimateExecutions
41SELECT *
FROM @Set1 AS t
WHERE
NOT EXISTS (SELECT s.* FROM @Set2 AS s INTERSECT SELECT t.*)1110NULLNULLNULLNULL1NULLNULLNULL0.00657166NULLNULLSELECT0NULL
41 |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([t].[pk], [t].[ival], [t].[cval], [t].[mval]))1121Nested LoopsLeft Anti Semi JoinOUTER REFERENCES:([t].[pk], [t].[ival], [t].[cval], [t].[mval])NULL104.18E-06280.00657166[t].[pk], [t].[ival], [t].[cval], [t].[mval]NULLPLAN_ROW01
61 |--Index Scan(OBJECT:(@Set1 AS [t]))1132Index ScanIndex ScanOBJECT:(@Set1 AS [t])[t].[pk], [t].[ival], [t].[cval], [t].[mval]10.0031250.0001581280.0032831[t].[pk], [t].[ival], [t].[cval], [t].[mval]NULLPLAN_ROW01
26 |--Top(TOP EXPRESSION:((1)))1142TopTopTOP EXPRESSION:((1))NULL101E-0790.00328438NULLNULLPLAN_ROW01
26 |--Clustered Index Seek(OBJECT:(@Set2 AS ), SEEK:(.[pk]=@Set1.[pk] as [t].[pk]), WHERE:(@Set2.[ival] as .[ival] = @Set1.[ival] as [t].[ival] AND @Set2.[mval] as .[mval] = @Set1.[mval] as [t].[mval] AND @Set2.[cval] as .[cval] = @Set1.[cval] as [t].[cval]) ORDERED FORWARD)1154Clustered Index SeekClustered Index SeekOBJECT:(@Set2 AS ), SEEK:(.[pk]=@Set1.[pk] as [t].[pk]), WHERE:(@Set2.[ival] as .[ival] = @Set1.[ival] as [t].[ival] AND @Set2.[mval] as .[mval] = @Set1.[mval] as [t].[mval] AND @Set2.[cval] as .[cval] = @Set1.[cval] as [t].[cval]) ORDERED FORWARDNULL10.0031250.0001581200.0032831NULLNULLPLAN_ROW01
RowsExecutesStmtTextStmtIdNodeIdParentPhysicalOpLogicalOpArgumentDefinedValuesEstimateRowsEstimateIOEstimateCPUAvgRowSizeTotalSubtreeCostOutputListWarningsTypeParallelEstimateExecutions
41SELECT *
FROM @Set1 AS t
WHERE
NOT EXISTS
(
SELECT 1
FROM @Set2 s
WHERE
t.pk = s.pk
AND (t.ival = s.ival OR (t.ival IS NULL AND s.ival IS NULL))
AND (t.cval = s.cval OR (t.cval IS NULL AND s.cval IS NULL))
AND (t.mval = s.mval OR (t.mval IS NULL AND s.mval IS NULL))
)1310NULLNULLNULLNULL1NULLNULLNULL0.00657156NULLNULLSELECT0NULL
41 |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([t].[pk], [t].[ival], [t].[cval], [t].[mval]))1321Nested LoopsLeft Anti Semi JoinOUTER REFERENCES:([t].[pk], [t].[ival], [t].[cval], [t].[mval])NULL104.18E-06280.00657156[t].[pk], [t].[ival], [t].[cval], [t].[mval]NULLPLAN_ROW01
61 |--Index Scan(OBJECT:(@Set1 AS [t]))1332Index ScanIndex ScanOBJECT:(@Set1 AS [t])[t].[pk], [t].[ival], [t].[cval], [t].[mval]10.0031250.0001581280.0032831[t].[pk], [t].[ival], [t].[cval], [t].[mval]NULLPLAN_ROW01
26 |--Clustered Index Seek(OBJECT:(@Set2 AS ), SEEK:(.[pk]=@Set1.[pk] as [t].[pk]), WHERE:(@Set1.[ival] as [t].[ival] = @Set2.[ival] as .[ival] AND @Set1.[mval] as [t].[mval] = @Set2.[mval] as .[mval] AND @Set1.[cval] as [t].[cval] = @Set2.[cval] as .[cval]) ORDERED FORWARD)1342Clustered Index SeekClustered Index SeekOBJECT:(@Set2 AS ), SEEK:(.[pk]=@Set1.[pk] as [t].[pk]), WHERE:(@Set1.[ival] as [t].[ival] = @Set2.[ival] as .[ival] AND @Set1.[mval] as [t].[mval] = @Set2.[mval] as .[mval] AND @Set1.[cval] as [t].[cval] = @Set2.[cval] as .[cval]) ORDERED FORWARDNULL10.0031250.0001581200.0032831NULLNULLPLAN_ROW01
RowsExecutesStmtTextStmtIdNodeIdParentPhysicalOpLogicalOpArgumentDefinedValuesEstimateRowsEstimateIOEstimateCPUAvgRowSizeTotalSubtreeCostOutputListWarningsTypeParallelEstimateExecutions
41SELECT *
FROM @Set1 AS t
EXCEPT
SELECT *
FROM @Set2 s1510NULLNULLNULLNULL1NULLNULLNULL0.00657156NULLNULLSELECT0NULL
41 |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([t].[pk], [t].[ival], [t].[cval], [t].[mval]))1521Nested LoopsLeft Anti Semi JoinOUTER REFERENCES:([t].[pk], [t].[ival], [t].[cval], [t].[mval])NULL104.18E-06280.00657156[t].[pk], [t].[ival], [t].[cval], [t].[mval]NULLPLAN_ROW01
61 |--Index Scan(OBJECT:(@Set1 AS [t]))1532Index ScanIndex ScanOBJECT:(@Set1 AS [t])[t].[pk], [t].[ival], [t].[cval], [t].[mval]10.0031250.0001581280.0032831[t].[pk], [t].[ival], [t].[cval], [t].[mval]NULLPLAN_ROW01
26 |--Clustered Index Seek(OBJECT:(@Set2 AS ), SEEK:(.[pk]=@Set1.[pk] as [t].[pk]), WHERE:(@Set1.[ival] as [t].[ival] = @Set2.[ival] as .[ival] AND @Set1.[mval] as [t].[mval] = @Set2.[mval] as .[mval] AND @Set1.[cval] as [t].[cval] = @Set2.[cval] as .[cval]) ORDERED FORWARD)1542Clustered Index SeekClustered Index SeekOBJECT:(@Set2 AS ), SEEK:(.[pk]=@Set1.[pk] as [t].[pk]), WHERE:(@Set1.[ival] as [t].[ival] = @Set2.[ival] as .[ival] AND @Set1.[mval] as [t].[mval] = @Set2.[mval] as .[mval] AND @Set1.[cval] as [t].[cval] = @Set2.[cval] as .[cval]) ORDERED FORWARDNULL10.0031250.0001581200.0032831NULLNULLPLAN_ROW01
I also tested with several extra rows added to table @Set1:
INSERT @Set1
(pk, ival, cval, mval)
VALUES
(1, 1000, 'a', $1),
(2, NULL, 'b', $2),
(3, 3000, 'c', NULL),
(4, NULL, 'd', $4),
(5, 5000, 'e', $5),
(6, 6000, 'e', $6),
(7, 7000, 'e', $7),
(8, 8000, 'e', $8),
(9, 9000, 'e', $9),
(10, 10000, 'e', $10),
(11, 11000, 'e', $11),
(12, 12000, 'e', $12);
;
Conclusion remains the same:
1. NOT EXISTS ... INTERSECT does not provide any advantage comparing to NOT EXISTS ...EQUAL or EXCEPT forms of queries;
2. It adds a small overhead of an additional TOP 1 operation when looking for newly added rows.
Which is a bit different from what Paul has come up with.
You may with to run those tests for yourself.
See which conclusions you'll support afterwards.
_____________
Code for TallyGenerator
June 28, 2016 at 10:36 am
Sergiy, thank you for taking the time to write these informative posts.
I admit that I am guilty of following Paul White's recommendations as would a sheep and now I shall be looking at things more closely, next time I have to develop code that does this sort of thing.
I still prefer the EXCEPT or INTERSECT versions, because of their inherent handling of NULLs & agree that the EXCEPT form is possibly more pleasing on the brain than the NOT EXISTS ... INTERSECT form.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
June 28, 2016 at 9:13 pm
Phil Parkin (6/28/2016)
I still prefer the EXCEPT or INTERSECT versions, because of their inherent handling of NULLs & agree that the EXCEPT form is possibly more pleasing on the brain than the NOT EXISTS ... INTERSECT form.
That's your personal preferences, and you have every right to have it and follow it (unless you're in North Korea :-))
Functionally all 3 methods are the same, so it's other factors which play role in choosing one method over another.
Like with CTE - I prefer to use derived tables, some prefer CTE's, and neither side has a case to prove which method is better.
_____________
Code for TallyGenerator
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply