March 24, 2014 at 10:00 pm
Comments posted to this topic are about the item Using EXCEPT to Determine Row Differences
March 25, 2014 at 2:35 am
Something to note about EXCEPT is that it performs an implicit DISTINCT and does not compare duplicate rows. Any suggestions on how to get around this?
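A minimal sketch of the point above, using an in-memory SQLite database via Python as a stand-in for SQL Server (both apply the same set semantics to EXCEPT); the table and column names are made up for illustration:

```python
import sqlite3

# Demonstrate EXCEPT's implicit DISTINCT: Source holds the same row
# twice, Target holds it once, yet EXCEPT reports no difference.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Source (ID INTEGER, Item TEXT);
    CREATE TABLE Target (ID INTEGER, Item TEXT);
    INSERT INTO Source VALUES (1, 'Widget'), (1, 'Widget');
    INSERT INTO Target VALUES (1, 'Widget');
""")
rows = conn.execute(
    "SELECT ID, Item FROM Source EXCEPT SELECT ID, Item FROM Target"
).fetchall()
# Both copies in Source collapse to one distinct row, which Target
# also has, so the difference set is empty despite the count mismatch.
print(rows)  # []
```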
March 25, 2014 at 3:39 am
Where I have been using this is to keep read-only data synchronized between two areas by their primary key, so I do not currently have a scenario like this.
If I were keeping a full history of everything synchronized to date, then that might be similar to what you describe.
Could you explain further?
March 25, 2014 at 3:50 am
Hello,
The join between #Source and #Target in the SELECT ... EXCEPT is not essential.
You can obtain the same result at a lower cost:
SELECT .[ID]
, .[Item]
, .[Price]
, .[OrderDate]
, .[Units]
, .[ShipmentDate]
FROM [Source]
EXCEPT
SELECT [T].[ID]
, [T].[Item]
, [T].[Price]
, [T].[OrderDate]
, [T].[Units]
, [T].[ShipmentDate]
FROM [Target] [T]
;
Am I wrong?
Arnaud
March 25, 2014 at 3:55 am
Let's say we have a reconciliation against a 3rd party vendor to whom we submit sales.
There is a chance we can accidentally submit the same sale twice. We would then have only 1 sale in our system, but the vendor would see 2 sales. How do we use EXCEPT to reconcile these kinds of differences?
March 25, 2014 at 4:35 am
I agree with Old Hand. I tried the comparison both with INNER JOINs and without. I got the same result in both instances.
March 25, 2014 at 4:37 am
I have created the tables, inserted the values, and run the query as posted in this article, and it produces the same result as
SELECT .[ID]
, .[Item]
, .[Price]
, .[OrderDate]
, .[Units]
, .[ShipmentDate]
FROM [#Source]
EXCEPT
SELECT [T].[ID]
, [T].[Item]
, [T].[Price]
, [T].[OrderDate]
, [T].[Units]
, [T].[ShipmentDate]
FROM [#Target] [T]
I am therefore still trying to understand the purpose of the join.
March 25, 2014 at 5:06 am
The join restricts the comparison to cases where the same key values are present in both source and target, so you are looking for differences between matched rows (rather than rows present in one table and not the other).
Agreed that with the test data provided this is not very apparent. If you add a row to the source table with an ID that is not present in the Target table, you'll see the effect of the join.
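A small sketch of that effect, run against an in-memory SQLite database via Python rather than SQL Server (the toy table names and values are assumed): a plain EXCEPT returns the source-only row, while joining on the key first restricts the output to rows whose ID exists in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Source (ID INTEGER, Price REAL);
    CREATE TABLE Target (ID INTEGER, Price REAL);
    INSERT INTO Source VALUES (1, 9.99), (2, 5.00), (3, 1.00);
    INSERT INTO Target VALUES (1, 9.99), (2, 7.50);  -- no ID 3 in Target
""")
# Plain EXCEPT: returns changed rows AND the row missing from Target.
no_join = conn.execute("""
    SELECT ID, Price FROM Source
    EXCEPT
    SELECT ID, Price FROM Target
    ORDER BY ID
""").fetchall()
# Join first: only IDs present in BOTH tables are compared.
with_join = conn.execute("""
    SELECT S.ID, S.Price
    FROM Source S JOIN Target T ON T.ID = S.ID
    EXCEPT
    SELECT ID, Price FROM Target
    ORDER BY ID
""").fetchall()
print(no_join)    # [(2, 5.0), (3, 1.0)] -- changed row plus source-only row
print(with_join)  # [(2, 5.0)]           -- only the changed row
```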
March 25, 2014 at 5:07 am
You can also use UNION to do something similar. In this case, it returns both rows when there's a difference.
SELECT *
FROM (SELECT *
           , COUNT(1) OVER (PARTITION BY ID) AS versions
      FROM (SELECT *
            FROM #Source AS ss
            UNION
            SELECT *
            FROM #Target AS st
           ) x
     ) y
WHERE versions <> 1
March 25, 2014 at 5:22 am
Without the join, the EXCEPT statement would also return rows in the source that do not exist in the target, in addition to rows that exist (by ID) in both sets but differ.
What I needed was to find the differences between rows that exist in both sets.
Comparing a series of tables with 100-200+ columns column by column is asking for human error, especially when NULLs need to be taken into account.
March 25, 2014 at 5:28 am
Thank you for your explanation.
Having rows in the source that do not exist in the target, as you describe, is precisely why the join exists; the test data just does not reflect that.
March 25, 2014 at 6:20 am
In other words, the article shows how to use EXCEPT to do something really cool: compare records with matching primary keys to see if they are identical. But the example doesn't demonstrate the cool part. OK, I get it.
March 25, 2014 at 6:32 am
This is a much better approach than comparing column by column.
---------------
Mel. 😎
March 25, 2014 at 6:37 am
nick.latocha (3/25/2014)
Let's say we have a reconciliation against a 3rd party vendor to whom we submit sales. There is a chance we can accidentally submit the same sale twice. We would then have only 1 sale in our system, but the vendor would see 2 sales. How do we use EXCEPT to reconcile these kinds of differences?
When I was doing something similar (though with column-by-column compares, so not as nice as this method), and the tables I compared had a chance of duplication, I added a column that was a count of rows and did a GROUP BY on everything else. That extra 'count' column could then be compared too.
It would be interesting to hear of an easier method, but that's what I came up with!
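A sketch of that row-count idea, using an in-memory SQLite database via Python instead of SQL Server (the sales tables and values are invented for the example): collapse each side with GROUP BY and carry a COUNT(*) column, so a sale submitted twice no longer matches a sale recorded once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OurSales    (SaleID INTEGER, Amount REAL);
    CREATE TABLE VendorSales (SaleID INTEGER, Amount REAL);
    INSERT INTO OurSales    VALUES (100, 25.00);
    -- the same sale was accidentally submitted to the vendor twice
    INSERT INTO VendorSales VALUES (100, 25.00), (100, 25.00);
""")
diff = conn.execute("""
    SELECT SaleID, Amount, COUNT(*) AS copies
    FROM OurSales GROUP BY SaleID, Amount
    EXCEPT
    SELECT SaleID, Amount, COUNT(*) AS copies
    FROM VendorSales GROUP BY SaleID, Amount
""").fetchall()
# Our side groups to (100, 25.0, 1); the vendor side to (100, 25.0, 2).
# The count column exposes the duplicate that plain EXCEPT would miss.
print(diff)  # [(100, 25.0, 1)]
```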
March 25, 2014 at 6:50 am
Nice article Douglas. I thought you were crazy until I tried your code against what I had in my head. Pretty cool! I am faced with a nearly identical challenge today. Think I'll give it a try.
Chris Umbaugh
Data Warehouse / Business Intelligence Consultant
twitter @ToledoSQL