January 12, 2015 at 12:29 pm
In most cases where we have to find delta the source and target tables aren't in same database often on different databases. Will Except perform same way if the source query have a link server ? And how would we do this in SSIS ?
January 13, 2015 at 10:54 am
What about using TABLEDIFF.exe? BOL: http://msdn.microsoft.com/en-us/library/ms162843.aspx
The necessary command and parameters can easily be created programmatically with a simple T-SQL script. TABLEDIFF can also generate the statements to bring the destination table into convergence with the source table with the -f parameter.
For example:
tablediff.exe -sourceserver . -sourcedatabase Credit -sourceschema dbo -sourcetable Source -destinationserver . -destinationdatabase Credit -destinationschema dbo -destinationtable Target -f C:\TEMP\Target.sql
In this example:
- the comparison occurs on the same SQL Server instance between two tables, Source and Target, in the Credit database. Tablediff can compare between the same database or different databases on the same instance or different instances as defined in the respective parameters
- -f generates the fix file to bring the destination table into convergence with the source table
Yes there are some limitations:
- can't be used with non-SQL servers
- sql_variant datatype is not supported
- allowable mappings between source and destination datatypes, see BOL link above
- the source table must have a primary key, -strict parameter requires the same in the destination table
- the script file to bring the destination table into convergence will not include the following datatypes:
varchar(max)
nvarchar(max)
varbinary(max)
timestamp
xml
text
ntext
image
I used this just last week to fix an accidental change of production data. Luckily the change was on the primary server in a log shipping setup. The t-log backups are only applied on the logged shipped server early in the morning daily. Using TABLEDIFF allowed me to bring the database on the primary server into convergence with the log-shipped standby database.
January 13, 2015 at 5:26 pm
We had this situation comparing two copies of a history table. We had to compare the natural key less than history_id where the natural key was not unique. My solution was to generate a sequence number using the row aggregate function with the sequence order determined by either the history_id or a timestamp. The comparison then included all columns except the history_id and including the sequence number. EXCEPT is great.
January 13, 2015 at 5:28 pm
I found that Varchar(Max) and blob type values had to be specially handled. Our solution was to only compare the first 8000 characters. Also timestamp columns need different handling because the two copies of the table used a different SQL-Server ODBC driver so rounded differently.
January 14, 2015 at 8:26 am
Is EXCEPT better, or worse, than using:
Select t1.col1, t1.col2, t1.col3
from table1 t1
where t1.col1 not in (select t2.col1 from table2 t2
where t2.col1 = t1.col1
and t2.col2 = t1.col2
and t2.col3 = t1.col3
)
Wallace Houston
Sunnyland Farms, Inc.
"We must endeavor to persevere."
January 14, 2015 at 10:57 am
EXCEPT is much faster and finds differences in the attributes not just differences in which primary keys exist.
One thing I forgot to note. To find the differences between two sets the EXCEPT needs to be performed both ways with results union-ed because either set may contain rows that the other does not.
That is something like:
Select *
From
(
select 'InA-NotB' As
, <all columns to compare>
from
(
Select a.<all columns to compare>, <logic to add sequence number>
From table1 As a
Except
Select b.<all columns to compare>, <logic to add sequence number>
From table2 As b
) As x
) As [InA-NotB]
Union All
(
select 'InB-NotA' As
, <all columns to compare>
from
(
Select b.<all columns to compare>, <logic to add sequence number>
From table2 As b
Except
Select a.<all columns to compare>, <logic to add sequence number>
From table1 As a
) As y
) As [InB-NotA]
) As [Diff]
order by <desired order>
January 18, 2015 at 8:35 pm
Hi and thanks for the article.
I also use EXCEPT to investigate why two queries don't return the same amount of rows.
with somerows (ListOfColumns)
(
/// first query
Select .......
)
, AllRows (ListOfColumns)
(
//Second query that returns more results than the first one
Select ......
)
SELECT ListOfColumns FROM AllRows EXCEPT SELECT ListOfColumns FROM SomeRows
of course ListOfColumns must be the same list 🙂
Thanks
jose
March 10, 2015 at 12:48 pm
EXCEPT is awesome for Row Differences between 2 sets, but I would caution that if you have a pk (like maybe in comparing objects between a publisher and subscriber in replication) and need to find rows existing in 1 set but not the other, EXCEPT will not be the best performance option, because it must generate both sets before describing the difference.
Alternatively, NOT EXISTS will yield a much speedier result across the pk:
--Rows in table A, not in table B
SELECT
A.*
FROM [dbo].[SomeTable] A
WHERE NOT EXISTS
(
SELECT
TOP 1 1
FROM [dbo].[SomeOtherTable] B
WHERE B.[PK] = A.[PK]
)
Josh Lewis
July 14, 2015 at 3:50 pm
Exactly what I needed to complete my project. Thank you very much!
February 17, 2017 at 11:31 am
enrique.pessoa - Friday, January 9, 2015 6:53 AMRegarding the query optimization, only the first join is needed, it ensures you are only taking the Source items that are also found on Target. The second one is not necessary.Actually, on this situation you could even avoid using joins and table aliases:SELECT ID , [Item] , [Price] , [OrderDate] , [Units] , [ShipmentDate] FROM [#Source] WHERE [ID] IN (SELECT [ID] FROM #Target)EXCEPTSELECT [ID] , [Item] , [Price] , [OrderDate] , [Units] , [ShipmentDate] FROM [#Target]
I second this approach, although I think I'd use a correlated subquery. It just is more self-documenting as to what's intended. The join(s) alone, IMO, somewhat conceals the intent of only finding unmatched data which matches on primary key. Maybe it's just because Enrique's suggestion "reads" more like a sentence to me:
SELECT .[ID]
, .[Item]
, .[Price]
, .[OrderDate]
, .[Units]
, .[ShipmentDate]
FROM [#Source]
WHERE EXISTS (SELECT *
FROM [#Target] [T]
WHERE .[ID] = [T].[ID])
EXCEPT
SELECT [T].[ID]
, [T].[Item]
, [T].[Price]
, [T].[OrderDate]
, [T].[Units]
, [T].[ShipmentDate]
FROM [#Target] [T];
The following would be even more self-documenting, but, it would require additional overhead in running an unnecessary EXISTS test in the second query:
SELECT
.[ID]
, .[Item]
, .[Price]
, .[OrderDate]
, .[Units]
, .[ShipmentDate]
FROM [#Source]
WHERE EXISTS (SELECT *
FROM [#Target] [T]
WHERE .[ID] = [T].[ID])
EXCEPT
SELECT [T].[ID]
, [T].[Item]
, [T].[Price]
, [T].[OrderDate]
, [T].[Units]
, [T].[ShipmentDate]
FROM [#Target] [T];
WHERE EXISTS (SELECT *
FROM [#Source]
WHERE [T].[ID] = .[ID]);
--=Chuck
February 17, 2017 at 12:58 pm
If you accept to have differences in one row you could do something like:SELECT T.*, S.*
FROM #Targer AS T
INNER JOIN #Source AS S
WHERE EXISTS (
SELECT T.*
EXCEPT
SELECT S.*
);
or equivalently:SELECT T.*, S.*
FROM #Targer AS T
INNER JOIN #Source AS S
WHERE NOT EXISTS (
SELECT T.*
INTERSECT
SELECT S.*
)
/*I used asterisk to shorten the code but you would likely use column lists instead.*/
March 20, 2018 at 5:30 pm
Thank you for this informative article!
I have tested this out on one of my own and would like know how to take the results and use them in an update or merge statement in sql server.
I am trying to update records that have changed or are new in a table from a separate temp table that i have generated from an xml file.
So the tables are identical in structure but one has changed or new data.
Any updates i have tried have changed all of the records with a matching "Client_Rec_Id"
Any help would be appreciated as i have done quite a bit of searching with no luck.
cheers Pete
select tmp.Client_Rec_Id, tmp.PackageName, tmp.SectionName, tmp.[Name], tmp.ServiceFlag, tmp.Amount, tmp.ProductCategory, tmp.ProductSubCategory, tmp.ProductName, tmp.ProductDescription, tmp.ProductSupplier, tmp.PrePaidPN, tmp.ContractLastModified FROM tmpContractLineAmounts tmp
inner join ContractLineAmounts cla
on tmp.Client_Rec_Id = cla.Client_Rec_Id
except
select cla.Client_Rec_Id, cla.PackageName, cla.SectionName, cla.[Name], cla.ServiceFlag, cla.Amount, cla.ProductCategory, cla.ProductSubCategory, cla.ProductName, cla.ProductDescription, cla.ProductSupplier, cla.PrePaidPN, cla.ContractLastModified FROM ContractLineAmounts cla
inner join tmpContractLineAmounts tmp
on cla.Client_Rec_Id = tmp.Client_Rec_Id
March 22, 2018 at 7:23 am
I have a similar process where I have to determine if fields have changed for a record and then do something with the new data. I use a merge statement and capture the output in a delta table that stores the old and new values and the action (INSERT , UPDATE, DELETE) that was performed during the merge. My merge looks like this:
MERGE
DestinationTable AS tgt
USING SourceTableOrQuery AS src
ON tgt.KeyField1 = src.KeyField1
AND tgt.KeyField2 = src.KeyField2
WHEN MATCHED AND
(ISNULL(tgt.StringField1, '') <> ISNULL(src.StringField1, '')
ORISNULL(tgt.StringField2,'') <> ISNULL(src.StringField2, '')
ORISNULL(tgt.NumericField1,0) <> ISNULL(src.NumericField1,0)
OR ISNULL(tgt.NumericField2,999) <> ISNULL(src.NumericField2,999)
OR ISNULL(tgt.DateField1,'1900-01-01') <> ISNULL(src.DateField1,'1900-01-01')
) THEN
UPDATE SET
tgt.StringField1 = src.StringField1,
tgt.StringField2 = src.StringField2,
tgt.NumericField1 = src.NumericField1,
tgt.NumericField2 = src.NumericField2,
tgt.DateField1 = src.DateField1
WHEN NOT MATCHED BY SOURCE THEN
DELETE
WHEN NOT MATCHED THEN
INSERT(KeyField1
,KeyField2
,StringField1
,StringField2
,NumericField1
,NumericField2
,DateField1)
VALUES
(src.KeyField1
,src.KeyField2
,src.StringField1
,src.StringField2
,src.NumericField1
,src.NumericField2
,src.DateField1)
OUTPUTgetdate(),
$action,
deleted.KeyField1,
inserted.KeyField1,
deleted.KeyField2,
inserted.KeyField2,
deleted.StringField1, --orig value
inserted.StringField1, -- new value
deleted.StringField2,
inserted.StringField2,
deleted.NumericField1,
inserted.NumericField1,
deleted.NumericField2,
inserted.NumericField2,
deleted.DateField1,
inserted.DateField1
INTO DELTA_Table;
Viewing 13 posts - 61 through 72 (of 72 total)
You must be logged in to reply to this topic. Login to reply