September 30, 2011 at 4:04 pm
Hi,
We have a double data entry process and want to compare the entries of one record in Table A to the duplicated entries in Table B (identical to TableA). What I want to do is compare a record from each table by an uniqueID and retain only columns where the values are not equal. I want to keep these results and save them to a table (TableC). TableC does not need to be defined in the solution here.
I could easliy do this in a VBA script or, dare I say...a CURSOR!!!!!!, but I'd like to see if there is a better way and keep this back on the server and out of the application.
I've looked a INTERSECT and EXCEPT, but they return the both whole records.
Thanks in advance,
September 30, 2011 at 4:20 pm
http://msdn.microsoft.com/en-us/library/ms180026.aspx
This should do.
September 30, 2011 at 4:25 pm
Roughly:
DELETE a
OUTPUT tblc
FROM
tblA as a
JOIN
tblB as b
ON a.col1 = b.col1 AND a.col2 = b.col2 ....
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
October 1, 2011 at 6:10 pm
mapperhd (9/30/2011)
Hi,We have a double data entry process and want to compare the entries of one record in Table A to the duplicated entries in Table B (identical to TableA). What I want to do is compare a record from each table by an uniqueID and retain only columns where the values are not equal. I want to keep these results and save them to a table (TableC). TableC does not need to be defined in the solution here.
I could easliy do this in a VBA script or, dare I say...a CURSOR!!!!!!, but I'd like to see if there is a better way and keep this back on the server and out of the application.
I've looked a INTERSECT and EXCEPT, but they return the both whole records.
Thanks in advance,
A small example, is this what you are looking for? Or something else - your question is not 100% clear.
INSERT INTO tableC(empID, aName, bName, aSalary, bSalary)
SELECT a.empID,a.empName,b.empName,a.empSalary,b.empsalary
FROM tableA a
INNER JOIN tableB b
ON a.empID = b.empid
WHERE a.empName <> b.empName or a.empSalary <> b.empSalary
October 2, 2011 at 8:58 am
Thanks for the replies.
However, I need to keep only the columns that are not equal. Let's say there are 12 columns in the two records being compared. If columns 4 and 9 in record 1 do not have equal values to the same columns in record 2 how do I keep only those two columns (and the ID) to insert into another table?
Sorry if I wan't very clear in my first post and I hope this helps to clarify.
Thanks again for responses.
October 3, 2011 at 1:34 am
mapperhd (10/2/2011)
Thanks for the replies.However, I need to keep only the columns that are not equal. Let's say there are 12 columns in the two records being compared. If columns 4 and 9 in record 1 do not have equal values to the same columns in record 2 how do I keep only those two columns (and the ID) to insert into another table?
Sorry if I wan't very clear in my first post and I hope this helps to clarify.
Thanks again for responses.
You could potentially end up with a lot of tables with differences that way: you would need a new table for every unique combination of columns with differences in it. Are you sure you want to have just the columns with differences in it? And not simply the entire rows that you've found to contain at least one difference? Or do you just want the names of the columns that have differences for each set of 2 rows that has at least one different column between them?
The last one is fairly easy (I have created some DDL plus sample data for you, please have a look at the link in my footer text for more info on how and WHY to provide this yourself the next time you have a question):
declare @table1 table (
ID int not null primary key,
col1 varchar(10) not null,
col2 varchar(200) null,
col3 int not null,
col4 datetime null
);
declare @table2 table (
ID int not null primary key,
col1 varchar(10) not null,
col2 varchar(200) null,
col3 int not null,
col4 datetime null
);
insert @table1(ID, col1, col2, col3, col4)
select 1, 'test1', 'some longer value', 23, {d '1900-01-01'} union all
select 2, 'test2', 'some longer value', 23, {d '1900-01-01'} union all
select 3, 'test3', 'some longer value', 23, {d '1900-01-01'}
insert @table2(ID, col1, col2, col3, col4)
select 1, 'test1', 'some longer value', 23, {d '1900-01-01'} union all
select 2, 'test21', 'some longer value', 23, {d '1900-01-02'} union all
select 3, 'test4', 'another longer value', 22, {d '1900-01-01'} union all
select 4, 'test4', 'another longer value', 22, {d '1900-01-01'}
select diff.column_names, t1.*, t2.*
from @table1 t1
full outer join @table2 t2 on (t2.ID = t1.ID)
cross apply (
select stuff((
select ', ' + t.name as [text()]
from (
select 'ID' as name where t1.ID is null or t2.ID is null
union all select 'col1' where not t1.col1 = t2.col1
union all select 'col2' where not ((t1.col2 is null and t2.col2 is null) or (t1.col2 = t2.col2))
union all select 'col3' where not t1.col3 = t2.col3
union all select 'col4' where not ((t1.col4 is null and t2.col4 is null) or (t1.col4 = t2.col4))
) t
for xml path(''), type
).value('.','varchar(8000)'),1, 2, '') as column_names
) diff
where diff.column_names is not null
Which outputs:
column_namesIDcol1col2col3col4IDcol1col2col3col4
col1, col42test2some longer value231900-01-01 00:00:00.0002test21some longer value231900-01-02 00:00:00.000
col1, col2, col33test3some longer value231900-01-01 00:00:00.0003test4another longer value221900-01-01 00:00:00.000
IDNULLNULLNULLNULLNULL4test4another longer value221900-01-01 00:00:00.000
October 3, 2011 at 1:07 pm
mapperhd (10/2/2011)
Thanks for the replies.However, I need to keep only the columns that are not equal.
The problem with this is that it forces you to violate the first normal form for database design: All records must have the same number of fields.
For example, say you want to keep the record ID, the column name, and the values from each of the tables. With one column different, you want to keep a total of four columns; with two columns different, you want to keep a total of seven columns, etc. As you can see, this violates first normal form, because the records don't have the same number of fields.
There are three basic ways that you can resolve this situation.
1) Remove the spec to keep only columns that are different, so that you are keeping the same number of columns for each record.
2) Store the column difference information in a single XML field, since XML doesn't have the same constraints as a database. (The problem with this being that you will still need to revert it to first normal form in order to work with it further.)
3) "PIVOT" the information so that you have one row for each column that is different. So, if one column is different, you would have one row for that record; with two columns different, you would need two rows, etc. (Again, you would need to unpivot it to work with it further.)
Which approach you use depends on how you are planning to use this data. Certainly, the first approach is the simplest.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
March 21, 2012 at 8:50 pm
R.P.
Thank you, your post was exactly what I was looking for.
March 22, 2012 at 8:53 am
If it helps, here is a script I use that creates a script to run to find the differences between the records in the same table in two different databases (like if there is a restore of a previous backup to a test database name to check for differences in the databases).
It creates a script, so just copy the printed output to a new query window and run it.
/* ====================================================================================================================== */
/*
This script creates a script to find the differences in two tables from two databases on the same instance (like a
test version of a database, being tested by another group of people or with another program).
Also creates the scripts to find records in one table that are not in the other, for both tables.
After run, just copy the script that is printed to another query window and run it (after making any necessary adjustments to the code).
Values to SET:
===============
Just set the database context (just below), Schema Name, Table Name, and Database Names for the two databases (in the "Values to change by User" section).
*/
USE MainDatabase-- SET THIS context to the primary database for the extraction of the table definitions
GO
DECLARE @SN varchar(200)
DECLARE @TN varchar(200)
DECLARE @db1 varchar(200)
DECLARE @db2 varchar(200)
DECLARE @collist varchar(4000)
DECLARE @collist2 varchar(4000)
DECLARE @ordinalposition int
DECLARE @schemaname varchar(200)
DECLARE @tablename varchar(200)
DECLARE @dataelement varchar(200)
DECLARE @datatype varchar(200)
DECLARE @maxlen char(10)
DECLARE @prec char(10)
DECLARE @scale char(10)
DECLARE @comparevals varchar(8000)
DECLARE @ds varchar(8000)
DECLARE @wow varchar(400)
DECLARE @numelements int
/* ============================================================================================== */
/* Values to change by User */
SET @SN = 'dbo'--Schema Name
SET @TN = 'SomeTableName'--Table Name
SET @db1 = 'MainDatabase'--Database 1 (a)
SET @db2 = 'TestDatabase'--Database 2 (b)
/* ============================================================================================== */
SET @collist = ''--leave blank
SET @collist2 = ''--leave blank
SET NOCOUNT ON
SELECT ORDINAL_POSITION AS 'OrdinalPosition',
TABLE_SCHEMA AS 'SchemaName',
TABLE_NAME AS 'TableName',
COLUMN_NAME AS 'DataElement', DATA_TYPE AS 'DataType',
ISNULL(CONVERT(char(10),CHARACTER_MAXIMUM_LENGTH),'') AS 'MaxLen', ISNULL(CONVERT(char(10),NUMERIC_PRECISION),'') AS 'Prec', ISNULL(CONVERT(char(10),NUMERIC_SCALE),'') AS 'Scale',
CONVERT(varchar(200),'') as ColNames,
CONVERT(varchar(400),'') as CompareVals
INTO #dummytable1
FROM information_schema.columns gg
WHERE TABLE_SCHEMA=@sn AND TABLE_NAME = @TN
ORDER BY [SchemaName], TABLE_CATALOG, TABLE_NAME, ORDINAL_POSITION
SELECT @collist2=@collist2 + DataElement + ',' FROM #dummytable1 ORDER BY OrdinalPosition
SET @collist2 = SUBSTRING(@collist2,1,LEN(@collist2)-1)
UPDATE #dummytable1
SET ColNames = 'a.' + DataElement + ' AS a' + DataElement + ',b.' + DataElement + ' AS b' + DataElement + ','
SELECT @collist=@collist + ColNames FROM #dummytable1 ORDER BY OrdinalPosition
SET @collist = SUBSTRING(@collist,1,LEN(@collist)-1)
SELECT @numelements=MAX(OrdinalPosition) FROM #dummytable1
/* ============================================================================================== */
DECLARE backupFiles CURSOR FOR
SELECT OrdinalPosition,SchemaName,TableName,DataElement,DataType,MaxLen,Prec,Scale
FROM #dummytable1
ORDER BY OrdinalPosition
OPEN backupFiles
-- Loop through all the files for the database
FETCH NEXT FROM backupFiles INTO @ordinalposition,@schemaname,@tablename,@dataelement,@datatype,@maxlen,@prec,@scale
WHILE @@FETCH_STATUS = 0
BEGIN
SET @ds = ''
if @datatype = 'char' OR @datatype='nchar' or @datatype='varchar' or @datatype='nvarchar' or @datatype = 'varbinary'
SET @ds = 'ISNULL(a.' + @dataelement + ','''') <> ISNULL(b.' + @dataelement + ','''')'
if @datatype = 'datetime'
SET @ds = 'ISNULL(a.' + @dataelement + ',CONVERT(datetime,''01/01/1800'')) <> ISNULL(b.' + @dataelement +',CONVERT(datetime,''01/01/1800''))'
if @datatype = 'int' OR @datatype = 'smallint' OR @datatype = 'bigint' OR @datatype='tinyint'
SET @ds = 'ISNULL(a.' + @dataelement + ',0) <> ISNULL(b.' + @dataelement + ',0)'
if @datatype = 'decimal'
SET @ds = 'ISNULL(a.' + @dataelement + ',0.0) <> ISNULL(b.' + @dataelement + ',0.0)'
if @datatype = 'money'
SET @ds = 'ISNULL(a.' + @dataelement + ',0.0) <> ISNULL(b.' + @dataelement + ',0.0)'
if @datatype = 'bit'
SET @ds = 'ISNULL(a.' + @dataelement + ',0) <> ISNULL(b.' + @dataelement + ',0)'
if @datatype = 'text'
SET @ds = 'ISNULL(CONVERT(varchar(4000),a.' + @dataelement + '),'''') <> ISNULL(CONVERT(varchar(4000),b.' + @dataelement + '),'''')'
if @ds = '' --AND @datatype <> 'text'
SET @ds = 'a.' + @dataelement + ' <> b.' + @dataelement
if @ds <> ''
BEGIN
UPDATE #dummytable1
SET CompareVals=@ds
WHERE OrdinalPosition=@ordinalPosition
END
FETCH NEXT FROM backupFiles INTO @ordinalposition,@schemaname,@tablename,@dataelement,@datatype,@maxlen,@prec,@scale
END
CLOSE backupFiles
DEALLOCATE backupFiles
/* ============================================================================================== */
PRINT ' '
PRINT ' '
PRINT '-- Check for different values for the records that are common to both databases (a=' + @db1 + ',b=' + @db2 + ') for table ' + @SN + '.' + @TN
SET @wow = (SELECT DataElement FROM #dummytable1 WHERE OrdinalPosition=1)
PRINT 'SELECT ' + @collist
PRINT 'FROM ' + @db1 + '.' + @SN + '.' + @TN + ' a JOIN ' + @db2 + '.' + @SN + '.' + @TN + ' b'
PRINT 'ON a.' + @wow + ' = b.' + @wow + ' --change this PK if necessary'
PRINT 'WHERE'
DECLARE backupFiles CURSOR FOR
SELECT OrdinalPosition,SchemaName,TableName,DataElement,DataType,MaxLen,Prec,Scale,CompareVals
FROM #dummytable1
ORDER BY OrdinalPosition
OPEN backupFiles
-- Loop through all the files for the database
FETCH NEXT FROM backupFiles INTO @ordinalposition,@schemaname,@tablename,@dataelement,@datatype,@maxlen,@prec,@scale,@comparevals
WHILE @@FETCH_STATUS = 0
BEGIN
if @ordinalposition=@numelements
SET @ds = @comparevals
else
SET @ds = @comparevals + ' OR'
if @comparevals <> ''
PRINT @ds
FETCH NEXT FROM backupFiles INTO @ordinalposition,@schemaname,@tablename,@dataelement,@datatype,@maxlen,@prec,@scale,@comparevals
END
CLOSE backupFiles
DEALLOCATE backupFiles
--SELECT * FROM #dummytable1
PRINT ''
PRINT '--Find records in db ' + @db1 + ' that are not in db ' + @db2
PRINT 'SELECT ' + @collist2
PRINT 'FROM ' + @db1 + '.' + @SN + '.' + @TN
PRINT 'WHERE ' + @wow + ' NOT IN (SELECT a.' + @wow + ' FROM ' + @db2 + '.' + @SN + '.' + @TN + ' a)'
PRINT ''
PRINT ''
PRINT '--Find records in db ' + @db2 + ' that are not in db ' + @db1
PRINT 'SELECT ' + @collist2
PRINT 'FROM ' + @db2 + '.' + @SN + '.' + @TN
PRINT 'WHERE ' + @wow + ' NOT IN (SELECT b.' + @wow + ' FROM ' + @db1 + '.' + @SN + '.' + @TN + ' b)'
DROP TABLE #dummytable1
SET NOCOUNT OFF
March 22, 2012 at 8:58 am
Note that the script assumes that the structure of the table in the two databases hasn't changed.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply