Poor Man's Table Difference
The code contains a stored procedure and code to show how it works aftwerwards.
PURPOSE: I need to sync data between two databases. One database resides on a computer without an internet connection.
The plan is to create a text file containing information from the isolated computer's SQL-Express database that can be put on a flash drive. This procedure will provide that data. Once the user is at a internet-ready machine, another application can use the text document to pull down only changes from a website that has the master database. In my case, data goes one direction: to the isolated computer's database. If anything needs to ship the other direction, the user specifically selects it.
THEORY: The procedure builds a custom query which will group on the primary key, and create a checksum of all fields we care about.
The Checksum is created
- for each row: combining all fields into a text string and doing a HASHBYTES on that
- sum parts of the checksum into one bigint
You're welcome offer comments on it's design and thoughts about how unique the hash may end up being.
USAGE:
Should be easy to see how to use the procedure. The procedure expects
- @TableName can be a temp table or table in another database
- @PrimaryKey data type is flexible. You could even pass it two fields if needed, as in 'HalfAUniqueKey, OtherHalfOfUniqueKey'
- @FieldList list of fields (normally separated by a semicolon) that we care to check for changes. Data types can be mixed. I've tested it on numeric and some text fields.
- @FieldListDelimiter optional..you see the default is a semicolon
- @NullText optional..varchar of something the database will never see in reality
- @DEBUG optional..if set to 1, will print the SQL code rather than running it
The procedure returns a table, which can then be linked to other tables in the database for further processing. In my case, I link it to the table with the primary keys, and pull out the human-readable description rather than the ID.
Again, feel free to offer comments.
Kurt
create procedure spRowChecksum (
@TableName varchar(200),
@ForeignKey varchar(20),
@FieldList varchar(2000),
@FieldListDelimiter varchar(1) = ';',
@NullText varchar(20) = '~~',
@DEBUG bit = 0) as
declare @SQL nvarchar(4000)
-- Expand @FieldList to SQL code
set @SQL = '), ''' + @NullText +''')'
set @SQL =
replace(
replace(
replace(
';1' + replace(@FieldList, ';', ';3;2;1') + ';3',
';1',
'isnull(convert(varchar, '
),
';2',
' + '';'' + '
),
';3',
@SQL
)
-- fill in the rest
set @SQL =
'select
' + @ForeignKey + ',
sum(convert(bigint, convert(binary(4), substring(cksum, 1,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum, 5,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum, 9,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum,13,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum,17,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum,21,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum,25,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum,29,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum,33,4)))) +
sum(convert(bigint, convert(binary(4), substring(cksum,37,4)))) cksum
from
(select
' + @ForeignKey + ', HASHBYTES(''sha'', ' + @SQL + ') cksum
from ' + @TableName + ') data
group by ' + @ForeignKey
if @DEBUG = 1
print @SQL
else
exec (@SQL)
go
-------------------
-- Test code
create table #Table1 (pk int identity, fk int, field1 int, field2 varchar(10), lastfield char(2))
create table #Table2 (pk int identity, fk int, field1 int, field2 varchar(10), lastfield char(2))
insert into #Table1 (fk, field1, field2, lastfield) values (1, 4, 'testing1', 'x')
insert into #Table1 (fk, field1, field2, lastfield) values (2, 5, 'testing2', null)
insert into #Table1 (fk, field1, field2, lastfield) values (3, 60, 'testing3', 'z')
insert into #Table1 (fk, field1, field2, lastfield) values (4, 160, 'testing3', 'z')
insert into #Table2 (fk, field1, field2, lastfield) values (1, 4, 'testing1', 'x')
insert into #Table2 (fk, field1, field2, lastfield) values (2, 5, 'testing2', null)
insert into #Table2 (fk, field1, field2, lastfield) values (3, 62, 'testing3', 'z')
create table #ckTable1 (fk int, cksum bigint)
create table #ckTable2 (fk int, cksum bigint)
insert into #ckTable1
exec spRowChecksum '#Table1', 'fk', 'field1;field2;lastfield'
insert into #ckTable2
exec spRowChecksum '#Table2', 'fk', 'field1;field2;lastfield'
-- show mismatches
select *
from #ckTable1 t1
full outer join #ckTable2 t2
on t1.fk = t2.fk
where
t1.cksum != t2.cksum or
t1.fk is null or
t2.fk is null