compare bit values without using multiple or's

Question

Post reply

compare bit values without using multiple or's

cantor

Say Hey Kid

Points: 667
More actions
March 15, 2012 at 4:18 pm

#253289

Hi, I feel I should know this but for the life of me I can't think of an elegant solution...I have 100+ bit datatype columns in two tables and I'd like to know the rows where they are different.
Is there a standard method so I can avoid using multiple OR statements ?
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[mysource]') AND type in (N'U'))
DROP TABLE [dbo].[mysource]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[mytarget]') AND type in (N'U'))
DROP TABLE [dbo].[mytarget]
GO
create table mysource(pk_id int, col1 bit, col2 bit, col3 bit, col103 bit)
go
create table mytarget(pk_id int, col1 bit, col2 bit, col3 bit, col103 bit)
go
insert mysource select 1000,1,0,1,1 union all
select 1001,1,1,0,1 union all
select 1002,1,1,0,1
insert mytarget select 1000,1,0,1,0 union all
select 1001,1,1,0,1 union all
select 1002,1,1,1,0
select * from mysource a
join mytarget b on a.pk_id = b.pk_id
where
a.col1 <> b.col1 or
a.col2 <> b.col2 or
a.col3 <> b.col3 or
a.col103 <> b.col103

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 1

Whats your desired output? which rows u want to see?

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 2

oooops.. i dint see your last select statement..

Hmmmm.. i cant think of any other way than individually checking each columns!

Possinator Hall of Fame Points: 3422 More actions · Answer 3

would this be a good use of Except?

Something like

select * from mytarget

except

select * from mysource

I've not had an opportunity use it myself.

[font="Courier New"]Looking for a Deadlock Victim Support Group..[/font]

cantor Say Hey Kid Points: 667 More actions · Answer 4

Yes, nice try however in this particular case its the except I'm trying to avoid. In addition to the 100+ bit type columns it also has 100+ 'text' datatype columns which proves to be a bit of a problem when running a comparison query across linked servers. We _could_ use except on only those columns we deem significant - if it weren't for the fact that the datatypes are different; 'text' in source is 'varchar(max)' in target, which would mean instead of just coding select col1, col2 except select col1, col2 we would have to cast each column first - which generates more code which I'm trying to avoid.

I guess I was hoping to be able to use some sort of transform based upon a bitwise operator - I'd still have to specify each column but at least I wouldn't have multiple OR lines.

i.e. <some_transform>(a.col1+a.col2+a.col3...+a.col103) <> <some_transform>(b.col1+b.col2+b.col3+b.col103) -- note that the '+' in this statement is not correct, I'm just using it for illustration - it would have to be something else

Its not a big drama - I just thought there would be some standard way to do it.

Edit:

Hmm, I wonder if I could generate a hash of the values ? Will look into this now...

cantor Say Hey Kid Points: 667 More actions · Answer 5

OK, for what its worth I've coded it like this...

-- updates

select

a.pk_id

from

mytarget a

join

mysource b

on a.pk_id = b.pk_id

where

(

-- 45 columns with non-significant data in

a.[col1]<>b.[col1]

orcast(b.[col2] as varchar(MAX))<> a.[col2]

...

or binary_checksum(a.[col56],a.[col57]...)

<>

binary_checksum(b.[col56],b.[col57]...)

)

It works and runs in about the same time (which surprised me) but frankly I don't trust it so I'm leaving all the OR statements in.

My last day of work today so don't want to rock the boat on the way out!

diamondgm SSCrazy Points: 2824 More actions · Answer 6

Not sure if this approach would be useful?

This method can also be used to determine which columns are alike.

IF OBJECT_ID('tempdb..#mysourcebitwise') IS NOT NULL DROP TABLE #mysourcebitwise

IF OBJECT_ID('tempdb..#mytargetbitwise') IS NOT NULL DROP TABLE #mytargetbitwise

create table #mysourcebitwise

(pk_id int,

col1 bit,

col2 bit,

col3 bit,

col4 bit,

binstate AS((col1 * POWER(2,3)) +

(col2 * POWER(2,2)) +

(col3 * POWER(2,1)) +

(col4 * POWER(2,0))

) PERSISTED

)

go

create table #mytargetbitwise

(pk_id int,

col1 bit,

col2 bit,

col3 bit,

col4 bit,

binstate AS((col1 * POWER(2,3)) +

(col2 * POWER(2,2)) +

(col3 * POWER(2,1)) +

(col4 * POWER(2,0))

) PERSISTED

)

go

insert #mysourcebitwise

select 1000,1,0,1,1 union all

select 1001,1,1,0,1 union all

select 1002,1,1,0,1

insert #mytargetbitwise

select 1000,0,0,1,0 union all

select 1001,1,1,0,1 union all

select 1002,1,1,1,0

select *

from #mysourcebitwise a

join #mytargetbitwise b on a.pk_id = b.pk_id

where a.binstate != b.binstate

cantor Say Hey Kid Points: 667 More actions · Answer 7

Yes, once I'd forced the left hand side to a float...

(col103 * POWER(2e1,102))

... this worked.

I guess at this point I'd be happy enough with the binary_checksum method - I'd accept a false positives (i.e. rows identified as diffs even if they are identical) as long as I could be sure that it wouldn't miss any that were truly diffs...

thanks

🙂

HowardW One Orange Chip Points: 29109 More actions · Answer 8

Tread with caution using binary_checksum. E.g. with 10 bit columns (1024 distinct combinations), there are only 256 distinct binary checksums:

WITH a AS ( SELECT CAST(0 AS BIT) val

UNION ALL

SELECT CAST(1 AS BIT) val

)SELECT

a.val col1 ,

b.val col2 ,

c.val col3 ,

d.val col4 ,

e.val col5 ,

f.val col6 ,

g.val col7 ,

h.val col8 ,

i.val col9 ,

j.val col10 ,

BINARY_CHECKSUM(*) bin_checksum

FROM a ,

a b ,

a c ,

a d ,

a e ,

a f ,

a g ,

a h ,

a i ,

a j

ORDER BY BINARY_CHECKSUM(*)

cantor Say Hey Kid Points: 667 More actions · Answer 9

...and it looks as though the situation doesn't get any better no matter how many columns you add you still get 256 distinct values - ah well, leave the OR's in and call it a day.

thanks to all for the input

🙂

Adam Haines SSC-Insane Points: 23217 More actions · Answer 10

I don't know if it is applicable here, but this may be a good candidate for CDC, if you are using 2008, to detect if a change has been made and update appropriately.

My blog: http://jahaines.blogspot.com

cantor Say Hey Kid Points: 667 More actions · Answer 11

Yes, we had just moved our environment to SQL2K8 (4 weeks ago), the source operational database is not within our area to control so CDC would have to be agreed to by some other part of the business.

Given that under SQL2K5 we have had to roll our own ETL which essentially just versions each row of the ODS then I think CDC or more probably change tracking would certainly be a more maintainable approach.

As I say I don't work there any more so the solution is now moot!

Perhaps I'll be able to use it in my next job?