Bitwise OR function

Question

Bitwise OR function

Forum Etiquette: How to post Reporting Services problems

[/url]

Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
[/url]

How to Post Performance Problems - by Gail Shaw
[/url]

joakim.fenno Say Hey Kid Points: 704 More actions · Answer 1

ChrisM@Work (3/18/2014)
Although just about every reference on the first page returned by Google specifies two operands, the TSQL BITWISE OR actually works with a chain like this:
-- on a single row
SELECT 001 | 000 | 010
Result
11
Which I think can be emulated like this:
-- try row-wise
DECLARE @LastValue INT
SELECT @LastValue = 001
SELECT @LastValue = @LastValue | d.MyValue
FROM (SELECT MyValue = 000 UNION ALL SELECT 010) d
-- eyeball check
SELECT @LastValue
Result
11
- which can be converted into a msTVF.

The output should be 011, although thats is something I can handle (converting 11 to 011).

The query needs to be used like the built in aggregate function (with the possibility to group by)

Sean Lange SSC Guru Points: 286590 More actions · Answer 2

joakim.fenno (3/18/2014)
ChrisM@Work (3/18/2014)
joakim.fenno (3/18/2014)
OK
CREATE TABLE my_table (col1 nvarchar(25) NULL, bit_flag nvarchar(3) NULL)
INSERT INTO my_table VALUES('A', '100'), ('A', '001'),('B', '010'),('A', '001');
expected output from query :
A 101
B 011
The bit_flag column contains 12 flag (and is therefore a nvarchar(12)) in the real example
You have three rows where col1 = 'A' and one row where col1 = 'B'. You have two rows in your result set, one for 'A' and one for 'B', and yet the result for 'B' is different to the value in the one row in the sample data. What is the third row in the sample data set, the row where col1 = 'B', OR'd with?
sorry, should be:
CREATE TABLE my_table (col1 nvarchar(25) NULL, bit_flag nvarchar(3) NULL)
INSERT INTO my_table VALUES('A', '100'), ('A', '001'),('B', '010'),('B', '001');

hhmmm....that looks surprisingly like what I posted.

And the results seem to match too.

Compare my last post and this.

select 100|001, 010|001

This seems to produce the exact same output.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

joakim.fenno Say Hey Kid Points: 704 More actions · Answer 3

Sean Lange (3/18/2014)
joakim.fenno (3/18/2014)
ChrisM@Work (3/18/2014)
joakim.fenno (3/18/2014)
OK
CREATE TABLE my_table (col1 nvarchar(25) NULL, bit_flag nvarchar(3) NULL)
INSERT INTO my_table VALUES('A', '100'), ('A', '001'),('B', '010'),('A', '001');
expected output from query :
A 101
B 011
The bit_flag column contains 12 flag (and is therefore a nvarchar(12)) in the real example
You have three rows where col1 = 'A' and one row where col1 = 'B'. You have two rows in your result set, one for 'A' and one for 'B', and yet the result for 'B' is different to the value in the one row in the sample data. What is the third row in the sample data set, the row where col1 = 'B', OR'd with?
sorry, should be:
CREATE TABLE my_table (col1 nvarchar(25) NULL, bit_flag nvarchar(3) NULL)
INSERT INTO my_table VALUES('A', '100'), ('A', '001'),('B', '010'),('B', '001');
hhmmm....that looks surprisingly like what I posted.
And the results seem to match too.
Compare my last post and this.
select 100|001, 010|001
This seems to produce the exact same output.

Yes, I made a typo.

See my comments about your query in my previous post.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

Here's a solution which will work with many values. First the UDF:

ALTER FUNCTION fn_BITWISE_OR

(

@col1 VARCHAR(3)

)

RETURNS VARCHAR(3)

AS

BEGIN

-- try row-wise

DECLARE @LastValue INT

SELECT @LastValue = 000

SELECT @LastValue = @LastValue | d.bit_flag

-- substitute this for your table

FROM (VALUES('A', '100'), ('A', '001'),('A', '010'),('B', '001')) d (col1, bit_flag )

-- substitute this for your table

WHERE col1 = @col1 -- filter

RETURN RIGHT('000'+CAST(@LastValue AS VARCHAR(3)),3)

END

Then, usage like this:

SELECT d.col1, dbo.fn_BITWISE_OR(d.col1)

FROM (VALUES('A', '100'), ('A', '001'),('A', '010'),('B', '001')) d (col1, bit_flag )

GROUP BY col1

In both cases you will need to switch the sample data set for your actual table name, and you will also need to play with the data type, currently set to VARCHAR(3).

Note that this is a scalar udf (!). It will prevent parallelism in all or part of the plan and will be executed for each row, hence performance could become a problem.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

joakim.fenno Say Hey Kid Points: 704 More actions · Answer 5

ChrisM@Work (3/18/2014)
Here's a solution which will work with many values. First the UDF:
ALTER FUNCTION fn_BITWISE_OR
(
@col1 VARCHAR(3)
)
RETURNS VARCHAR(3)
AS
BEGIN
-- try row-wise
DECLARE @LastValue INT
SELECT @LastValue = 000
SELECT @LastValue = @LastValue | d.bit_flag
-- substitute this for your table
FROM (VALUES('A', '100'), ('A', '001'),('A', '010'),('B', '001')) d (col1, bit_flag )
-- substitute this for your table
WHERE col1 = @col1 -- filter
RETURN RIGHT('000'+CAST(@LastValue AS VARCHAR(3)),3)
END
Then, usage like this:
SELECT d.col1, dbo.fn_BITWISE_OR(d.col1)
FROM (VALUES('A', '100'), ('A', '001'),('A', '010'),('B', '001')) d (col1, bit_flag )
GROUP BY col1
In both cases you will need to switch the sample data set for your actual table name, and you will also need to play with the data type, currently set to VARCHAR(3).
Note that this is a scalar udf (!). It will prevent parallelism in all or part of the plan and will be executed for each row, hence performance could become a problem.

Thanks!

I'm aware that scalar functions prevents parallel execution plans but thats somethign that I will have to deal with later on.

I will check later if it works but I doubt that its optimal in my case.

It is important that I can groupo by a number of columns and that the group by is dynamic (set of columns might differ and might be none as well).

It is good that you implement it as a function because it will be used in many different places.

However the function will be used for different tables (at least three different) so I guess it has to be more generic (table name cant be hard coded in the function).

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 6

joakim.fenno (3/18/2014)
The bit_flag column contains 12 flag (and is therefore a nvarchar(12)) in the real example

Gosh. That's 24+2 or 26 bytes to store what could be stored in less than 2 (8 bits per byte). Is there some reason why the designers of this table decided to store binary data in NVARCHAR instead of BINARY(2) or even separate bit columns which allows for 8 separate bit column per byte. It would also make bit-wise coding a whole lot easier if BINARY(2) were used.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

joakim.fenno Say Hey Kid Points: 704 More actions · Answer 7

Jeff Moden (3/18/2014)
joakim.fenno (3/18/2014)
The bit_flag column contains 12 flag (and is therefore a nvarchar(12)) in the real example
Gosh. That's 24+2 or 26 bytes to store what could be stored in less than 2 (8 bits per byte). Is there some reason why the designers of this table decided to store binary data in NVARCHAR instead of BINARY(2) or even separate bit columns which allows for 8 separate bit column per byte. It would also make bit-wise coding a whole lot easier if BINARY(2) were used.

I know that and I am willing to use a binary data type instead if that makes the implementation easier.

Each bit flag represent a month and some queries might be executed using the more user friendly string field, at least in the short run.

But let’s assume that a binary data type can be used

mister.magoo SSC-Forever Points: 47174 More actions · Answer 8

Here are three ways you could tackle this. I prefer number three, which is to store the month number as a tinyint.

They all produce the same result, but query 2 seems to produce the simplest plan.

use tempdb

--== the "nvarchar" version

begin try

create table my_table (col1 nvarchar(25) null, bit_flag nvarchar(3) null)

insert into my_table values('a', '100'), ('a', '001'),('b', '010'),('b', '001');

end try

begin catch

end catch;

with split_data as

(

select col1, x.n, max(x.v ) as bit_value

from my_table

cross apply (

select n, case substring(bit_flag,n,1)

when '1' then 1

else 0

end

from (values(1),(2),(3)) t(n)

) x(n,v)

group by col1, x.n

)

select col1,right('000'+cast(sum( power(10,3-n)*bit_value ) as varchar(3)),3) as col_value

from split_data

group by col1;

--== the "int" flag version

begin try

create table my_table2 (col1 nvarchar(25) null, int_flag int null)

insert into my_table2 values('a', 100), ('a', 001),('b', 10),('b', 1);

end try

begin catch

end catch;

with split_data as

(

select col1, int_flag

from my_table2

group by col1, int_flag

)

select col1,right('000'+cast(sum( int_flag ) as varchar(3)),3) as col_value

from split_data

group by col1;

--== the "tinyint" meaningful month number version

begin try

create table my_table3 (col1 nvarchar(25) null, month_number tinyint null)

insert into my_table3 values('a', 1), ('a', 3),('b', 2),('b', 1);

end try

begin catch

end catch;

with split_data as

(

select col1, month_number

from my_table3

group by col1, month_number

)

select col1,right('000'+cast(sum( power(10,month_number-1) ) as varchar(3)),3) as col_value

from split_data

group by col1;

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

joakim.fenno Say Hey Kid Points: 704 More actions · Answer 9

mister.magoo (3/18/2014)
Here are three ways you could tackle this. I prefer number three, which is to store the month number as a tinyint.
They all produce the same result, but query 2 seems to produce the simplest plan.
use tempdb
--== the "nvarchar" version
begin try
create table my_table (col1 nvarchar(25) null, bit_flag nvarchar(3) null)
insert into my_table values('a', '100'), ('a', '001'),('b', '010'),('b', '001');
end try
begin catch
end catch;
with split_data as
(
select col1, x.n, max(x.v ) as bit_value
from my_table
cross apply (
select n, case substring(bit_flag,n,1)
when '1' then 1
else 0
end
from (values(1),(2),(3)) t(n)
) x(n,v)
group by col1, x.n
)
select col1,right('000'+cast(sum( power(10,3-n)*bit_value ) as varchar(3)),3) as col_value
from split_data
group by col1;
--== the "int" flag version
begin try
create table my_table2 (col1 nvarchar(25) null, int_flag int null)
insert into my_table2 values('a', 100), ('a', 001),('b', 10),('b', 1);
end try
begin catch
end catch;
with split_data as
(
select col1, int_flag
from my_table2
group by col1, int_flag
)
select col1,right('000'+cast(sum( int_flag ) as varchar(3)),3) as col_value
from split_data
group by col1;
--== the "tinyint" meaningful month number version
begin try
create table my_table3 (col1 nvarchar(25) null, month_number tinyint null)
insert into my_table3 values('a', 1), ('a', 3),('b', 2),('b', 1);
end try
begin catch
end catch;
with split_data as
(
select col1, month_number
from my_table3
group by col1, month_number
)
select col1,right('000'+cast(sum( power(10,month_number-1) ) as varchar(3)),3) as col_value
from split_data
group by col1;

Thanks!

Lets start with the "int" flag version, its does not produce the correct output if you add a value 101 (menas true for month 1, false for month 2, true for month 3) for the int_flag:

begin try

drop table my_table2

create table my_table2 (col1 nvarchar(25) null, int_flag int null)

insert into my_table2 values('a', 101), ('a', 1),('a', 1),('b', 1),('a', 10),('b', 1);

end try

begin catch

end catch;

select * from my_table2

order by col1

go

with split_data as

(

select col1, int_flag

from my_table2

group by col1, int_flag

)

select col1,right('000'+cast(sum( int_flag ) as varchar(3)),3) as col_value

from split_data

group by col1;

Or did I misunderstand your implementation?

1 = 001?

10 = 010?

100 = 100?

It needs to support all combination of 0 and 1

It seems like the "nvarchar" version works better.

I dont really understand the "tinyint" flag version, what do the values for the flag mean?

mister.magoo SSC-Forever Points: 47174 More actions · Answer 10

joakim.fenno (3/19/2014)
Thanks!

You're welcome 🙂

Lets start with the "int" flag version, its does not produce the correct output if you add a value 101 (menas true for month 1, false for month 2, true for month 3) for the int_flag:
...
It needs to support all combination of 0 and 1

Well, if you have other requirements than the one stated - where each value only contained a flag in one position it would help if you disclosed ALL the requirements rather than drip feeding them.

It seems like the "nvarchar" version works better.
I dont really understand the "tinyint" flag version, what do the values for the flag mean?

Ignore it, now that we know you can have more than one flag in a single data item, that one is useless.

As you said, the nvarchar version will work for that.

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

joakim.fenno Say Hey Kid Points: 704 More actions · Answer 11

mister.magoo (3/19/2014)
joakim.fenno (3/19/2014)
Thanks!
You're welcome 🙂
Lets start with the "int" flag version, its does not produce the correct output if you add a value 101 (menas true for month 1, false for month 2, true for month 3) for the int_flag:
...
It needs to support all combination of 0 and 1
Well, if you have other requirements than the one stated - where each value only contained a flag in one position it would help if you disclosed ALL the requirements rather than drip feeding them.
It seems like the "nvarchar" version works better.
I dont really understand the "tinyint" flag version, what do the values for the flag mean?
Ignore it, now that we know you can have more than one flag in a single data item, that one is useless.
As you said, the nvarchar version will work for that.

Thanks!

Sorry for being unclear about the requirements.

I will see if I can change the int-version to support my needs - I will, at least in the long run, have to use other data types for prestanda and storage reasons.

One thing though, as mentioned above I would really like to implement this as a function since it will be used in multiple places on multiple tables. And also if I want to change the columns I would like to group on I would prefer not change the code in more than one place. That’s why I thought about an aggregate CLR in the first place.

Nadrek SSC-Insane Points: 20039 More actions · Answer 12

From what it sounds like, you're not going to be indexing these values, just doing things based on the results. If there really are only 12 bits (months), then you might also want to consider 12 BIT columns with month names.

Jan BIT

,Feb BIT

,Mar BIT

...

,Dec BIT

and normal logical OR statements.

You get serious gains in code readability, at the expense of a little tedious (or macro-driven 🙂 ) code generation. Performance against the other techniques you'd have to test, of course.

Alternately, in this case, a simple lookup table with 2^24 (16777216) rows containing three BINARY(2) columns and a clustered index on the "input" values would also work. If you have enough RAM to buffer 96MB of data plus the usual overhead, that might be very interesting to benchmark against the other solutions.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 13

Here's a sandpit version very similar to what you currently have. The only downside is the ugly aggregate - but it's a very simple ugly aggregate. I guess it depends upon your priorities. You do get your 12 months saved as a smallint representation of a bit string.

DROP TABLE my_table2

CREATE TABLE my_table2 (col1 nvarchar(25) null, int_flag smallint NULL)

G0

INSERT INTO my_table2 values

('a', (SELECT DecimalOut FROM dbo.CHARBitsToDecimal('000000000100'))),

('a', (SELECT DecimalOut FROM dbo.CHARBitsToDecimal('000000000001'))),

('b', (SELECT DecimalOut FROM dbo.CHARBitsToDecimal('000000000010'))),

('b', 2),

('b', 1),

('b', 4);

CREATE CLUSTERED INDEX cx_col1 ON my_table2 (col1)

-- Stare & compare

SELECT * FROM my_table2

-- BITWISE OR aggregate emulation

SELECT col1,

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),1,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),2,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),3,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),4,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),5,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),6,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),7,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),8,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),9,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),10,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),11,1)) AS CHAR(1))+

CAST(MAX(SUBSTRING(RIGHT(x.CHARBitsOut,12),12,1)) AS CHAR(1))

FROM my_table2 t

CROSS APPLY dbo.DecimalToCHARBits (t.int_flag) x

GROUP BY col1;

---------------------------------------------------------------------

-- check the functions

SELECT DecimalOut FROM [dbo].[CHARBitsToDecimal] ('111011110111')

-- 3831 -- SMALLINT, two bytes

SELECT RIGHT(CHARBitsOut,12) FROM [dbo].[DecimalToCHARBits] (3831) -- SMALLINT, two bytes

-- '111011110111'

CREATE FUNCTION [dbo].[CHARBitsToDecimal]

(@BinVal VARCHAR(32))

RETURNS TABLE WITH SCHEMABINDING AS

/*

ChrisM March 2014

CHARBitsToDecimal could output SMALLINT.

*/

RETURN

SELECT DecimalOut = SUM(POWER(2,iTally.n-1))

FROM (SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

FROM (VALUES

(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),

(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)

) iTally

WHERE SUBSTRING(REVERSE(@Binval),iTally.n,1) = 1;

CREATE FUNCTION [dbo].[DecimalToCHARBits]

(@DecVal BIGINT)

RETURNS TABLE WITH SCHEMABINDING AS

/*

ChrisM March 2014

DecimalToCHARBits() could very easily be converted to work with 12 characters,

removing the present requirement for extracting the rightmost 12.

*/

RETURN

SELECT CHARBitsOut = CAST(REVERSE(

(SELECT (@DecVal/POWER(CAST(2 AS BIGINT),iTally.n-1)) % 2 AS [text()]

FROM (SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

FROM (VALUES

(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),

(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)

) iTally

FOR XML PATH(''))

) AS VARCHAR(32));

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden