Bitwise & NULL

  • Jeff Moden (12/15/2016)


    Sioban Krzywicki (12/15/2016)


    Jeff Moden (12/14/2016)


    Sioban Krzywicki (12/14/2016)


    What if the bitwise column never appears in the WHERE clause? What if all the values in the column are always returned?

    Who's going to read the column without knowing the key? No one, of course, unless they have esoteric knowledge and plenty of practice reading the column. And so you'll end up writing display code to bit-slice the column so you can apply a label to each bit for display purposes. Since up to 8 bits of the BIT datatype are stored in a single byte behind the scenes, there's usually no advantage to bit masking compared to having up to 8 separate bit columns that are clearly labeled with a meaningful column name.

    Many of these (this happens for a bunch of different tables, some tables aren't like this) are for internal, programmatic use only. As long as the code knows what to do with the result, that's all that matters.

    The main advantage is that we can add and remove new value types at will instead of adding and removing columns.

    However, you all have me thinking maybe the best solution for this is a many to many relationship with the bits in the interim table.

    You've hit upon 1 of 2 exceptions in my mind. The other one would be some form of device control code that you'd pass to the device.

    Even as in your case, though, I'm with Sergiy in that you're someday (probably not in the too distant future), you find the column with some bit-wise ANDs or ORs in the WHERE clause of queries for reporting purposes. At the very least, it's highly likely that you'll start finding such a thing in your SELECT lists. If the main thing is to avoid having to add columns, make the column a BIGINT so that you don't have to change from INT to BIGINT when they've finally gone over 32 bits.

    We just went through that with something "internal". And, yeah... it's in WHERE clauses and the whole gambit of slothfulness designed by users. It's as bad as an XML, JSON, or CSV column for the developers and worse for performance in a lot of cases. Though I loath it, at least XML can be indexed (if you want to call it that).

    I was intending to use a varbinary() type

    One of the main reasons I was considering this method is that Microsoft seems to use it quite a bit (ha).

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (12/15/2016)


    Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    Others mentioned above that

    WHERE Status & 8 > 0 isn't SARGable, which, for any sizable table, would make me avoid it.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Sioban Krzywicki (12/15/2016)


    ScottPletcher (12/15/2016)


    Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    Others mentioned above that

    WHERE Status & 8 > 0 isn't SARGable, which, for any sizable table, would make me avoid it.

    Hmm, I think it's rather a long shot anyway that you'd be selecting so few rows via status bit checks that SQL would be able to accurately use a nonclustered index on that column anyway, and be willing to do so, given the sometimes very low % of rows required to use a nonclus index.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (12/15/2016)


    Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    What did they tell you about using functions against columns in WHERE clause?

    _____________
    Code for TallyGenerator

  • Sergiy (12/15/2016)


    ScottPletcher (12/15/2016)


    Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    What did they tell you about using functions against columns in WHERE clause?

    Ugh! I'm going through that with some developers at work. Worse than that, they have about 50 flag bits that they working with. They did it in a BIGINT as a matter of convenience so they wouldn't have 50 separate columns to add to the table and to simplify their WHERE clauses. Now the users that helped them with this foolish design are crabbing about performance. Since the column and related bit mask are frequently the sole criteria in their queries, <sarcasm> I wonder what the bloody problem could be </sarcasm>. :sick:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/26/2016)


    Sergiy (12/15/2016)


    ScottPletcher (12/15/2016)


    Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    What did they tell you about using functions against columns in WHERE clause?

    Ugh! I'm going through that with some developers at work. Worse than that, they have about 50 flag bits that they working with. They did it in a BIGINT as a matter of convenience so they wouldn't have 50 separate columns to add to the table and to simplify their WHERE clauses. Now the users that helped them with this foolish design are crabbing about performance. Since the column and related bit mask are frequently the sole criteria in their queries, <sarcasm> I wonder what the bloody problem could be </sarcasm>. :sick:

    Why a BIGINT instead of, say a VARBINARY(8)? That's what I'm planning on using.

    I'm planning on offering 3 solutions & testing all 3 for performance. Only 1 is a bitmask solution.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Jeff Moden (12/26/2016)


    Sergiy (12/15/2016)


    ScottPletcher (12/15/2016)


    Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    What did they tell you about using functions against columns in WHERE clause?

    Ugh! I'm going through that with some developers at work. Worse than that, they have about 50 flag bits that they working with. They did it in a BIGINT as a matter of convenience so they wouldn't have 50 separate columns to add to the table and to simplify their WHERE clauses. Now the users that helped them with this foolish design are crabbing about performance. Since the column and related bit mask are frequently the sole criteria in their queries, <sarcasm> I wonder what the bloody problem could be </sarcasm>. :sick:

    How would having 50 different bit flags, any/all of which could need checked for row selection, help performance in any way vs. a combined bigint column? It's extreme overhead to create indexes for even every combination of two of the bits with the others included, and there's no remote guarantee that SQL would use such an index anyway. If you made the bigger-value bits the more "must have" ones, you can restrict the bigint value as >= the least-large bit value, reducing the size of the index scan.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (12/27/2016)


    Jeff Moden (12/26/2016)


    Sergiy (12/15/2016)


    ScottPletcher (12/15/2016)


    Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    What did they tell you about using functions against columns in WHERE clause?

    Ugh! I'm going through that with some developers at work. Worse than that, they have about 50 flag bits that they working with. They did it in a BIGINT as a matter of convenience so they wouldn't have 50 separate columns to add to the table and to simplify their WHERE clauses. Now the users that helped them with this foolish design are crabbing about performance. Since the column and related bit mask are frequently the sole criteria in their queries, <sarcasm> I wonder what the bloody problem could be </sarcasm>. :sick:

    How would having 50 different bit flags, any/all of which could need checked for row selection, help performance in any way vs. a combined bigint column? It's extreme overhead to create indexes for even every combination of two of the bits with the others included, and there's no remote guarantee that SQL would use such an index anyway. If you made the bigger-value bits the more "must have" ones, you can restrict the bigint value as >= the least-large bit value, reducing the size of the index scan.

    First, apologies. I was talking specifically about INSERT performance. You're mostly correct that SARGability and related performance usually won't matter for bit columns unless, like the folks tried at work, you actually do put an index on the bit-wise column and then wonder why INSERTs are dog slow as the page splits occur. And I'll agree its usually (won't say "always" but haven't run into it) stupid to put create an index with a leading bit or bitwise column for that very reason and the fact that, unless you have a big rowcount difference between the states, it usually won't help performance at all. That, notwithstanding, it greatly simplifies coding because the name of the column is the name of the bit. You don't have to trip off to some piece of code or a reference table to figure out what each bit is nor do you have to figure out which bits are represented by the number (for example) 5931.

    That, notwithstanding, I do understand why some people prefer the single bitwise column to multiple bit columns. I'm just not one of them.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/27/2016)


    ScottPletcher (12/27/2016)


    Jeff Moden (12/26/2016)


    Sergiy (12/15/2016)


    ScottPletcher (12/15/2016)


    Pretty soon you'll see in queries "WHERE Status&8 >0".

    I guess I don't see why that's so much worse than a non-bit comparison of "status = 8" or "code1 = 23". I see the "shared-bits" column as just another technique that does add some flexibility when it's needed.

    For example, we've often used it for security flags in local apps. That way you can add a new security role without having nearly as much code.

    What did they tell you about using functions against columns in WHERE clause?

    Ugh! I'm going through that with some developers at work. Worse than that, they have about 50 flag bits that they working with. They did it in a BIGINT as a matter of convenience so they wouldn't have 50 separate columns to add to the table and to simplify their WHERE clauses. Now the users that helped them with this foolish design are crabbing about performance. Since the column and related bit mask are frequently the sole criteria in their queries, <sarcasm> I wonder what the bloody problem could be </sarcasm>. :sick:

    How would having 50 different bit flags, any/all of which could need checked for row selection, help performance in any way vs. a combined bigint column? It's extreme overhead to create indexes for even every combination of two of the bits with the others included, and there's no remote guarantee that SQL would use such an index anyway. If you made the bigger-value bits the more "must have" ones, you can restrict the bigint value as >= the least-large bit value, reducing the size of the index scan.

    First, apologies. I was talking specifically about INSERT performance. You're mostly correct that SARGability and related performance usually won't matter for bit columns unless, like the folks tried at work, you actually do put an index on the bit-wise column and then wonder why INSERTs are dog slow as the page splits occur. And I'll agree its usually (won't say "always" but haven't run into it) stupid to put create an index with a leading bit or bitwise column for that very reason and the fact that, unless you have a big rowcount difference between the states, it usually won't help performance at all. That, notwithstanding, it greatly simplifies coding because the name of the column is the name of the bit. You don't have to trip off to some piece of code or a reference table to figure out what each bit is nor do you have to figure out which bits are represented by the number (for example) 5931.

    That, notwithstanding, I do understand why some people prefer the single bitwise column to multiple bit columns. I'm just not one of them.

    If you have an index on a bit column & do inserts, are the new rows placed randomly within the other rows with the same value (i.e. a 1 is placed at any point within all the bits of 1) or are they placed at the end of the group with that value?

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Sioban Krzywicki (12/27/2016)


    If you have an index on a bit column & do inserts, are the new rows placed randomly within the other rows with the same value (i.e. a 1 is placed at any point within all the bits of 1) or are they placed at the end of the group with that value?

    I don't know about what it does if you only index a bit column. Heh... I don't intend to find out, either. 😀 I do know, however, that if you have the leading column of a 3 column index where the other two columns were not bit columns, it makes for a pretty good page-splitting train wreck especially when you also make the mistake of using a 100% Fill Factor. I brought Expedia.com to its knees for about two minutes with just such an index about 7 years ago. It's a lesson I don't look forward to learning a second time although they did find out how well their load-balanced web servers worked under the backup I caused (and they auto-magically recovered very well as soon as I dropped the index).

    I did add a similar index the next day with the same columns but the bit column was the 3rd column that time.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 31 through 40 (of 40 total)

You must be logged in to reply to this topic. Login to reply