March 11, 2010 at 2:39 pm
I've been programming for over 20 years. I confess that some of my programming habits are from when I began. The particular example I'm thinking of now is when I've needed to reflect multiple possible states of something, I'll define a field as a int and then use bit-OR's to put in all of the relevant possible states that apply to that item of that row. I'll use an illustration to make my point. In Unix all files, directories and so on have permissions applied to them. When I worked with Unix the permissions of files and folders was represented by an octal, something like 0x077. You knew what that meant, by applying a bit-AND to the file/folder's permission to determine what permissions were allowed. I don't know this for a fact, but I suspect that Windows uses very much the same idea; I would think that somewhere in Windows any file/folder's permissions is ultimately stored as an int somewhere, and that int is parsed to represent to the user whether the file has ReadOnly, etc, permissions applied.
I've been in my current job for almost 14 years, and I've used this paradigm to represent different states. Recently we had to develop a new system for handling telephone screenings. There aren't many tables involved. It was given to a junior programmer to work on, and I was assigned the task of helping her. Our supervisor wanted us to use the idea that I've just described, to track what we called "not eligible" reasons for screening. (It's a business term in our company.) He wanted a single table, which would list all possible "not eligible" reasons and then within the data table we would bit-OR all of the reasons and caller may have been found to be ineligible and store them into 1 field. Here is the 2 relevant table definitions with most of the fields removed so that only the relevant fields will be there:
--This is the primary table, which has data stored in it.
CREATE TABLE [dbo].[ScreenQuestionnaire](
[ScreenID] [int] IDENTITY(1,1) NOT NULL,
[ClientID] [int] NOT NULL,
[PhoneScreenDateTime] [datetime] NOT NULL CONSTRAINT
[DF_ScreenQuestionnaire_PhoneScreenDateTime] DEFAULT (getdate()),
[NotEligible] [int] NOT NULL CONSTRAINT
[DF_ScreenQuestionnaire_NotEligible] DEFAULT ((0)),
[ScreenerComments] [varchar](max) NULL,
[Screener] [varchar](50) NOT NULL,
[HasSAReferred] [bit] NULL,
[Insur_Exception] [bit] NULL,
CONSTRAINT [PK_ScreenQuestionnaire] PRIMARY KEY CLUSTERED
(
[ScreenID] ASC,
[ClientID] ASC
)
--This is the codes table, which comprise all possible codes that can
--be stored in ScreenQuestionnaire.NotEligible.
CREATE TABLE [dbo].[ScreenNotEligibleCodes](
[NotEligible] [int] NOT NULL,
[Description] [varchar](255) NOT NULL,
[StartEffectiveDate] [datetime] NOT NULL,
[EndEffectiveDate] [datetime] NOT NULL CONSTRAINT
[DF_ScreenNotEligibleCodes_EndEffectiveDate] DEFAULT (((2100)-(12))-(31)),
CONSTRAINT [PK_ScreenNotEligibleCodes] PRIMARY KEY CLUSTERED
(
[NotEligible] ASC
)
The ScreenNotEligibleCodes table has things like "Court Date Eminent" and "Caller Earnings too high" in it.
My issue is that, no matter how much I try, I cannot convey to the junior programmer the whole concept of what our past practice has been to do. At least I think that's the issue. You see, several weeks ago she wanted us to add a new column to the ScreenQuestionnaire table to cover the screening out condition of "Has SA Referral". I argued that it wasn't necessary, since that condition already existed in the ScreenNotEligibileCodes table and that all you had to do was get the value in the ScreenNotEligibileCodes table that represents "Has SA Referral" and perform a bit-AND. If the result is >0, then that caller was screened out due to "Has SA Referral". However, if you'll look at the ScreenQuestionnaire table's definition, you'll see that I've lost that argument, as there is a bit field there now which represents that value. In effect, we're storing the value twice; now in the bit field and also in the int field, as the value that comes from the ScreenNotEligibileCodes table. I think that some logic could be written, with the help of views, to help break out the meaning of individual "not eligible" screen outs for each ClientID and ScreenID. I also feel that doing bit-OR'ing of all possible screening out conditions allows us to be able to add new screen out conditions without having to change the table schema.
OK, basically I've given you my side of the story. I want to be fair and try and give hers. This application is written in ASP.NET and we use data binding. The simplest way to set, and save, data with the data bound controls on an ASP.NET page, is if there is 1 field in the table which represents the meaning of what you're trying to reflect on the page. One field for each possible screen out.
I hope I've been clear, and more importantly, I hope that I've fairly reflected my colleagues point of view. What I'd like advise on, from this group, is to know if maybe my method of approaching saving data is, well antiquated? I really think the issue, for my colleague is that she doesn't get bit arithmetic. But perhaps my approach is just old fashion. Certainly, the way I store data makes getting it out again and showing it to the end user, more complicated. Her approach is straight forward.
Kindest Regards, Rod Connect with me on LinkedIn.
March 11, 2010 at 2:52 pm
Bit conditions should be stored in bit columns. One field for each condition. Two reasons.
Storing multiple values in a single field is a violation of First Normal Form. All columns must store atomic values. Storing bits in an int is no different from storing comma-delimited lists in a varchar
If the bits are stored in an int column and calculated out, indexes on (or containing) that bit column are less useful.
It also makes updates harder. You can't simply say update set column = value. You've got to get the existing value, manipulate the new bit, then set it. Not to mention it's far harder to read when querying the DB directly. Sure, all access may be through your app now, that doesn't mean that a reporting system won't get added in a coupl months
Boolean logic's fine in front end applications, it has no place in a database. Single columns. Single values.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 11, 2010 at 3:08 pm
Gail,
I appreciate this feedback, and especially from you. This is the sort of thing that I've got to hear, so that I can see how better to architect solutions in the future.
If others have something to add, then please do so.
Kindest Regards, Rod Connect with me on LinkedIn.
March 11, 2010 at 9:58 pm
Gosh... the FNF thing notwithstanding, I guess I'd have to say "It Depends". BIT datatypes can't be aggregated (sometimes useful for getting a count using a SUM of items that have a BIT set) and will require an explicit conversion to do so. While it's inconvenient for humans to read, an INT with 8 bit-places can be indexed and effectively used if you want to find the whole number produced by bits set. On the other hand, it is convenient for humans to figure out what a bit in a column called IsActive means. On the other hand, in code, you can enumerate what each bit means using intelligently name variables, etc.
I have used BIT, TinyInt (even for just 0/1), and INT for "bits" of information for different reasons and different requirements. Each method has advantages and disadvantages.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply