Indexing Bit Patterns

  • I have a situation where I am storing responses to careers advice questionaires.

    Questionaires consist of simple tick-box responses.

    A job (as in employment) will have the same questionaire filled in and where a job's questionaire matches an applicants questionaire then the applicant will be shown that job.

    I am storing responses in bit patterns as per Bruce Szabo's article http://www.sqlservercentral.com/columnists/bszabo/storingresponses.asp.

    i.e. Answer 1 = 1, Answer 2 = 2, Answer 3 = 4 etc. so someone who selects 1 and 3 will get a score of 5.

    I am happily bit matching away (XOR, NOT, AND, OR etc) but what I would like to know is whether there is a special way of indexing these scores to speed up the bit matching process?

    It is not so much the exact value matches, it is the "this bit pattern is a subset of that bit pattern".

  • This was removed by the editor as SPAM

  • Not that I know of. Just index the field as you normally would, hope for the best. Its an interesting technique used for a lot of things, we had to build a similar solution at work, opted to store one row per answer. Trade of course, but its decent to work with. Also, our typical access pattern is to pull all answers associated with a survey/question, occasionally for reporting we might filter by answer.

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • The only thing I can imagine is to have the interface determine the combination in question and query for that specific number. I don't know if you have that luxury, though.

    Good luck with it.

    Guarddata-

  • My theory was that since the bitwise AND operator tells you whether or not "this bit pattern is a subset of that bit pattern", an index on the INT field should perform well in this type of query:

    
    
    SELECT
    Field1, Field2
    FROM
    MyTable
    WHERE
    @BitArg = (@BitArg & MyBitField)

    To test this, I ran a couple iterations of the following script:

    
    
    --This is the setup script:
    SET NOCOUNT ON
    --
    DROP TABLE #Binary
    DROP TABLE #Compare
    --
    CREATE TABLE #Binary (BitField INT NOT NULL)
    CREATE TABLE #Compare (FlagField INT NOT NULL)
    --
    INSERT INTO #Binary VALUES (1)
    INSERT INTO #Binary VALUES (512)
    INSERT INTO #Binary VALUES (2048)
    INSERT INTO #Binary VALUES (7168)
    INSERT INTO #Binary VALUES (50176)
    --
    DECLARE @MAX_LOOP INT
    DECLARE @Loop INT
    --
    SET @MAX_LOOP = 10000
    SET @Loop = 1
    --
    WHILE @Loop < @MAX_LOOP BEGIN
    INSERT INTO #Compare VALUES (@Loop)
    SET @Loop = @Loop + 1
    END
    PRINT CONVERT(VARCHAR(12), @Loop) + ' records inserted...'
    SET NOCOUNT OFF
    
    
    --Here is the test script:
    --CREATE CLUSTERED INDEX ncx_Compare ON #Compare (FlagField)
    SET NOCOUNT ON
    --
    DECLARE @IsBitHere INT, @Printable VARCHAR(200), @CountFinds INT
    DECLARE c CURSOR FOR SELECT BitField FROM #Binary
    --
    OPEN c
    FETCH NEXT FROM c INTO @IsBitHere
    WHILE @@FETCH_STATUS <> -1 BEGIN
    /*
    SELECT
    FlagField,
    FlagField & @IsBitHere as "And",
    FlagField | @IsBitHere as "Or",
    FlagField ^ @IsBitHere as "xor"
    FROM #Compare
    */
    SELECT @CountFinds = COUNT(*) FROM #Compare WHERE @IsBitHere = (@IsBitHere & FlagField)
    PRINT 'Found ' + CONVERT(VARCHAR(10), @CountFinds) + ' for bitmask: ' + CONVERT(VARCHAR(10), @IsBitHere)
    FETCH NEXT FROM c INTO @IsBitHere
    END
    --
    CLOSE c
    DEALLOCATE c
    --
    SET NOCOUNT OFF

    I ran iterations using no index, a nonclustered index on FlagField, and a clustered index on FlagField. I was surprised to find that SQL Server declined to use the an index on any of the indexed iterations. You might say, well, SQL will match 5000 of the 10000 test records for the first bitmask (0x01), however looking at the execution plans, SQL Server only estimated finding 26 out of 10000 total records, and still did an index scan on both the nonclustered AND clustered indexes. My guess, is that it would be better, of course, if the FlagField part of the SARG was on the left side of the equation, but my feeble mind this morning couldn't figure out how to reverse the Boolean equation in the SARG (@IsBitHere = @IsBitHere & FlagField) to get FlagField on the left of the equation. Perhaps someone out there can do it...

    Cheers,

    jay

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply