Checksum vs Binary_Checksum

  • Can somebody explain the difference between checksum and binary_checksum and which one I should use? I was generating binary checksums for about 12 million rows, and the rows have 6 fields, and I had two rows which returned the exact same checksum result.... now I'm not sure what I should use, or should I use something outside of sql....

    select binary_checksum('71B60E5A-96FA-47AD-98C3-C05D4753C61200-14-22-e3-85-9ca83c3f933V07Z813V07Z81motherboard')

    select binary_checksum('509709D6-9590-4EF8-934F-8618779C079E00-22-69-35-6c-73407aa248LUS050B01883304D6B2535compmotherboard')

    HELP !

  • MR (4/8/2009)


    Can somebody explain the difference between checksum and binary_checksum and which one I should use?

    Use for what?

    _____________
    Code for TallyGenerator

  • I have a table with 6 fields, and given 6 values, I have to find the row that matches at least 4 values - 5 or 6 are ok, too - and if I have a row that matches 6, then I pick that over the row that matches 4. So I was matching them based on the possible combinations of 6 out of 6 and then 5 out of 6, and then 4 out 6. I was doing this based on taking the values and doing a comparison by doing a binary checksum. The combination of the values of the 6 fields in the table are unique. There is about 500K rows in this table and it is expected to grow another 500K in the next 6 months.... performance is very important.... but now I am seeing that checksums with different string values return the same result....and thats not going to work :-/...

  • MR (4/8/2009)


    I have a table with 6 fields, and given 6 values, I have to find the row that matches at least 4 values

    Which values?

    Some values given from outside application?

    Values from a row in the same table?

    _____________
    Code for TallyGenerator

  • Values from the application that will be sent to the DB to do the lookup....

  • Does the order matter?

    Can values be repeated in columns/variables?

    O it's like in a lotto: variables are the winning numbers and table contains all registered tickets?

    _____________
    Code for TallyGenerator

  • The order doesn't matter - we just need the matching values...

  • What about repeating values - is it possible in table and/or variables?

    _____________
    Code for TallyGenerator

  • The combination of the 6 values in the table is unique. In the values from the application, the values may be repeated, but there are 6 distinct parameters coming, and I know each one - and each one only maps to one field in the DB...

  • So, it's a lotto, or just like lotto, right?

    If so, and i "performance is very important" you need to take care about data normalization.

    CREATE TABLE Ticket (

    _____________
    Code for TallyGenerator

  • So, it's a lotto, or just like lotto, right?

    If so, and i "performance is very important" you need to take care about data normalization.

    -- Normalized data storage for registered tickets

    CREATE TABLE dbo.TicketNumbers (

    TicketID int NOT NULL,

    Number int NOT NULL, -- or varchar, or whatever else matches the type of your answers

    PRIMARY KEY (TicketID, Number),

    UNIQUE (Number, TicketID)

    )

    -- table for all "winning" numbers to have single point of access for all of them

    DECLARE @WinNumbers TABLE (

    Number int NOT NULL,

    PRIMARY KEY (Number)

    )

    INSERT INTO @WinNumbers (Number)

    SELECT @Var1

    UNION

    SELECT @Var2

    UNION

    SELECT @Var3

    UNION

    SELECT @Var4

    UNION

    SELECT @Var5

    UNION

    SELECT @Var6

    -- By using JOIN on Number columns we leave only those rows from each ticket which match "winning" numbers

    SELECT T.TicketID, COUNT(W.Number) as Matches

    FROM dbo.TicketNumbers T

    INNER JOIN @WinNumbers W ON T.Number = W.Number

    GROUP BY T.TicketID

    HAVING COUNT(W.Number) >= 4 -- or whatever threshold you've got

    ORDER BY COUNT(W.Number) DESC

    -- ORDER BY brings the tickets with max number of matches on top

    Don't add any other columns to dbo.TicketNumbers.

    All ticket profiles must be stored in a separate table with auto-incremental TicketID.

    You probably already have such table.

    The script is untested, please correct typos yourself.

    _____________
    Code for TallyGenerator

  • []

  • Only one of them. 🙂

    _____________
    Code for TallyGenerator

  • Do you know where @winNumbers gets stored when it is declared as 'TABLE' - I was actually under the impression that only worked in sql server 2005, but it does work in 2000 also - in which case, I would be interested in knowing the performance hit on writing the table to memory/tempDB, because I will get about 10/15 hits per second on this, and I have about 500K tickets....

  • It's not so hard to set up a test...

    _____________
    Code for TallyGenerator

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply