Comparing if two sets of rows is different

  • Hi folks

    I need some more inspiration, on how I can compare a set of data with another set of data and returning the sets, which isn't identical.

    I have found two solutions, but there might be better ways of doing this. I need it to perform on large datasets.

    CREATE TABLE A (ID INT, Firstname SYSNAME, LastName SYSNAME, hashcode AS isnull(binary_checksum(Firstname,LastName),0))

    CREATE TABLE B (ID INT, Firstname SYSNAME, LastName SYSNAME, hashcode AS isnull(binary_checksum(Firstname,LastName),0))

    INSERT A VALUES (1, 'Michael', 'Sondergaard')

    INSERT A VALUES (1, 'Niels', 'Holger')

    INSERT A VALUES (1, 'Michael', 'Quist')

    INSERT B VALUES (1, 'Michael', 'Vestergaard')

    INSERT B VALUES (1, 'Niels', 'Holger')

    INSERT B VALUES (1, 'Michael', 'Quist')

    INSERT A VALUES (2, 'Ruben', 'Andersen')

    INSERT A VALUES (2, 'Martin', 'Buhl')

    INSERT A VALUES (2, 'Jesper', 'Jensen')

    INSERT B VALUES (2, 'Ruben', 'Andersen')

    INSERT B VALUES (2, 'Martin', 'Buhl')

    INSERT B VALUES (2, 'Jesper', 'Jensen')

    As you can see in this test. the set 1 in table A is different from set 1 in table B. Because of the last name for Michael. Set 2 is the same in both table A and B

    I have found the following solutions. Solutions 1 seems to be an non performer on large datasets However solution 2 have a theoretically possiblity of 2 sets being different, but reported as equal

    Solution #1

    SELECT ID

    FROM

    (

    SELECT * FROM A

    EXCEPT

    SELECT * FROM B

    UNION

    SELECT * FROM B

    EXCEPT

    SELECT * FROM A

    ) X

    Solution #2

    SELECT ID = X.ID

    FROM

    (SELECT ID, SetHashCode = CHECKSUM_AGG(HashCode) FROM A GROUP BY ID) X

    JOIN

    (SELECT ID, SetHashCode = CHECKSUM_AGG(HashCode) FROM B GROUP BY ID) Y ON

    X.ID = Y.ID AND

    X.SetHashCode != Y.SetHashCode

  • Duplicate post. No replies to this thread please. Direct replies to:http://www.sqlservercentral.com/Forums/Topic723118-338-1.aspx

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply