Comparing Two Tables and Comparing Within Each Result

  • Hi, 

    I've been stuck on this for a week now and was wondering if someone could help.  I've gotten this far and just can't get past this small issue. 

    The way our program works is that if a person was changed, it updates the record and also throws the original data into a new table.

    So in this example,  Jon Doe should not show up, because nothing really changed. One of his phone numbers got moved from one phone field to another. Or like Bob, either the same number was in more than one field and the person cleaned it up, or the opposite. They copied and pasted the same phone number into multiple fields!

    Jane should show up because her email changed. Jon and Bob should not, but in my query they are.

    I would really appreciate your help!!!

    CREATE TABLE #Source   (  ID varchar(50), nameFull varchar(50), salutation varchar(50), nameFirst varchar(50), nameLast varchar(50), addressLine1 varchar(50),   addressLine2 varchar(50), addressCity varchar(50), addressState varchar(50), addressZip varchar(50),   phone1full varchar(50), phone2full varchar(50), phone3full varchar(50), phone4full varchar(50), phone5full varchar(50), email1 varchar(50)  )  CREATE TABLE #Target  (pov_Prospect varchar(50),   pov_nameFull varchar(50), pov_salutation varchar(50), pov_nameFirst varchar(50), pov_nameLast varchar(50),   pov_addressLine1 varchar(50), pov_addressLine2 varchar(50), pov_addressCity varchar(50), pov_addressState varchar(50), pov_addressZip varchar(50),   pov_phone1full varchar(50), pov_phone2full varchar(50), pov_phone3full varchar(50), pov_phone4full varchar(50), pov_phone5full varchar(50), pov_email1 varchar(50)  ) 

    -- Jon  INSERT INTO #Source (ID, nameFull, salutation, nameFirst, nameLast, addressLine1, addressLine2, addressCity, addressState, addressZip,   phone1full, phone2full, phone3full, phone4full, phone5full, email1)  VALUES ('1', 'Jon Doe', 'Jon', 'Jon', 'Doe', '1234 West Elm', NULL, 'Chicago', 'IL', '45880',   '1112223333', NULL, '4445556666', NULL, NULL, 'jond@gmail.com');  INSERT INTO #Target  (pov_Prospect, pov_nameFull, pov_salutation, pov_nameFirst, pov_nameLast,   pov_addressLine1, pov_addressLine2, pov_addressCity, pov_addressState, pov_addressZip,   pov_phone1full, pov_phone2full, pov_phone3full, pov_phone4full, pov_phone5full, pov_email1)  VALUES ('1', 'Jon Doe', 'Jon', 'Jon', 'Doe', '1234 West Elm', NULL, 'Chicago', 'IL', '45880',   '1112223333', '4445556666', NULL, NULL, NULL, 'jond@gmail.com'); 

    --Jane  INSERT INTO #Source (ID, nameFull, salutation, nameFirst, nameLast, addressLine1, addressLine2, addressCity, addressState, addressZip,   phone1full, phone2full, phone3full, phone4full, phone5full, email1)  VALUES ('2', 'Jane Smith', 'Jane', 'Jane', 'Smith', '111 East Grove', 'Suite 101', 'Denver', 'CO', '75365',   '8882223333', NULL, NULL, NULL, NULL, 'janes@gmail.com');  INSERT INTO #Target (pov_Prospect, pov_nameFull, pov_salutation, pov_nameFirst, pov_nameLast,   pov_addressLine1, pov_addressLine2, pov_addressCity, pov_addressState, pov_addressZip,   pov_phone1full, pov_phone2full, pov_phone3full, pov_phone4full, pov_phone5full, pov_email1)  VALUES ('2', 'Jane Smith', 'Jane', 'Jane', 'Smith', '111 East Grove', 'Suite 101', 'Denver', 'CO', '75365',   '8882223333', NULL, NULL, NULL, NULL, 'janeR@gmail.com'); 

    --Bob - added after the fact...I didn't know there were so many like Bob  INSERT INTO #Source (ID, nameFull, salutation, nameFirst, nameLast, addressLine1, addressLine2, addressCity, addressState, addressZip,   phone1full, phone2full, phone3full, phone4full, phone5full, email1)  VALUES ('3', 'Bob Doe', 'Bob', 'Bob', 'Doe', '1234 West Elm', NULL, 'Chicago', 'IL', '45880',   '111', NULL, '111', NULL, NULL, 'Bobd@gmail.com');  INSERT INTO #Target  (pov_Prospect, pov_nameFull, pov_salutation, pov_nameFirst, pov_nameLast,   pov_addressLine1, pov_addressLine2, pov_addressCity, pov_addressState, pov_addressZip,   pov_phone1full, pov_phone2full, pov_phone3full, pov_phone4full, pov_phone5full, pov_email1)  VALUES ('3', 'Bob Doe', 'Bob', 'Bob', 'Doe', '1234 West Elm', NULL, 'Chicago', 'IL', '45880',   '111', NULL, NULL, NULL, NULL, 'Bobd@gmail.com');

    ------------------------------------------------------------------------------------------------------------------------------------------------------------------  
    ;with data as (
      SELECT
        s.id,
        ISNULL(s.nameFull, '') nameFull ,
        ISNULL(s.salutation, '') salutation ,
        ISNULL(s.nameFirst, '') nameFirst ,
        ISNULL(s.nameLast, '') nameLast ,
        ISNULL(s.addressLine1, '') addressLine1 ,
        ISNULL(s.addressLine2, '') addressLine2 ,
        ISNULL(s.addressCity, '') addressCity ,
        ISNULL(s.addressState, '') addressState ,
        ISNULL(s.addressZip, '') addressZip ,
        ISNULL(s.phone1full, '') phone1full ,
        ISNULL(s.phone2full, '') phone2full ,
        ISNULL(s.phone3full, '') phone3full ,
        ISNULL(s.phone4full, '') phone4full ,
        ISNULL(s.phone5full, '') phone5full ,
        ISNULL(s.email1, '') email1 ,
        ISNULL(t.pov_nameFull, '') pov_nameFull ,
        ISNULL(t.pov_salutation, '') pov_salutation ,
        ISNULL(t.pov_nameFirst, '') pov_nameFirst ,
        ISNULL(t.pov_nameLast, '') pov_nameLast ,
        ISNULL(t.pov_addressLine1, '') pov_addressLine1 ,
        ISNULL(t.pov_addressLine2, '') pov_addressLine2 ,
        ISNULL(t.pov_addressCity, '') pov_addressCity ,
        ISNULL(t.pov_addressState, '') pov_addressState ,
        ISNULL(t.pov_addressZip, '') pov_addressZip ,
        ISNULL(t.pov_phone1full, '') pov_phone1full ,
        ISNULL(t.pov_phone2full, '') pov_phone2full ,
        ISNULL(t.pov_phone3full, '') pov_phone3full ,
        ISNULL(t.pov_phone4full, '') pov_phone4full ,
        ISNULL(t.pov_phone5full, '') pov_phone5full ,
        ISNULL(t.pov_email1, '') pov_email1

      FROM [#Source]
       INNER JOIN [#Target] [T]
       ON .[ID] = [T].[pov_Prospect]
      WHERE
         ISNULL( s.nameFull, '') <> ISNULL( t.pov_nameFull, '') OR
        ISNULL( s.salutation, '') <> ISNULL( t.pov_salutation, '') OR
        ISNULL( s.nameFirst, '') <> ISNULL( t.pov_nameFirst, '') OR
        ISNULL( s.nameLast, '') <> ISNULL( t.pov_nameLast, '') OR
        ISNULL( s.addressLine1, '') <> ISNULL( t.pov_addressLine1, '') OR
        ISNULL( s.addressLine2, '') <> ISNULL( t.pov_addressLine2, '') OR
        ISNULL( s.addressCity, '') <> ISNULL( t.pov_addressCity, '') OR
        ISNULL( s.addressState, '') <> ISNULL( t.pov_addressState, '') OR
        ISNULL( s.addressZip, '') <> ISNULL( t.pov_addressZip, '') OR
        ISNULL( s.phone1full, '') <> ISNULL( t.pov_phone1full, '') OR
        ISNULL( s.phone2full, '') <> ISNULL( t.pov_phone2full, '') OR
        ISNULL( s.phone3full, '') <> ISNULL( t.pov_phone3full, '') OR
        ISNULL( s.phone4full, '') <> ISNULL( t.pov_phone4full, '') OR
        ISNULL( s.phone5full, '') <> ISNULL( t.pov_phone5full, '') OR
        ISNULL( s.email1, '') <> ISNULL( t.pov_email1, '')

        OR (s.phone1full <> t.pov_phone1full AND s.phone1full <> t.pov_phone2full AND s.phone1full <> t.pov_phone3full AND s.phone1full <> t.pov_phone4full AND s.phone1full <> t.pov_phone5full )
        OR (s.phone2full <> t.pov_phone1full AND s.phone2full <> t.pov_phone2full AND s.phone2full <> t.pov_phone3full AND s.phone2full <> t.pov_phone4full AND s.phone2full <> t.pov_phone5full )
        OR (s.phone3full <> t.pov_phone1full AND s.phone3full <> t.pov_phone2full AND s.phone3full <> t.pov_phone3full AND s.phone3full <> t.pov_phone4full AND s.phone3full <> t.pov_phone5full )
        OR (s.phone4full <> t.pov_phone1full AND s.phone4full <> t.pov_phone2full AND s.phone4full <> t.pov_phone3full AND s.phone4full <> t.pov_phone4full AND s.phone4full <> t.pov_phone5full )
        OR (s.phone5full <> t.pov_phone1full AND s.phone5full <> t.pov_phone2full AND s.phone5full <> t.pov_phone3full AND s.phone5full <> t.pov_phone4full AND s.phone5full <> t.pov_phone5full )
      )

      , fieldsToRows
       AS (SELECT
          id,
          field,
          Value

         FROM data p UNPIVOT
             (value FOR field IN (nameFull, pov_nameFull,
                      salutation, pov_salutation,
                      nameFirst, pov_nameFirst,
                      nameLast, pov_nameLast,
                      addressLine1, pov_addressLine1,
                      addressLine2, pov_addressLine2,
                      addressCity, pov_addressCity,
                      addressState, pov_addressState,
                      addressZip, pov_addressZip,
                      phone1full, pov_phone1full,
                      phone2full, pov_phone2full,
                      phone3full, pov_phone3full,
                      phone4full, pov_phone4full,
                      phone5full, pov_phone5full,
                      email1, pov_email1))
           AS unpvt
        )
        SELECT
           curr.id,
           curr.field,
           curr.value new_value,
           prev.value old_value,
           CASE WHEN curr.value <> prev.value THEN 'Yes' END Changed

      FROM
        fieldsToRows curr
        INNER JOIN fieldsToRows prev
        ON curr.ID = prev.id
          AND curr.field = 'pov_' + prev.field

  • That's one of the difficulties of having multiple columns for the same kind of data in a given row.   If I had this I'd look to re-design it to have a separate table for the phone numbers.   However, that's for another day.   We can develop a query that will expand the data out to one record per phone number, and then compare that information.   How it performs will be dependent on indexing, but I don't have time this afternoon, so if you can start down that road and post back what you come up, we'll be halfway there.   I'll try to check back tomorrow.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Thanks. Unfortunately I don't have control over the data, or access to the program code. I'm not sure I'm understanding what you're saying but I'll start looking into it.  Thanks.  I really appreciate your help. I'm really stuck.

  • Take a look at this and let me know what you think:CREATE TABLE #Source (
        ID varchar(50) NOT NULL PRIMARY KEY CLUSTERED,
        nameFull varchar(50),
        salutation varchar(50),
        nameFirst varchar(50),
        nameLast varchar(50),
        addressLine1 varchar(50),
        addressLine2 varchar(50),
        addressCity varchar(50),
        addressState varchar(50),
        addressZip varchar(50),
        phone1full varchar(50),
        phone2full varchar(50),
        phone3full varchar(50),
        phone4full varchar(50),
        phone5full varchar(50),
        email1 varchar(50)
    );
    CREATE TABLE #Target (
        pov_Prospect varchar(50) NOT NULL PRIMARY KEY CLUSTERED,
        pov_nameFull varchar(50),
        pov_salutation varchar(50),
        pov_nameFirst varchar(50),
        pov_nameLast varchar(50),
        pov_addressLine1 varchar(50),
        pov_addressLine2 varchar(50),
        pov_addressCity varchar(50),
        pov_addressState varchar(50),
        pov_addressZip varchar(50),
        pov_phone1full varchar(50),
        pov_phone2full varchar(50),
        pov_phone3full varchar(50),
        pov_phone4full varchar(50),
        pov_phone5full varchar(50),
        pov_email1 varchar(50)
    );
    INSERT INTO #Source (ID, nameFull, salutation, nameFirst, nameLast, addressLine1, addressLine2, addressCity, addressState, addressZip, phone1full, phone2full, phone3full, phone4full, phone5full, email1)
        -- Jon
        VALUES    ('1', 'Jon Doe', 'Jon', 'Jon', 'Doe', '1234 West Elm', NULL, 'Chicago', 'IL', '45880', '1112223333', NULL, '4445556666', NULL, NULL, 'jond@gmail.com'),
        -- Jane
                ('2', 'Jane Smith', 'Jane', 'Jane', 'Smith', '111 East Grove', 'Suite 101', 'Denver', 'CO', '75365', '8882223333', NULL, NULL, NULL, NULL, 'janes@gmail.com'),
        --Bob - added after the fact...I didn't know there were so many like Bob
                ('3', 'Bob Doe', 'Bob', 'Bob', 'Doe', '1234 West Elm', NULL, 'Chicago', 'IL', '45880', '111', NULL, '111', NULL, NULL, 'Bobd@gmail.com');

    INSERT INTO #Target (pov_Prospect, pov_nameFull, pov_salutation, pov_nameFirst, pov_nameLast, pov_addressLine1, pov_addressLine2, pov_addressCity, pov_addressState, pov_addressZip, pov_phone1full, pov_phone2full, pov_phone3full, pov_phone4full, pov_phone5full, pov_email1)
        -- Jon
        VALUES    ('1', 'Jon Doe', 'Jon', 'Jon', 'Doe', '1234 West Elm', NULL, 'Chicago', 'IL', '45880', '1112223333', '4445556666', NULL, NULL, NULL, 'jond@gmail.com'),
        -- Jane
                ('2', 'Jane Smith', 'Jane', 'Jane', 'Smith', '111 East Grove', 'Suite 101', 'Denver', 'CO', '75365', '8882223333', NULL, NULL, NULL, NULL, 'janeR@gmail.com'),
        --Bob - added after the fact...I didn't know there were so many like Bob
                ('3', 'Bob Doe', 'Bob', 'Bob', 'Doe', '1234 West Elm', NULL, 'Chicago', 'IL', '45880', '111', NULL, NULL, NULL, NULL, 'Bobd@gmail.com');

    ------------------------------------------------------------------------------------------------------------------------------------------------------------------
    WITH ROW_NUMBERS AS (

        SELECT 1 AS RN
        UNION ALL
        SELECT 2
        UNION ALL
        SELECT 3
        UNION ALL
        SELECT 4
        UNION ALL
        SELECT 5
    ),
        EXPANDED_SOURCE AS (

            SELECT S.ID,
                S.nameFull,
                S.salutation,
                S.nameFirst,
                S.nameLast,
                S.addressLine1,
                S.addressLine2,
                S.addressCity,
                S.addressState,
                S.addressZip,
                MAX(
                    CASE R.RN
                        WHEN 1 THEN S.phone1full
                        WHEN 2 THEN S.phone2full
                        WHEN 3 THEN S.phone3full
                        WHEN 4 THEN S.phone4full
                        WHEN 5 THEN S.phone5full
                    END
                    ) AS PhoneNumber,
                S.email1
            FROM #Source AS S
                CROSS APPLY ROW_NUMBERS AS R
            WHERE
                CASE R.RN
                    WHEN 1 THEN S.phone1full
                    WHEN 2 THEN S.phone2full
                    WHEN 3 THEN S.phone3full
                    WHEN 4 THEN S.phone4full
                    WHEN 5 THEN S.phone5full
                END IS NOT NULL
            GROUP BY S.ID,
                S.nameFull,
                S.salutation,
                S.nameFirst,
                S.nameLast,
                S.addressLine1,
                S.addressLine2,
                S.addressCity,
                S.addressState,
                S.addressZip,
                S.email1,
                CASE R.RN
                    WHEN 1 THEN S.phone1full
                    WHEN 2 THEN S.phone2full
                    WHEN 3 THEN S.phone3full
                    WHEN 4 THEN S.phone4full
                    WHEN 5 THEN S.phone5full
                END
    ),
        EXPANDED_TARGET AS (

            SELECT T.pov_Prospect,
                T.pov_nameFull,
                T.pov_salutation,
                T.pov_nameFirst,
                T.pov_nameLast,
                T.pov_addressLine1,
                T.pov_addressLine2,
                T.pov_addressCity,
                T.pov_addressState,
                T.pov_addressZip,
                MAX(
                    CASE R.RN
                        WHEN 1 THEN T.pov_phone1full
                        WHEN 2 THEN T.pov_phone2full
                        WHEN 3 THEN T.pov_phone3full
                        WHEN 4 THEN T.pov_phone4full
                        WHEN 5 THEN T.pov_phone5full
                    END
                    ) AS PhoneNumber,
                T.pov_email1
            FROM #Target AS T
                CROSS APPLY ROW_NUMBERS AS R
            WHERE
                CASE R.RN
                    WHEN 1 THEN T.pov_phone1full
                    WHEN 2 THEN T.pov_phone2full
                    WHEN 3 THEN T.pov_phone3full
                    WHEN 4 THEN T.pov_phone4full
                    WHEN 5 THEN T.pov_phone5full
                END IS NOT NULL
            GROUP BY T.pov_Prospect,
                T.pov_nameFull,
                T.pov_salutation,
                T.pov_nameFirst,
                T.pov_nameLast,
                T.pov_addressLine1,
                T.pov_addressLine2,
                T.pov_addressCity,
                T.pov_addressState,
                T.pov_addressZip,
                T.pov_email1,
                CASE R.RN
                    WHEN 1 THEN T.pov_phone1full
                    WHEN 2 THEN T.pov_phone2full
                    WHEN 3 THEN T.pov_phone3full
                    WHEN 4 THEN T.pov_phone4full
                    WHEN 5 THEN T.pov_phone5full
                END
    )
    SELECT
        CASE
            WHEN S.ID = T.pov_Prospect
                AND ISNULL(S.nameFull, '') = ISNULL(T.pov_nameFull, '')
                AND ISNULL(S.salutation, '') = ISNULL(T.pov_salutation, '')
                AND ISNULL(S.nameFirst, '') = ISNULL(T.pov_nameFirst, '')
                AND ISNULL(S.nameLast, '') = ISNULL(T.pov_nameLast, '')
                AND ISNULL(S.addressLine1, '') = ISNULL(T.pov_addressLine1, '')
                AND ISNULL(S.addressLine2, '') = ISNULL(T.pov_addressLine2, '')
                AND ISNULL(S.addressCity, '') = ISNULL(T.pov_addressCity, '')
                AND ISNULL(S.addressState, '') = ISNULL(T.pov_addressState, '')
                AND ISNULL(S.addressZip, '') = ISNULL(T.pov_addressZip, '')
                AND S.PhoneNumber = T.PhoneNumber
                AND ISNULL(S.email1, '') = ISNULL(T.pov_email1, '')
                    THEN 'MATCH'
            ELSE 'NON-MATCH'
        END AS MATCH,
        S.ID,
        S.nameFull,
        S.salutation,
        S.nameFirst,
        S.nameLast,
        S.addressLine1,
        S.addressLine2,
        S.addressCity,
        S.addressState,
        S.addressZip,
        S.PhoneNumber,
        S.email1,
        T.pov_Prospect,
        T.pov_nameFull,
        T.pov_salutation,
        T.pov_nameFirst,
        T.pov_nameLast,
        T.pov_addressLine1,
        T.pov_addressLine2,
        T.pov_addressCity,
        T.pov_addressState,
        T.pov_addressZip,
        T.PhoneNumber,
        T.pov_email1
    FROM EXPANDED_SOURCE AS S
        FULL OUTER JOIN EXPANDED_TARGET AS T
            ON S.ID = T.pov_Prospect
            AND S.PhoneNumber = T.PhoneNumber
    ORDER BY ISNULL(S.ID, T.pov_Prospect), ISNULL(S.PhoneNumber, T.PhoneNumber);

    DROP TABLE #Source;
    DROP TABLE #Target;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Thank you so much for your help! I think this is close.  The only thing I see is that it's not indicating what field changed and that it needs to be pivoted, which isn't a big deal.  I'll tinker with it some and get back to you.  Thanks again, I really appreciate it.

  • NKKN - Wednesday, March 28, 2018 3:16 PM

    Thank you so much for your help! I think this is close.  The only thing I see is that it's not indicating what field changed and that it needs to be pivoted, which isn't a big deal.  I'll tinker with it some and get back to you.  Thanks again, I really appreciate it.

    Glad I could help.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply