How to compare data in customer table with other customers to find related cust

  • That is correct Phil, Yes, I change my mind about customer being in more than one group. based on it can you please help. Thank you

  • OK, then ordering of the tests becomes important. Please specify the order in which the tests should be applied.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Ok great. please  with email first then phone nos second  then addr&city (combined) last

     

    please let me know if you need any more info. Thank you

  • I mean in addr & city the combination has to match.

    also please not all columns values are nullable so some times there is missed values in fields

    Thank you Phil.

  • I've done the emails, the phone numbers and unmatched in the code that follows. You can add the rest yourself.

    --Rule 1 - E-mail match
    WITH AllEmails
    AS (SELECT c.Custno
    ,Email = c.email1
    FROM #Customers1 c
    UNION --Use instead of UNION ALL to make the results distinct
    SELECT c.Custno
    ,Email = c.email2
    FROM #Customers1 c)
    ,matches
    AS (SELECT a.Email
    ,ct = COUNT (*)
    FROM AllEmails a
    GROUP BY a.Email
    HAVING COUNT (*) > 1)
    INSERT #Grouped
    (
    GroupId
    ,CustomerId
    )
    SELECT GroupId = 1
    ,c.Custno
    FROM #Customers1 c
    CROSS JOIN matches m
    WHERE c.email1 = m.Email
    OR c.email2 = m.Email;

    --Rule 2 - any phone number matches
    WITH AllPhones
    AS (SELECT c.Custno
    ,Phone = c.Res_Phone
    FROM #Customers1 c
    UNION --Use instead of UNION ALL to make the results distinct
    SELECT c.Custno
    ,Phone = c.Bus_Phone
    FROM #Customers1 c
    UNION
    SELECT c.Custno
    ,Phone = c.Fax_Phone
    FROM #Customers1 c
    UNION
    SELECT c.Custno
    ,Phone = c.Marine_Phone
    FROM #Customers1 c
    UNION
    SELECT c.Custno
    ,Phone = c.Pager_Phone
    FROM #Customers1 c
    UNION
    SELECT c.Custno
    ,Phone = c.Other_Phone
    FROM #Customers1 c)
    ,matches
    AS (SELECT a.Phone
    ,ct = COUNT (*)
    FROM AllPhones a
    GROUP BY a.Phone
    HAVING COUNT (*) > 1)
    INSERT #Grouped
    (
    GroupId
    ,CustomerId
    )
    SELECT GroupId = 2
    ,c.Custno
    FROM #Customers1 c
    CROSS JOIN matches m
    WHERE(
    c.Res_Phone = m.Phone
    OR c.Bus_Phone = m.Phone
    OR c.Fax_Phone = m.Phone
    OR c.Marine_Phone = m.Phone
    OR c.Pager_Phone = m.Phone
    OR c.Other_Phone = m.Phone
    )
    AND NOT EXISTS
    (
    SELECT 1 FROM #Grouped g WHERE g.CustomerId = c.Custno
    );

    --Add other validation rules here

    --And finally, assign group numbers to any customers who did not match any of the rules
    INSERT #Grouped
    (
    GroupId
    ,CustomerId
    )
    --Edit the 10 in the following line to assign unused GroupIds
    SELECT GroupId = 10 + ROW_NUMBER () OVER (ORDER BY c.Custno)
    ,CustomerId = c.Custno
    FROM #Customers1 c
    WHERE NOT EXISTS
    (
    SELECT 1 FROM #Grouped g WHERE g.CustomerId = c.Custno
    );

    SELECT *
    FROM #Grouped g;

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Thank you Phil. You the best one quick Q for you, I have data about 1 mill plus so in this case any ideas how to reduce effect of it on processing? please

     

    Thank you again

    sita

  • Actually let me rephrase the question is there any way can I do with temp tables by splitting it each set I think union taking to much for my process for more than an hour and I am cancelling it 🙁 it would be good if I can break it up insert / update each step ? can you please split it if possible. Thank you very much Phil

  • Dear Phil Morning,

    Here I split the code loaded in temp tables but here in output I see groupid getting 1 for all customers.

     

    What I am trying to do is combined groupid (just a number) for only matched customers,

     

    for example cust 203454 has email1 as abc@gmail.com and cust 494033 has email2 as abc@gmaill.com then these two customers group id number is same even though customer no are different (as already exist)

    so expected output is

    CustID        GroupID

    203454         1001

    494033         1001

    ----------         1002

    ----------         1002

    ----------         1002

     

    here groupid 1002 had three customers that shares same email.

     

    Sorry for the confusion it may have caused in my initial request, can you please help

     

    Thank you

  • Are you saying that my code does not work this way?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Good Morning Phil.

    That is correct phil, it is assigning unique groupid for each customer. that has matching emails customers as well. Apologize for delay in response. Thank you for helping out

     

  • I am trying this one Phil, as I have 1 million some how the OR clause taking too much time

    ---------------------------------------------------

    DROP TABLE IF EXISTS #Grouped;

    DROP TABLE IF EXISTS #AllEmails;

    DROP TABLE IF EXISTS #Matches;

    CREATE TABLE #Grouped

    (

    GroupId INT NOT NULL,

    CustomerId varchar(50) NOT NULL,

    PRIMARY KEY CLUSTERED(GroupId, CustomerId)

    WITH(IGNORE_DUP_KEY = ON)

    );

    -- Create temporary table to store all emails

    CREATE TABLE #AllEmails (Custno varchar(50), Email VARCHAR(500));

    -- Populate the temporary table with emails from #Customers1

    INSERT INTO #AllEmails (Custno, Email)

    SELECT Custno, Email

    FROM #Customers1 WHERE EMAIL LIKE '%@%';

    INSERT INTO #AllEmails (Custno, Email)

    SELECT Custno, Email2

    FROM #Customers1 WHERE EMail2 LIKE '%@%';

    -- Rule 1 - E-mail match

    WITH Matches AS (

    SELECT Email, COUNT(*) AS ct

    FROM #AllEmails

    GROUP BY Email

    HAVING COUNT(*) > 1

    )

    INSERT INTO #Grouped (GroupId, CustomerId)

    SELECT 1, c.Custno

    FROM #Customers1 c

    JOIN Matches m ON c.Email = m.Email

    UNION

    SELECT 1, c.Custno

    FROM #Customers1 c

    JOIN Matches m ON c.Email2 = m.Email;

    -- Select data from #Grouped

    SELECT GroupId, count(*) FROM #Grouped where groupid <> 1 group by groupid order by 2 desc;

    INSERT #Grouped

    (

    GroupId

    ,CustomerId

    )

    --Edit the 10 in the following line to assign unused GroupIds

    SELECT GroupId = 10 + ROW_NUMBER () OVER (ORDER BY c.Custno)

    ,CustomerId = c.Custno

    FROM #Customers1 c

    WHERE NOT EXISTS

    (

    SELECT 1 FROM #Grouped g WHERE g.CustomerId = c.Custno

    );

    -- Drop temporary tables

    DROP TABLE IF EXISTS #AllEmails;

    DROP TABLE IF EXISTS #Matches;

    DROP TABLE IF EXISTS #Grouped;

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply