How to find duplicate email addresses between 3 tables

  • Guys hi,

    i need your help on this.

    I have 3 tables accountbase, leadbase, and contact base.  Each of these tables has three email addresses. Accountbase table, has 2 columns accountname, emailaddress1. Leadbase table also has 2 columns. Companyname (similar to accountbase.name) and emailaddress1. Hence, contactbase table also has 2 columns, the Fullname, and emailaddress1. So far so good.

    I want to find the duplicate email addresses. By the term duplicate i mean that i want to find if an emaladdress belongs to more than 1 account (or lead, or contact), but also if an email address belongs to an account and a lead, or to an account and a contact, or to a lead and a contact. In other words not only duplicate emails between the same entity, but also duplicate emails between the 3 different entities. 

    The code i have written to find the duplicate email addresses between the same entity (in our example account) is listed here. The same code works to find duplicate emails betweem the entity lead, or the entity contacts.

    SELECT

     NAME

     , ACBASE.EMAILADDRESS1

     , COUNTEMAILS

    FROM

     DBO.ACCOUNTBASE ACBASE

     INNER JOIN  

      ( 

      SELECT

       EMAILADDRESS1,

       COUNT(EMAILADDRESS1) AS COUNTEMAILS

      FROM

       DBO.ACCOUNTBASE

      GROUP BY

       EMAILADDRESS1

      HAVING COUNT(EMAILADDRESS1)>1

     &nbsp A

      ON A.EMAILADDRESS1 = ACBASE.EMAILADDRESS1

    ORDER BY COUNTEMAILS DESC, ACBASE.EMAILADDRESS1

    As I said  the above code used also for leads and contacts i can find if two or more emails are between the SAME TABLE. And the union of the 3 similar queries, gives me results as regarding each entity. However, how can i find if an emailaddress, is contained also in the table  account, and/or in the table lead, and/or in the table contact?

    A smart code please! :-)))) 

     


    "If you want to get to the top, prepare to kiss alot of bottom"

  • I'm assuming it is SQL 2000.(no CTE

    SELECT A.Address, C = COUNT(*) INTO #T FROM

    (

    SELECT Name = accountname, Address = emailaddress1 FROM Accountbase

    UNION ALL SELECT Companyname, emailaddress1 FROM Leadbase

    UNION ALL SELECT Fullname, emailaddress1 FROM contactbase

    ) A

    GROUP BY A.Address

    HAVING COUNT(*) > 1

    SELECT TableType = 'A', Name = accountname, Address = emailaddress1 FROM Accountbase WHERE emailaddress1 IN (SELECT Z.Address FROM #T Z)

    UNION ALL SELECT 'L', Companyname, emailaddress1 FROM Leadbase WHERE emailaddress1 IN (SELECT Z.Address FROM #T Z)

    UNION ALL SELECT 'C', Fullname, emailaddress1 FROM contactbase WHERE emailaddress1 IN (SELECT Z.Address FROM #T Z)

    DROP TABLE #T

    K. Matsumura

  • Maybe this is better

    SELECT A.Address, B.accountname, B.emailaddress1, C.Companyname, C.emailaddress1, D.Fullname, D.emailaddress1

    FROM #T A

    LEFT OUTER JOIN Accountbase B ON B.emailaddress1 = A.Address

    LEFT OUTER JOIN Leadbase C ON C.emailaddress1 = A.Address

    LEFT OUTER JOIN contactbase D ON D.emailaddress1 = A.Address

    K. Matsumura

  • Thank you very much mate, let me check them, and get back with info! 🙂


    "If you want to get to the top, prepare to kiss alot of bottom"

  • I liked more the one previous the last you send! thank you very much! 🙂


    "If you want to get to the top, prepare to kiss alot of bottom"

  • Curious...  Why do you have 3 email tables?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This should do it, and it shows the breakout by type.

    select
     EMAIL_ADDRESS,
     ACCOUNT_COUNT= sum(ACCOUNT_COUNT),
     LEAD_COUNT= sum(LEAD_COUNT),
     CONTACT_COUNT= sum(CONTACT_COUNT),
     TOTAL_COUNT  = sum(ACCOUNT_COUNT+LEAD_COUNT+CONTACT_COUNT)
    from
     (
     select
      EMAIL_ADDRESS = EMAILADDRESS1,
      ACCOUNT_COUNT = count(*),
      LEAD_COUNT = 0,
      CONTACT_COUNT = 0 
     from
      ACCOUNTBASE
     group by
      EMAILADDRESS1
     union all
     select
      EMAIL_ADDRESS = EMAILADDRESS,
      ACCOUNT_COUNT = 0,
      LEAD_COUNT = count(*),
      CONTACT_COUNT = 0
     from
      LEADS
     group by
      EMAILADDRESS 
     union all
     select
      EMAIL_ADDRESS = EMAILADDRESS,
      ACCOUNT_COUNT = 0,
      LEAD_COUNT = 0,
      CONTACT_COUNT = count(*)
     from
      CONTACTS
     group by
      EMAILADDRESS
     ) a
    group by
     EMAIL_ADDRESS
    having
     sum(ACCOUNT_COUNT+LEAD_COUNT+CONTACT_COUNT) > 1
    order by
     EMAIL_ADDRESS
    

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply