How to find duplicate email addresses between 3 tables

Dfalir

SSCrazy

Points: 2486
June 1, 2007 at 2:36 am

#116985

Hi guys,
I need your help on this.
I have 3 tables: accountbase, leadbase, and contactbase. Each of these three tables holds email addresses. The accountbase table has 2 columns, accountname and emailaddress1. The leadbase table also has 2 columns, Companyname (similar to accountbase.name) and emailaddress1. Likewise, the contactbase table has 2 columns, Fullname and emailaddress1. So far so good.
I want to find the duplicate email addresses. By duplicate I mean that I want to find whether an email address belongs to more than 1 account (or lead, or contact), but also whether an email address belongs to an account and a lead, or to an account and a contact, or to a lead and a contact. In other words, not only duplicate emails within the same entity, but also duplicate emails across the 3 different entities.
The code I have written to find the duplicate email addresses within a single entity (in our example, account) is listed here. The same code works to find duplicate emails within the lead entity or the contact entity.
SELECT
NAME
, ACBASE.EMAILADDRESS1
, COUNTEMAILS
FROM
DBO.ACCOUNTBASE ACBASE
INNER JOIN
  (
  SELECT
   EMAILADDRESS1,
   COUNT(EMAILADDRESS1) AS COUNTEMAILS
  FROM
   DBO.ACCOUNTBASE
  GROUP BY
   EMAILADDRESS1
  HAVING COUNT(EMAILADDRESS1)>1
  ) A
  ON A.EMAILADDRESS1 = ACBASE.EMAILADDRESS1
ORDER BY COUNTEMAILS DESC, ACBASE.EMAILADDRESS1
As I said, the same code also works for leads and contacts, so I can find whether two or more identical emails exist within the SAME TABLE. The union of the 3 similar queries gives me results for each entity separately. However, how can I find whether an email address also appears in the account table, and/or the lead table, and/or the contact table?
A smart code please! :-))))

"If you want to get to the top, prepare to kiss alot of bottom"

Koji Matsumura SSCrazy Points: 2241 · Answer 1

I'm assuming it is SQL 2000 (no CTE).

-- Step 1: collect every address that occurs more than once across the three tables
SELECT A.Address, C = COUNT(*) INTO #T FROM
(
SELECT Name = accountname, Address = emailaddress1 FROM Accountbase
UNION ALL SELECT Companyname, emailaddress1 FROM Leadbase
UNION ALL SELECT Fullname, emailaddress1 FROM contactbase
) A
GROUP BY A.Address
HAVING COUNT(*) > 1

-- Step 2: list the rows behind each duplicated address, tagged by source table
SELECT TableType = 'A', Name = accountname, Address = emailaddress1 FROM Accountbase WHERE emailaddress1 IN (SELECT Z.Address FROM #T Z)
UNION ALL SELECT 'L', Companyname, emailaddress1 FROM Leadbase WHERE emailaddress1 IN (SELECT Z.Address FROM #T Z)
UNION ALL SELECT 'C', Fullname, emailaddress1 FROM contactbase WHERE emailaddress1 IN (SELECT Z.Address FROM #T Z)

DROP TABLE #T

K. Matsumura
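
For anyone who wants to try the first step without touching real data, here is a minimal sketch. The temp tables and every row in them are hypothetical sample data; only the table and column names follow the question. It sticks to syntax that should work on SQL 2000.

```sql
-- Hypothetical sample data in temp tables, so no real table is touched.
CREATE TABLE #Accountbase (accountname varchar(100), emailaddress1 varchar(100))
CREATE TABLE #Leadbase    (Companyname varchar(100), emailaddress1 varchar(100))
CREATE TABLE #contactbase (Fullname    varchar(100), emailaddress1 varchar(100))

INSERT INTO #Accountbase (accountname, emailaddress1)
SELECT 'Acme', 'info@acme.example'
UNION ALL SELECT 'Acme East', 'info@acme.example'      -- duplicate inside one table
UNION ALL SELECT 'Globex', 'sales@globex.example'

INSERT INTO #Leadbase (Companyname, emailaddress1)
SELECT 'Globex Ltd', 'sales@globex.example'            -- duplicate across two tables
UNION ALL SELECT 'Initech', 'hello@initech.example'    -- unique

INSERT INTO #contactbase (Fullname, emailaddress1)
SELECT 'Ann Lee', 'ann@acme.example'                   -- unique

-- Same shape as the duplicate-collecting step, pointed at the temp tables.
SELECT A.Address, C = COUNT(*)
FROM
(
SELECT Address = emailaddress1 FROM #Accountbase
UNION ALL SELECT emailaddress1 FROM #Leadbase
UNION ALL SELECT emailaddress1 FROM #contactbase
) A
GROUP BY A.Address
HAVING COUNT(*) > 1
ORDER BY A.Address
-- Expected rows: info@acme.example (C = 2) and sales@globex.example (C = 2)

DROP TABLE #Accountbase
DROP TABLE #Leadbase
DROP TABLE #contactbase
```

Because the three sources are stacked with UNION ALL before grouping, a repeat inside one table and a repeat across two tables are counted the same way.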

Koji Matsumura SSCrazy Points: 2241 · Answer 2

Maybe this is better. It reads from #T, so run it after #T has been filled and before #T is dropped.

SELECT A.Address, B.accountname, B.emailaddress1, C.Companyname, C.emailaddress1, D.Fullname, D.emailaddress1
FROM #T A
LEFT OUTER JOIN Accountbase B ON B.emailaddress1 = A.Address
LEFT OUTER JOIN Leadbase C ON C.emailaddress1 = A.Address
LEFT OUTER JOIN contactbase D ON D.emailaddress1 = A.Address

K. Matsumura
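
One thing to watch with the three outer joins: every matching account row is paired with every matching lead row and every matching contact row for the same address, so an address with 2 account rows and 3 lead rows comes back as 6 rows. If one row per address is enough, a rough sketch with correlated subqueries might look like this. It assumes #T still exists, and the output column names AccountRows, LeadRows and ContactRows are made up for illustration.

```sql
-- Sketch: one row per duplicated address, with a per-table row count.
-- Assumes #T (Address, C) has been filled and not yet dropped.
SELECT
    A.Address,
    AccountRows = (SELECT COUNT(*) FROM Accountbase B WHERE B.emailaddress1 = A.Address),
    LeadRows    = (SELECT COUNT(*) FROM Leadbase C    WHERE C.emailaddress1 = A.Address),
    ContactRows = (SELECT COUNT(*) FROM contactbase D WHERE D.emailaddress1 = A.Address)
FROM #T A
ORDER BY A.Address
```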

Dfalir SSCrazy Points: 2486 · Answer 3

Thank you very much mate, let me check them, and get back with info! 🙂

Dfalir SSCrazy Points: 2486 · Answer 4

I preferred the one before the last one you sent! Thank you very much! 🙂

Jeff Moden SSC Guru Points: 1004704 · Answer 5

Curious... Why do you have 3 email tables?

--Jeff Moden

Michael Valentine Jones SSC Guru Points: 64818 · Answer 6

This should do it, and it shows the breakout by type.

select
 EMAIL_ADDRESS,
 ACCOUNT_COUNT= sum(ACCOUNT_COUNT),
 LEAD_COUNT= sum(LEAD_COUNT),
 CONTACT_COUNT= sum(CONTACT_COUNT),
 TOTAL_COUNT  = sum(ACCOUNT_COUNT+LEAD_COUNT+CONTACT_COUNT)
from
 (
 select
  EMAIL_ADDRESS = EMAILADDRESS1,
  ACCOUNT_COUNT = count(*),
  LEAD_COUNT = 0,
  CONTACT_COUNT = 0 
 from
  ACCOUNTBASE
 group by
  EMAILADDRESS1
 union all
 select
  EMAIL_ADDRESS = EMAILADDRESS1,
  ACCOUNT_COUNT = 0,
  LEAD_COUNT = count(*),
  CONTACT_COUNT = 0
 from
  LEADBASE
 group by
  EMAILADDRESS1
 union all
 select
  EMAIL_ADDRESS = EMAILADDRESS1,
  ACCOUNT_COUNT = 0,
  LEAD_COUNT = 0,
  CONTACT_COUNT = count(*)
 from
  CONTACTBASE
 group by
  EMAILADDRESS1
 ) a
group by
 EMAIL_ADDRESS
having
 sum(ACCOUNT_COUNT+LEAD_COUNT+CONTACT_COUNT) > 1
order by
 EMAIL_ADDRESS
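
If only the cross-entity duplicates are wanted (an address that shows up in at least two of the three tables), one possible variation is to tag each row with its source and count the distinct sources. This is a sketch, not tested against a real CRM database: the table and column names are taken from the question, while the SRC tag and the TABLE_COUNT column are made up for illustration. It also skips NULL and empty addresses, since GROUP BY would otherwise lump all NULLs into one "duplicate" group.

```sql
-- Sketch: addresses that appear in two or more of the three tables.
-- SRC and TABLE_COUNT are hypothetical names introduced for this example.
SELECT
    a.EMAIL_ADDRESS,
    ACCOUNT_COUNT = SUM(CASE WHEN a.SRC = 'A' THEN 1 ELSE 0 END),
    LEAD_COUNT    = SUM(CASE WHEN a.SRC = 'L' THEN 1 ELSE 0 END),
    CONTACT_COUNT = SUM(CASE WHEN a.SRC = 'C' THEN 1 ELSE 0 END),
    TABLE_COUNT   = COUNT(DISTINCT a.SRC)
FROM
    (
    SELECT SRC = 'A', EMAIL_ADDRESS = EMAILADDRESS1 FROM ACCOUNTBASE
    UNION ALL
    SELECT 'L', EMAILADDRESS1 FROM LEADBASE
    UNION ALL
    SELECT 'C', EMAILADDRESS1 FROM CONTACTBASE
    ) a
WHERE
    a.EMAIL_ADDRESS IS NOT NULL
    AND a.EMAIL_ADDRESS <> ''
GROUP BY
    a.EMAIL_ADDRESS
HAVING
    COUNT(DISTINCT a.SRC) > 1   -- present in at least two different tables
ORDER BY
    a.EMAIL_ADDRESS
```

Changing the HAVING clause to COUNT(*) > 1 brings back the duplicates inside a single table as well.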