December 19, 2002 at 2:46 pm
Hi all,
I am generating a report for duplicate companies in a "Company" table. These are our customers and they're entered in by the sales people. Needless to say, the poorly written CRM app doesn't check for potential dups before letting people create new company profiles, and so all the sales people create tons of dups in the table. Now, I wonder if you guys have any suggestions on ways of doing this?
What I did, I create a temp table and inserted all the "Company_name" in there that have duplicates.
INSERT INTO #DupCompany
SELECT count(*), company_name
FROM company GROUP BY company_name
HAVING count(*) > 1
Next I can selet the company info by SELECTing FROM company WHERE company_id IN #DupCompany.
Finally, here's the problem ... I want to count the duplicate companies created per month. The trick here is though, the first instance of a company profile shouldn't be included, as it is not a duplicate yet. Any suggestions?
(Fields you may need: company_id, company_name, created_time.)
Thanks all.
December 19, 2002 at 3:56 pm
A solution might look something like what is given below. Not knowing your data makes it pretty hard to be sure if this is correct but from some test data that I generated you should get the result you are looking for.
select
substring(t1.CompanyName, 1, 20) as CompanyName
, datename(month, CreatedDT) as Month
, (
select count(*)
from Company t2
where
t2.CompanyName = t1.CompanyName
and t2.CreatedDT > min(t1.CreatedDT)
and datename(month, t2.CreatedDT) = datename(month, t1.CreatedDT)
) as NumberofDupes
from Company t1
group by
t1.CompanyName
, datename(month, CreatedDT)
Hope this helps. (Sorry for the appearance but when I paste from QA it always eliminates my tabs.)
David
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
December 19, 2002 at 4:03 pm
I think this fits your bill if not should be close.
SELECT company_name, datepart(mm,created_time) as dupmonth, count(*) as dupcnt
FROM company
LEFT JOIN
(select min(created_time) mct, company_name cn from company) as tblMins
ON
company_name = cn AND
created_time = mct
GROUP BY company_name, datepart(mm,created_time)
WHERE
cn IS NULL
You will not need the
HAVING count(*) > 1
as the query already eliminates the first created instance for all and only the duplicate instances are left.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply