November 19, 2010 at 6:55 am
I want to select duplicate rows based on the column 'companyname' into a table.
This statement works:
SELECT p.companyname, Count(*) - 1
FROM prospects p
GROUP BY p.companyname
HAVING Count(*) > 1
BUT, I need to know the id belonging to the duplicate row. So I added the "id" column:
DECLARE @prosp TABLE (
id int
,companyname nvarchar(50)
,cnt int
)
INSERT INTO @prosp(id,companyname,cnt)
SELECT p.id,p.companyname, Count(*) - 1
FROM prospects p
GROUP BY p.id,p.companyname
HAVING Count(*) > 1
select * from @prosp
But in the latter case, NO records are found (ofcourse) since the id is the PK and thus never duplicate.
How can I get this scenario to work?
November 19, 2010 at 7:26 am
See if this works for you
INSERT INTO @prosp(Id,companyname,cnt)
SELECT
P.Id,
P.companyname,
P.cnt
FROM
prospects P
INNER JOIN
(
SELECT
T1.companyname,
Count(*) - 1
FROM
prospects T1
GROUP BY
T1.companyname
HAVING Count(*) > 1
) AS DQ1
ON
P.companyname = DQ1.companyname
select * from @prosp
Scott
November 19, 2010 at 7:38 am
It does...thx!
November 19, 2010 at 8:22 am
Ok, sorry, but I do have one more question.
I need to add a calculated column to the temp table: whether the selected records occur in another table 'prospectsincampaigns'.
I now have this:
DECLARE @prosp TABLE (
id int
,companyname nvarchar(50)
,isincampaign bit
,cnt int
)
INSERT INTO @prosp(Id,companyname,cnt)
SELECT
P.Id
,P.companyname
,isincampaign
,cnt
FROM prospects P
INNER JOIN
(
SELECT T1.companyname
,(Count(*) - 1) as cnt
,(select COUNT(prospectid) from prospectsincampaigns where prospectid=P.Id)>0 as isincampaign
FROM
prospects T1
GROUP BY
T1.companyname
HAVING Count(*) > 1
) AS DQ1
ON
P.companyname = DQ1.companyname
select * from @prosp
As you see, I have this line:
,(select COUNT(prospectid) from prospectsincampaigns where prospectid=P.Id)>0 as isincampaign
What I want here, is a bit value that indicates if this value occurs in that other table.
Besides the facts that the statement syntax above is incorrect, im also unable to add a calculated column to the temp table.
If adding the bit value is not possible, I would also settle for a value that indicates the amount (int) of times the id occurs in the other table..
Help is really appreciated (again 🙂
November 19, 2010 at 8:29 am
This is a sample of how to do this
--= first set up some test data
DECLARE @prospects TABLE (
id int identity
,companyname nvarchar(50)
)
insert @prospects( companyname)
select 'test1' union all
select 'test1' union all
select 'test1' union all
select 'test2' union all
select 'test2' union all
select 'test3' union all
select 'test4' union all
select 'test5' union all
select 'test6' union all
select 'test7' union all
select 'test8'
--= now use a CTE and ROW_NUMBER() to count the repeating entries
;with cte as
(
select
id
,companyname
,ROW_NUMBER() over(partition by companyname order by id) as counter
from @prospects
)
select
P.id
,P.companyname
,P.counter
,COALESCE(PIC.flag,'Is NOT in prospectsincampaigns') as picflag
from cte as P
outer apply (
select TOP 1 'Is in prospectsincampaigns'
from prospectsincampaigns
where prospectid=P.Id
) PIC(flag)
where counter >1 --== and select only the ones that are duplicates (assuming the first one found for each companyname is not a duplicate)
Edit: added external lookup to prospectsincampaigns
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
November 19, 2010 at 9:20 am
@magoo: I think you were replying to an earlier post of me and not to my latest?
November 19, 2010 at 9:51 am
No, the whole thing, to find duplicates in the prospects table (I used a table variable for this table as a test) and show whether the duplicates are also in prospectsincampaigns ...
If I have misunderstood the requirement, then I apologise.
The method I am showing does not use a temporary table and will only read Prospects once.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
November 19, 2010 at 10:12 am
Ah, I didnt see all the code, my bad.
Anyway, your code seems to do [almost] what I need.
Currently your code shows: #duplicate records - 1, meaning that if I have 5 companies named 'TEST', it will now show only 4 records.
I need that all 5 records are shown, since all of them are duplicate.
how would I achieve that?
November 19, 2010 at 11:50 am
petervdkerk (11/19/2010)
Ah, I didnt see all the code, my bad.Anyway, your code seems to do [almost] what I need.
Currently your code shows: #duplicate records - 1, meaning that if I have 5 companies named 'TEST', it will now show only 4 records.
I need that all 5 records are shown, since all of them are duplicate.
how would I achieve that?
Just change the code
,ROW_NUMBER() over(partition by companyname order by id) as counter
to
,SUM(1) over(partition by companyname) as counter
November 19, 2010 at 12:11 pm
shump66 (11/19/2010)
petervdkerk (11/19/2010)
Ah, I didnt see all the code, my bad.Anyway, your code seems to do [almost] what I need.
Currently your code shows: #duplicate records - 1, meaning that if I have 5 companies named 'TEST', it will now show only 4 records.
I need that all 5 records are shown, since all of them are duplicate.
how would I achieve that?
Just change the code
,ROW_NUMBER() over(partition by companyname order by id) as counter
to
,SUM(1) over(partition by companyname) as counter
Exactly, thanks shump66! I was in the car at the time, so couldn't reply.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
November 24, 2010 at 6:19 am
Ok, thanks both you guys. And what if I want to select this result into a temp table?
I now have (not working):
DECLARE @prosp TABLE (
id int
,companyname nvarchar(50)
,isincampaign bit
,cnt int
,incamp nvarchar(50)
)
with cte as
(
select id,companyname,SUM(1) over(partition by companyname) as counter from prospects
)
INSERT INTO @prosp(Id,companyname,cnt,incamp)
select
P.id
,P.companyname
,P.counter
,COALESCE(PIC.flag,'0') as incamp
from cte as P
outer apply (
select TOP 1 '1'
from prospectsincampaigns
where prospectid=P.Id
) PIC(flag)
Thanks again! 🙂
November 24, 2010 at 7:12 am
It helps if you tell people the error you get...
Which bit is not working - the select or the insert (try it without the insert)
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
November 24, 2010 at 9:28 am
Forgot a semicolon :$ code works fine 🙂 thanks!
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply