June 29, 2013 at 3:54 pm
dear friends,
I have a table with too manly fields. there are duplicate records in in rows BUT ONLY for some fields.
I need to use DISTINCT on 4 columns and then return all rows.
or at least I should get ID value of records in result so I can real all columns with code.
lets say:
col1 col2 col3 col4 col5 col6
a1 AA BB a4 CC a6
b1 b2 b3 b4 CC b6
c1 AA BB c4 CC c6
result must be:
a1 AA BB a4 CC a6
b1 b2 b3 b4 CC b6
here's test code:
CREATE TABLE [dbo].[Table_2](
[ID] [int] IDENTITY(1,1) NOT NULL,
[col1] [nvarchar](50) NULL,
[col2] [nvarchar](50) NULL,
[col3] [nvarchar](50) NULL,
[col4] [nvarchar](50) NULL,
[col5] [nvarchar](50) NULL,
[col6] [nvarchar](50) NULL
) ON [PRIMARY]
insert into table_1 (col1,col2,col3,col4,col5,col6) values ('a1','AA','BB','a4','CC','a6')
insert into table_1 (col1,col2,col3,col4,col5,col6) values ('b1','b2','b3','b4','b5','b6')
insert into table_1 (col1,col2,col3,col4,col5,col6) values ('c1','AA','BB','c4','CC','c6')
June 29, 2013 at 5:06 pm
Your test preparation code yields errors (you create one table then insert to another).
If I understood you correctly, you need to identify duplicates by subset of columns,
and then display all that duplicate rows.
Grouping functions like COUNT can also be used with OVER() clause and have completely different effect.
Run this:
select *,
Duplicates = COUNT(*) OVER(partition by col2, col3, col5)
from Table_2
Result:
IDcol1col2col3col4col5col6Duplicates
1a1AABBa4CCa62
3c1AABBc4CCc62
2b1b2b3b4b5b61
So, the solution is:
SELECT ID, col1,col2,col3,col4,col5,col6
FROM
(select *,
Duplicates = COUNT(*) OVER(partition by col2, col3, col5)
from Table_2
) t
WHERE t.Duplicates > 1
Result:
IDcol1col2col3col4col5col6
1a1AABBa4CCa6
3c1AABBc4CCc6
Inline view is used because OVER() by definition executes just before ORDER BY, so you can't use it in e.g. WHERE because it doesn't exist yet.
HTH
June 29, 2013 at 5:10 pm
If you need just first row of each group:
SELECT ID, col1,col2,col3,col4,col5,col6
FROM
(select *,
Rnr = ROW_NUMBER() OVER(partition by col2, col3, col5 ORDER BY ID)
from Table_2
) t
WHERE t.Rnr = 1
Result:
IDcol1col2col3col4col5col6
1a1AABBa4CCa6
2b1b2b3b4b5b6
June 30, 2013 at 4:06 am
my friend, thats exactly what I need.
you helped me so much. thanks..
June 30, 2013 at 5:03 am
I'm glad to help, you are welcome.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply