December 2, 2009 at 3:33 pm
Good Afternoon,
I am working with a dataset that has duplicate ID's due to unique records. The challenge is any number of column(s) can cause the uniqueness therefore there is no way to issolate a single column. The goal is to drop all dups but one and it doesnt matter which records is left.
Below is a sample dataset.
Thanks,
CREATE TABLE [dbo].[DupsDueToUniqueRows](
[ID] [int] NULL,
[Name] [nvarchar](30) NULL,
[Contact] [nvarchar] (30) NULL,
[Status] [char] (1) NULL
) ON [PRIMARY]
INSERT INTO [DupsDueToUniqueRows] ([ID], [Name], [Contact], [Status])
SELECT 1, 'Jon' ,'Mobile', 'A'
UNION ALL
SELECT 1, 'Jon' ,'Mobile', 'B'
UNION ALL
SELECT 1, 'Jon' ,'Pager', 'A'
UNION ALL
SELECT 2, 'Paul' ,'Mobile', 'A'
UNION ALL
SELECT 2, 'Paul' ,'Mobile', 'B'
UNION ALL
SELECT 3, 'Art' ,'Mobile', 'B'
UNION ALL
SELECT 3, 'Art' ,'Mobile', 'A'
UNION ALL
SELECT 4, 'Mike' ,'Mobile', 'A'
UNION ALL
SELECT 4, 'Mike' ,'Pager', 'A'
UNION ALL
SELECT 4, 'Mike' ,'Mobile', 'B'
UNION ALL
SELECT 4, 'Mike' ,'Cell', 'A'
UNION ALL
SELECT 5, 'Art' ,'Mobile', 'A'
GO
MCTS: BI 2008, MCITP: BI 2008
Stay Thirsty My Friends
December 2, 2009 at 3:43 pm
Here is a bare bones skeleton script. If you modify it, you should be able to accomplish your goals.
WITH dedupeCTE (COl1,Col2, DuplicateCount)
AS
(
SELECT COl1,Col2,
ROW_NUMBER() OVER(PARTITION BY COl1,Col2 ORDER BY Col1) AS DuplicateCount
FROM DuplicateRcordTable
)
DELETE
FROM dedupeCTE
WHERE DuplicateCount > 1
GO
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
December 4, 2009 at 9:51 am
Jason,
Thanks for your assistance. I get an error message as below:
WITH dedupeCTE (ID, COUNT(*) AS DuplicateCount)
AS
(
SELECT ID, DuplicateCount,
ROW_NUMBER() OVER(PARTITION BY ID, DuplicateCount ORDER BY ID) AS DuplicateCount
FROM DupsDueToUniqueRows
)
DELETE
FROM dedupeCTE
WHERE DuplicateCount > 1
GO
=================================
Msg 207, Level 16, State 1, Line 4
Invalid column name 'DuplicateCount'.
Msg 207, Level 16, State 1, Line 5
Invalid column name 'DuplicateCount'.
Msg 8158, Level 16, State 1, Line 5
'dedupeCTE' has more columns than were specified in the column list
MCTS: BI 2008, MCITP: BI 2008
Stay Thirsty My Friends
December 4, 2009 at 10:25 am
cos_ta393
Other than creating and using a temp table (solely to make it easy for me to test). Jason's code ran fine - now I did use the column names as you specified in your post. Notice neither your post or Jasons showed that the
T-SQL WITH is preceded by a ; be careful it is a requirement.
;WITH dedupeCTE (Id,Name, DuplicateCount)
AS (SELECT Id,Name,
ROW_NUMBER() OVER(PARTITION BY Id,Name ORDER BY Id) AS DuplicateCount
FROM #DupsDueToUniqueRows)
SELECT * --test before deleting
FROM dedupeCTE -- use after testing WHERE DuplicateCount > 1
Results from above are:
IdNameDuplicateCount
1Jon1
1Jon2
1Jon3
2Paul1
2Paul2
3Art1
3Art2
4Mike1
4Mike2
4Mike3
4Mike4
5Art1
December 4, 2009 at 10:53 am
bitbucket-25253 (12/4/2009)
cos_ta393... Notice neither your post or Jasons showed that the
T-SQL WITH is preceded by a ; be careful it is a requirement.
Not quite correct, actually. It isn't required that the CTE start with a ; as it is that the previous statement must be treminated by a ;. Since the use of ; is not manditory to end all statements yet, many SQL programmers have simply gotten into the habit of prefacing CTE's with a ; to eliminate one source of errors when using them.
December 4, 2009 at 11:47 am
Lynn Pettis
Hate to be picky but if the ; is at the end of the last T-SQL code line (not a comment line) before the WITH, then doesn't that semicolon precede (Come before) the WITH statement?
Ah the naunces (fine distinctions) of the american language
December 4, 2009 at 12:05 pm
bitbucket-25253 (12/4/2009)
Lynn PettisHate to be picky but if the ; is at the end of the last T-SQL code line (not a comment line) before the WITH, then doesn't that semicolon precede (Come before) the WITH statement?
Ah the naunces (fine distinctions) of the american language
If there is a statement preceding the CTE, then yes that is accurate.
If there is no statement preceding the CTE, then it may be executed without the ";"
Thanks for bringing up the semicolon.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
December 4, 2009 at 12:07 pm
cos_ta393 (12/4/2009)
Jason,Thanks for your assistance. I get an error message as below:
WITH dedupeCTE (ID, COUNT(*) AS DuplicateCount)
AS
(
...
)
...
Change your cte declaration just a little
WITH dedupeCTE (ID, DuplicateCount)
AS
(
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
December 4, 2009 at 12:09 pm
For semantics, code in previous post would be:
;
WITH dedupeCTE (ID, DuplicateCount)
AS
(
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
December 4, 2009 at 12:14 pm
Semantics can be a pain, but just going by what is in BOL:
When a CTE is used in a statement that is part of a batch, the statement before it must be followed by a semicolon.
😉
December 4, 2009 at 12:26 pm
Lynn Pettis (12/4/2009)
Semantics can be a pain, but just going by what is in BOL:When a CTE is used in a statement that is part of a batch, the statement before it must be followed by a semicolon.
😉
True, and I always omit the preceding semi-colon if the CTE is a stand-alone. It might be a good practice for forum posts to just automatically include it though.
Then again, it would not provide learning points...hmmm.:cool:
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
December 4, 2009 at 2:31 pm
Works perfectly. Jason, bitbucket, Lynn thanks for all of your inputs - really appreciated.
costa
MCTS: BI 2008, MCITP: BI 2008
Stay Thirsty My Friends
December 4, 2009 at 3:16 pm
Cos_ta393
Your welcome.
It makes me feel good that in some small way I may have assisted someone solve a problem for in so doing I find myself learning something new, and I am sure that Jason and Lynn Pettis feel the same way.
December 4, 2009 at 4:24 pm
You're welcome.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
December 4, 2009 at 4:40 pm
Yes, you are welcome. It is always good to hear when we have been able to help.
😀
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply