November 7, 2016 at 10:17 pm
Comments posted to this topic are about the item Using OVER() to Remove Duplicate Rows
November 7, 2016 at 11:08 pm
Hello Dear Mike,
I think we can also remove duplicates using simplest, easiest and better way without using temp table/s logic:
WITH CTE_InsurancePolicy AS (
SELECT PolicyId, FirstName, Surname, PolicyStartDate, PolicyEndDate
,ROW_NUMBER() OVER (PARTITION BY PolicyId, FirstName, Surname, PolicyStartDate, PolicyEndDate ORDER BY PolicyId) AS RowNumber
FROM dbo.InsurancePolicy
)
DELETE CTE_InsurancePolicy
WHERE RowNumber > 1;
Thanks
Irfan Baig.
November 8, 2016 at 1:07 am
Hi Irfan
Yes, your method is completely valid. The article is just trying to demonstrate how to use the OVER() clause, as there's a thought not enough people are aware of it and how useful it can be.
I agree that there are several ways this problem can be handled, your code shows that. Thanks for taking the time to comment.
Regards,
Mike.
November 8, 2016 at 3:52 am
This is a very useful method to learn
I also use this to check for PK violations when transforming staging records into live data
Could you create a cte for this and delete from the cte?
That way you don't require a temp table
~Sorry, see that point has already been made!
- Damian
November 8, 2016 at 6:58 am
I have been spooked by OVER() and PARTITION for too long now. You made it EZ for me - especially by using a VERY real world example. I have encountered your example so many darn times in the past. I really could have used this solution to make my life easier. Thanks again!
November 8, 2016 at 7:03 am
Hi Mike,
Glad it was useful!
Regards,
Mike.
November 8, 2016 at 9:16 am
Sorry. i don't see how you can delete from the cte and still have the data that you don't want to delete in the true table.
November 8, 2016 at 9:29 am
Hi Shaulbel
Deleting from the CTE will delete the data from the underlying table. This link should explain more:
http://stackoverflow.com/questions/6010233/delete-rows-from-cte-in-sql-server
I prefer using temp tables so you can double-check what you're deleting before you actually delete it, but there are plenty of ways of achieving the same goal.
Thanks for reading,
Mike.
November 8, 2016 at 10:02 am
Thanks.
Learned new thing.
Wasn't aware of delete CTE.
November 8, 2016 at 12:40 pm
I've used OVER() a few times when I copied something from a forum to solve a problem, and hacked it until it gave me the results I needed. I used it, but I never really was clear on just how it worked. Thanks for a good example.
BTW, for this particular problem, another solution would be to GROUP BY all fields into the temp table, giving exactly one copy of each row, clear the main and copy the temp back.
November 8, 2016 at 3:04 pm
Irfan.Baig (11/7/2016)
I think we can also remove duplicates using simplest, easiest and better way without using temp table/s logic:
I think the simplest in this case would be to truncate the table and re-run the import process.
😉
November 8, 2016 at 6:16 pm
I've had to use similar methods to do this exact same thing before. Nice real world example Mike. There are multiple ways to do this without using the OVER() clause but this is a nice simple example of how windowing functions work. Thanks for the write-up.
November 9, 2016 at 2:33 am
Thanks Ken, glad you liked it!
Cheers,
Mike.
November 9, 2016 at 4:53 pm
what about select distinct into ?
steps:
1) Select distinct into a temporal table
2) truncate original table
3) Insert into original table select from temporal table
It seems better
November 9, 2016 at 11:53 pm
There may be any number of 'better' ways to accomplish this particular task, but the post was intended as a tutorial about how OVER() works, and an example of where it can be applied. I think it served admirably for that. I certainly learned something.
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply