December 8, 2015 at 4:18 pm
Hi
Can someone help me get into the thinking of knowing how to fix data in SQL tables (by trying NOT to give me an SQL routines I could run).
Ok, this is the situation…. Suppose I have a single table with has a column called ColumnA which has lots of duplicate values. I need to remove all the duplicate entries from the table in question. Question is….if I had to write pseudo-code as a plan, what SQL should be written
Many thanks to anyone who can offer me any pointers.
Kind Regards
James
December 8, 2015 at 4:24 pm
Not enough information given to answer the question. If this is a child table to another table, then there may be a good reason for duplicate values. Do you have at least a sample table structure and an explanation of what the table is for?
December 8, 2015 at 4:38 pm
The table could contain this for example :-
ColumnA Value
1/1/2015 5
1/1/2015 5
2/1/2015 7
3/1/2015 8
3/1/2015 8
4/1/2015 9
A problem I have at work at the moment is that lots of data has duplicated which has the same date and value.
I have (now hopefully) found a better solution in my job - to run an already written exec command to remove this data which I've scheduled to run tonight - but there is a chance tomorrow that it might not have removed the duplicate data.
December 8, 2015 at 4:59 pm
You're using SQL 2005, right? (Just to be sure I'm not creating a solution that won't work on 2005!)
Here's a query that would identify just the first id....
WITH MyDupesRN AS
(
SELECT *
, ROW_NUMBER() OVER (PARTITION BY SomeDate, NumValue ORDER BY SomeDate, NumValue) AS rn
FROM #Test
)
SELECT SomeDate, NumValue, rn
FROM MyDupesRN
WHERE rn=1;
If you're going to delete, it would be
WITH MyDupesRN AS
(
SELECT *
, ROW_NUMBER() OVER (PARTITION BY SomeDate, NumValue ORDER BY SomeDate, NumValue) AS rn
FROM #Test
)
DELETE
FROM MyDupesRN
WHERE rn>1;
I would run the first query and eyeball some to make sure that you're getting only the records you want... otherwise you could be deleting a lot of records you didn't intend to.
December 9, 2015 at 7:23 am
Hi Piet
Thanks for guiding me. I've didn't even know about CTE before I started this topic. I'm trying to learn it but its just out of my grasp at the moment.
Could you do me a massive favour and convert your code to insert the output into a table. I've realised that I need to run this job out of hours :-
WITH MyDupesRN AS
(
SELECT *
, ROW_NUMBER() OVER (PARTITION BY SomeDate, NumValue ORDER BY SomeDate, NumValue) AS rn
FROM #Test
)
SELECT SomeDate, NumValue, rn
FROM MyDupesRN
WHERE rn=1;
Kind Regards
James
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply