September 3, 2003 at 6:35 am
I have a table containing about 2,500,000 rows. Out of these I have found that 750 rows have been duplicated (that's 1,500 rows in total). Is there a way to delete only the duplicated 750 rows?
Thanks in advance
September 3, 2003 at 6:40 am
Begin Transaction
Create table #temp (include the columns which are in the source table)
Insert into #temp select distinct * from SourceTable
Delete from SourceTable
Insert into SourceTable select * from #temp
If @@error = 0 -- note: @@error reflects only the last statement executed
    Commit Transaction
Else
    Rollback Transaction
Drop table #temp
erajendar
September 3, 2003 at 6:40 am
Here's a script by Chris Cubley which I have used a couple of times.
http://www.sqlservercentral.com/scripts/contributions/395.asp
Cheers,
Crispin
Something as incredibly simple as binary still gives you too many options.
September 3, 2003 at 8:00 am
Chris Cubley also wrote an excellent article on this topic that includes the scripts you will need to identify and remove the duplicate records. I tested the scripts on my own system and everything works A-OK!
Here's the article:
http://www.sqlservercentral.com/columnists/ccubley/findinganddeletingduplicatedata.asp
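The core of identifying duplicates (as in the article) is grouping on the columns that define a "duplicate" row. A minimal sketch; SourceTable, col1, and col2 are placeholder names, not from the article:

```sql
-- List each set of values that appears more than once,
-- along with how many copies exist
SELECT col1, col2, COUNT(*) AS dup_count
FROM SourceTable
GROUP BY col1, col2
HAVING COUNT(*) > 1
```

With 750 duplicated rows appearing twice each, this would return 750 rows, each with a dup_count of 2.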
September 3, 2003 at 9:28 am
You can also add an identity column to the table to help with deleting duplicates. Here is an example of some code to do that:
alter table dup_table
    add seq_num int identity
go
-- delete every copy except the one with the highest seq_num
delete from a
from dup_table a join
    (select col1, col2, max(seq_num) max_seq_num
     from dup_table
     group by col1, col2
     having count(*) > 1) b
on a.col1 = b.col1 and
   a.col2 = b.col2 and
   a.seq_num < b.max_seq_num
go
alter table dup_table
    drop column seq_num
Gregory Larsen, DBA
If you're looking for SQL Server examples, check out my website at http://www.geocities.com/sqlserverexamples
Gregory A. Larsen, MVP
September 3, 2003 at 11:04 am
Many Thanks to all you Guys for your help
Special Thanks Greg, your Solution did the trick.
Thanks once again.
September 4, 2003 at 9:06 am
If you can't touch the table structure, a loop using SET ROWCOUNT will limit the number of records deleted. Perhaps a cursor using a select something like this:
SELECT Col1, Col2, Col3, Count(*) - 1
FROM TheTableWithDups
GROUP BY Col1, Col2, Col3
HAVING COUNT(*) > 1
Fetch into @Col1, @Col2, @Col3, @NumberToDelete
In the cursor loop...
BEGIN TRAN -- make sure the count is and stays correct or bad things could happen!
SELECT @NumberToDelete = COUNT(*) - 1
FROM TheTableWithDups
WHERE Col1 = @Col1 AND
Col2 = @Col2 AND
Col3 = @Col3
IF @NumberToDelete > 0
BEGIN
SET ROWCOUNT @NumberToDelete
DELETE FROM TheTableWithDups
WHERE Col1 = @Col1 AND
Col2 = @Col2 AND
Col3 = @Col3
SET ROWCOUNT 0 -- reset, or later statements stay limited
END
COMMIT
I have not tried this, but I think it would work. If there is no chance that the table will be changed while this runs (perhaps guaranteed via a table lock), then the extra count check and transaction would not be required.
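Putting the pieces above together, the whole loop might look something like this (equally untested; TheTableWithDups and Col1..Col3 come from the fragments above, the cursor name and int column types are assumptions):

```sql
DECLARE @Col1 int, @Col2 int, @Col3 int, @NumberToDelete int

-- One row per duplicate set, with the number of extra copies
DECLARE dup_cursor CURSOR FOR
    SELECT Col1, Col2, Col3, COUNT(*) - 1
    FROM TheTableWithDups
    GROUP BY Col1, Col2, Col3
    HAVING COUNT(*) > 1

OPEN dup_cursor
FETCH NEXT FROM dup_cursor INTO @Col1, @Col2, @Col3, @NumberToDelete

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRAN
    -- Re-check the count inside the transaction in case rows changed
    SELECT @NumberToDelete = COUNT(*) - 1
    FROM TheTableWithDups
    WHERE Col1 = @Col1 AND Col2 = @Col2 AND Col3 = @Col3

    IF @NumberToDelete > 0
    BEGIN
        SET ROWCOUNT @NumberToDelete  -- delete only the extra copies
        DELETE FROM TheTableWithDups
        WHERE Col1 = @Col1 AND Col2 = @Col2 AND Col3 = @Col3
        SET ROWCOUNT 0                -- reset for subsequent statements
    END
    COMMIT

    FETCH NEXT FROM dup_cursor INTO @Col1, @Col2, @Col3, @NumberToDelete
END

CLOSE dup_cursor
DEALLOCATE dup_cursor
```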
Randy
Helpdesk: Perhaps I'm not the only one that does not know what you are doing. 😉