December 10, 2010 at 1:07 pm
I have a database with hundreds of tables. The original deisgn did not have primary keys. As a result the database now has duplicate rows.
Example"
Date_Stamp KW1 KW2
12/10/2010 5.3 3.1
12/10/2010 5.3 3.1
12/9/2010 4 2
12/8/2010 3 1
12/7/2010 7.4 5
12/7/2010 7.4 5
12/7/2010 7.4 5
...
I need to delete duplcate rows (and leave one behind). I want the end result to look like the output of using "select destinct * tableName)
I know how to do this, but I need to be able to automatically scroll throught tables, find the duplicate and delete the extra rows. I thought about using a cursor, but not sure about the T-sql?
The code should do this:
scroll through tables
and for each table check to see if it has duplicate rows
if true delete the extra rows, leaving one behind
How can I do this?
December 10, 2010 at 1:46 pm
It'll take a cursor to do multiple tables efficiently. You could use sp_MSForEachTable, but that's just a hidden cursor.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 10, 2010 at 1:51 pm
I think "sp_MSForEachTable" is a good option. I will look into it.
December 10, 2010 at 2:17 pm
with cte
as (select row_number() over(partition by Date_Stamp order by Date_Stamp) as rn
from mytable)
delete from cte
where rn>1
This works for a single table
No I need to think about a way to scroll through tables
December 10, 2010 at 2:20 pm
Will your table ever have two records on the same day that you want to keep?
Try turning that into a dynamic command as a string, and plug in the table name to the CTE's from clause, then use a cursor to step through it. It's quick and dirty and will work.
Make a backup of the database before you do it, just in case.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 11, 2010 at 9:16 am
Without using sp_ForEachTable you might want to try this code:
DECLARE @Table NVARCHAR(200),@Column NVARCHAR(200),@Sql NVARCHAR(2000)
DECLARE Table_Cursor CURSOR FOR
SELECT so.Name AS 'Table',sc.Name AS 'Column'
FROM sys.sysobjects so
LEFT OUTER JOIN sys.syscolumns sc
ON sc.id = so.id
WHERE so.type = 'u'AND sc.name = 'Date_Stamp'
--or use AND sc.xtype = 40 for all columns with a data type of Date
OPEN Table_Cursor
FETCH NEXT FROM Table_Cursor
INTO @Table,@Column
WHILE @@FETCH_STATUs = 0
BEGIN
--PRINT @Table +' ' + @Column -- for testing only
SET @sql = ';with cte
as (select row_number() over(partition by '+ @Column +' order by ' + @Column +') AS rn
from Dcustomer)
SELECT * FROM cte' --For testing replace with a DELETE statement
-- After you have verified the results
--print @sql
--print '-------------------------'
EXECUTE SP_EXECUTESQL @sql
FETCH NEXT FROM Table_Cursor
INTO @Table,@Column
--PRINT @Table +' ' + @Column for testing only
END
CLOSE Table_Cursor
DEALLOCATE Table_Cursor
As Gsquared has said
Make a backup of the database before you do it, just in case
December 12, 2010 at 6:37 pm
Thanks for the post. Your query worked.
DECLARE @Table NVARCHAR(200),@Column NVARCHAR(200),@Sql NVARCHAR(2000)
DECLARE Table_Cursor CURSOR FOR
SELECT so.Name AS 'Table',sc.Name AS 'Column'
FROM sys.sysobjects so
LEFT OUTER JOIN sys.syscolumns sc
ON sc.id = so.id
WHERE so.type = 'u'AND sc.name = 'Date_Stamp'
--or use AND sc.xtype = 40 for all columns with a data type of Date
OPEN Table_Cursor
FETCH NEXT FROM Table_Cursor
INTO @Table,@Column
WHILE @@FETCH_STATUs = 0
BEGIN
--PRINT @Table +' ' + @Column -- for testing only
SET @sql = ';with cte
as (select row_number() over(partition by '+ @Column +' order by ' + @Column +') AS rn
from ' + @Table + ')
delete FROM cte where rn>1 ' --For testing replace with a DELETE statement
-- After you have verified the results
print @sql
print '-------------------------'
EXECUTE SP_EXECUTESQL @sql
FETCH NEXT FROM Table_Cursor
INTO @Table,@Column
--PRINT @Table +' ' + @Column for testing only
END
CLOSE Table_Cursor
DEALLOCATE Table_Cursor
I tired it on a test database.
Make sure that you have the condition "where rn > 1".
Cleaning up the DB was going to take me a loooong time!. Now with this query I just have to run it and go do something else. Of course, I have to backup the DB (20 GB) before we attempt this.
This is my first post on this Forum!. Thanks a lot.
December 12, 2010 at 8:17 pm
techzone12
Thanks for the feed back.
Now for everyone who wants to help you, the next time you post a question to a forum please provide the table(s) structure, sample data and what you may have attempted following the instructions contained in the first link in my signature block. In that way you are more than likely to receive tested answers .. in other words, help us to help you
Again thanks for the feed back.
December 13, 2010 at 12:53 am
My solution would be creating a stored procedure with table name as input parameter.
And within the SP, I would create a dynamic sql which will delete dublicate rows. You can refer to How to delete duplicate records or rows among identical rows in a table where no primary key exists for sample codes.
And I think it is easy to use sp_MSForEachTable to call the above SP for each table in database.
December 21, 2010 at 2:22 pm
I want to implement the below scenario.
If ther is any duplicate data then my SQL should run and find the duplicate rows and then delete them automatically.
Do you have a dynamic SQl which identifies and delete the duplicate records automatically. In this case the table contains primary key. Thanks in advance!
December 21, 2010 at 2:31 pm
ns_18 (12/21/2010)
I want to implement the below scenario.If ther is any duplicate data then my SQL should run and find the duplicate rows and then delete them automatically.
Do you have a dynamic SQl which identifies and delete the duplicate records automatically. In this case the table contains primary key. Thanks in advance!
Scroll back up about four postings .... read and understand what that T-SQL will do ... and then test it... if you have any additional questions... post them in a new forum.
December 21, 2010 at 2:35 pm
ns_18 (12/21/2010)
I want to implement the below scenario.If ther is any duplicate data then my SQL should run and find the duplicate rows and then delete them automatically.
Do you have a dynamic SQl which identifies and delete the duplicate records automatically. In this case the table contains primary key. Thanks in advance!
IF the table has Primary Key then how cum duplicate records will be there in the Table;-) 😉
Thanks
Parthi
Thanks
Parthi
December 21, 2010 at 2:47 pm
I want to implement the below scenario.
If ther is any duplicate data then my SQL should run and find the duplicate rows and then delete them automatically.
Do you have a dynamic SQl which identifies and delete the duplicate records automatically. In this case the table contains primary key. Thanks in advance!
December 21, 2010 at 2:48 pm
I want to implement the below scenario.
If ther is any duplicate data then my SQL should run and find the duplicate rows and then delete them automatically.
Do you have a dynamic SQl which identifies and delete the duplicate records automatically. In this case the table contains primary key. Thanks in advance!
December 21, 2010 at 3:06 pm
ns_18 (12/21/2010)
I want to implement the below scenario.If ther is any duplicate data then my SQL should run and find the duplicate rows and then delete them automatically.
Do you have a dynamic SQl which identifies and delete the duplicate records automatically. In this case the table contains primary key. Thanks in advance!
have a look at the same post Eralper http://www.sqlservercentral.com/Forums/Topic1033202-392-1.aspx has given the solution try the way he said looks good with no primary key
In this case the table contains primary key. again i am asking
IF the table has Primary Key then how cum duplicate records will be there in the Table????????
If the table contains PY Key then it will not have Duplicate records
Thanks
Parthi
Thanks
Parthi
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply