June 11, 2013 at 8:16 am
I've taken a look at the scripts and forum posts on the site regarding finding duplicate rows, but I haven't been able to adapt them to my situation. I have data being pulled from Oracle and placed in a table in my sql server db. The table consists of a primary key identity field, a field containing the oracle ID, and 4 fields containing data. I need to find all duplicates, with a duplicate being defined as a match across all 4 data fields.
I've tried queries along the following lines, but they don't work.
SELECTMAX(id), MAX(oracle_id), field1, field2, field3, field4
FROMoracleRecords
GROUP BYfield1, field2, field3, field4
HAVING COUNT(*) > 1
If I understand how this is supposed to work, only duplicate rows should be returned. Is this the case? I've gotten it to work when comparing only one field, but when I try to compare more than one I break it. Any suggestions?
June 11, 2013 at 8:19 am
You might want to try this sort of query instead
SELECTid, oracle_id, field1, field2, field3, field4,
COUNT(*) OVER(PARTITION BY field1, field2, field3, field4) AS NumberOfDuplicates
FROMoracleRecords
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537June 12, 2013 at 11:06 am
That worked perfectly. Thanks!
June 12, 2013 at 11:49 am
One further question: Is it possible to identify the newest record of each group and delete the others automatically? One of the fields is a datestamp, but I don't understand how to isolate the newest records.
For example.
Field1 Field2 Field3 Field4
AAA AAA AAA 18 Dec 2012 <- delete
AAA AAA AAA 19 Dec 2012
BBB BBB BBB 1 Jan 2013 <- delete
BBB BBB BBB 3 Jan 2013 <- delete
BBB BBB BBB 6 Jan 2013
Is this possible through tsql?
June 13, 2013 at 1:41 am
WITH CTE AS (
SELECTid, oracle_id, field1, field2, field3, field4,
ROW_NUMBER() OVER(PARTITION BY field1, field2, field3 ORDER BY field4 DESC) AS rn
FROMoracleRecords)
DELETE FROM CTE
WHERE rn > 1;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply