May 27, 2020 at 1:24 am
Hi everybody, I need help to find and delete a duplicate line in a table. The row I am trying to delete is actually a duplicate because there are some columns that makes them different.
Please note in the table below. Rows with line_No of 1, 2, 3, and 5. They have the same code, dept_cd, ID, and version.
I need to delete the rows where status = 1 I have many rows with this scenario. The rows that had a status of 1 should have been deleted when the new row with status = 3 was aggregated.
CD DEPT ID LN_NO VERSION STATUS
PO HH 16362135 1 1 3
PO HH 16362135 1 1 1
PO HH 16362135 2 1 3
PO HH 16362135 2 1 1
PO HH 16362135 3 1 1
PO HH 16362135 3 1 3
PO HH 16362135 5 1 1
PO HH 16362135 5 1 3
PO HH 16362135 6 1 3
here is the sql to create the table and insert test rows.
CREATE TABLE [dbo].[TESTPO](
[CD] [varchar](8) NOT NULL,
[DEPT] [varchar](4) NOT NULL,
[ID] [varchar](20) NOT NULL,
[LN_NO] [numeric](10, 0) NOT NULL,
[VERSION] [numeric](10, 0) NOT NULL,
[STATUS] [numeric](10, 0) NOT NULL)
insert into TESTPO
([CD], [DEPT], [ID], [LN_NO], [VERSION], [STATUS])
SELECT 'PO', 'HH', '16362135', 1, 1, 3 UNION ALL
SELECT 'PO', 'HH', '16362135', 1, 1, 1 UNION ALL
SELECT 'PO', 'HH', '16362135', 2, 1, 3 UNION ALL
SELECT 'PO', 'HH', '16362135', 2, 1, 1 UNION ALL
SELECT 'PO', 'HH', '16362135', 3, 1, 1 UNION ALL
SELECT 'PO', 'HH', '16362135', 3, 1, 3 UNION ALL
SELECT 'PO', 'HH', '16362135', 5, 1, 1 UNION ALL
SELECT 'PO', 'HH', '16362135', 5, 1, 3 UNION ALL
SELECT 'PO', 'HH', '16362135', 6, 1, 3
I have tried queries to find duplicate rows but have not worked
Please help.
Thanks
May 27, 2020 at 2:09 am
I need to delete the rows where status = 1
DELETE
FROM TestPO
WHERE Status=1;
?
From your description, it's not at all clear what records you really want deleted. Which record(s) are duplicates? How do we tell?
May 27, 2020 at 2:12 am
I need to delete the rows where status = 1
DELETE
FROM TestPO
WHERE Status=1;
?
From your description, it's not at all clear what records you really want deleted. Which record(s) are duplicates? How do we tell?
Your code works perfectly for the data given but... if there's ever a row where there's only a status of 1 that needs to be kept, your code would delete it.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 27, 2020 at 2:20 am
I think this will be a bit more safe... details are in the comments in the code
WITH cteEnumerate AS
(--==== Number each set of rows in descending order according to status.
SELECT Dupe# = ROW_NUMBER()OVER (PARTITION BY CD,DEPT,ID,LN_NO,VERSION ORDER BY [STATUS] DESC)
,[STATUS]
FROM TESTPO
)--==== Only delete rows from each group that have more than one row and have a status of "1"
DELETE FROM cteEnumerate
WHERE Dupe# > 1
AND Status = 1
;
--Jeff Moden
Change is inevitable... Change for the better is not.
May 28, 2020 at 12:19 am
Jeff, thank you.
I ran the query you provided and it does delete the intended rows.
I will run it against a test table with a lot more data to select the duplicate rows, validate them and then delete.
Thank you!!!!
May 28, 2020 at 5:53 pm
Sounds like a plan... especially the part about validation before deletion. Thanks for the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 28, 2020 at 10:23 pm
DELETE tp
FROM TESTPO tp
WHERE STATUS = 1
AND EXISTS(SELECT *
FROM TESTPO tp2
WHERE tp2.CD = tp.CD
AND tp2.DEPT = tp.DEPT
AND tp2.ID = tp.ID
AND tp2.LN_NO = tp.LN_NO
AND tp2.STATUS = 3)
If you want to just keep the row with the highest status then
DELETE tp
FROM TESTPO tp
WHERE EXISTS(SELECT *
FROM TESTPO tp2
WHERE tp2.CD = tp.CD
AND tp2.DEPT = tp.DEPT
AND tp2.ID = tp.ID
AND tp2.LN_NO = tp.LN_NO
AND tp2.STATUS > tp.STATUS)
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply