May 14, 2020 at 7:44 pm
I am trying to delete duplicate rows, but I get an error message saying that 'RowNumber' is an invalid column even though I have that column (as an alias). I tried running the SELECT and DELETE statement together and it still didn't work.
I know it doesn't exist in the base table because it is a calculated field, but I thought SQL should recognize it if I run the SELECT and DELETE statements together.
SELECT ord_no, purch_amt, ord_date, customer_id, salesman_id, row_number( ) OVER (PARTITION BY salesman_id ORDER BY salesman_id) as RowNumber --partitioning happens first, then function is applied, then ORDER BY is applied
FROM orders;
DELETE
FROM orders
WHERE RowNumber > 1;
May 14, 2020 at 8:08 pm
You have 2 separate statements here - the second statement doesn't know anything about the first statement.
There are 2 ways to do this...use a CTE or a derived table:
--==== Common Table Expression
With duplicateOrders
As (
Select ord_no
, purch_amt
, ord_date
, customer_id
, salesman_id
, row_number() over(Partition By salesman_id Order By salesman_id) As RowNumber --partitioning happens first, then function is applied, then ORDER BY is applied
From orders
)
Delete
From duplicateOrders
Where RowNumber > 1;
--==== Derived Table
Delete do
From (
Select ord_no
, purch_amt
, ord_date
, customer_id
, salesman_id
, row_number() over(Partition By salesman_id Order By salesman_id) As RowNumber --partitioning happens first, then function is applied, then ORDER BY is applied
From orders
) As do
Where RowNumber > 1;
To be safe - wrap the code in an explicit transaction and rollback to verify the results.
Begin Transaction;
--==== Common Table Expression
With duplicateOrders
As (
Select ord_no
, purch_amt
, ord_date
, customer_id
, salesman_id
, row_number() over(Partition By salesman_id Order By salesman_id) As RowNumber --partitioning happens first, then function is applied, then ORDER BY is applied
From orders
)
Delete
From duplicateOrders
Where RowNumber > 1;
--==== Verify Results
Select ord_no
, purch_amt
, ord_date
, customer_id
, salesman_id
, row_number() over(Partition By salesman_id Order By salesman_id) As RowNumber --partitioning happens first, then function is applied, then ORDER BY is applied
From orders;
--==== Rollback/Commit
Rollback Transaction;
--Commit Transaction; --uncomment this line, comment out the rollback to commit the changes after validation
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
May 14, 2020 at 8:10 pm
The select statement and the delete statement are two different, separate statements. The delete statement won't have any knowledge of any aliases in the select statement. I am guessing you meant to do with with a CTE as the queries you have are often what you would see with deleting duplicates with a CTE. Try putting your first query in a CTE and then delete from the CTE where RowNumber > 1. You have the right idea, you just need to utilize the CTE for this.
Sue
May 15, 2020 at 1:37 am
This makes sense. When using the DELETE statement usually there is nothing following the delete, however you have the alias of the derived table, after DELETE. If I omit do after DELETE, it doesn't work, so it's necessary.
Why would this be necessary, when normally you wouldn't have anything after DELETE such as when you are deleting from a hard table?
May 15, 2020 at 7:03 am
You need to alias after the delete to know where to delete from.
If you had a DELETE with multiple joins in the logic to only delete the right rows, you need to tell the delete which table it needs to remove the records from.
Its the same with the derived table, as the name of the derived table in Jeffery's code is aliased "do" it needs to know to delete from that particular derived query.
eg the below works
DELETE a
FROM
TableA a
INNER JOIN TableB b
ON b.BId = a.BId
where as this wont as the engine doesn't know if to delete from TableA or TableB
DELETE
FROM
TableA a
INNER JOIN TableB b
ON b.BId = a.BId
December 19, 2020 at 7:45 pm
The select statement and the delete statement are two different, separate statements. The delete statement won't have any knowledge of any aliases in the select statement. I am guessing you meant to do with with a CTE as the queries you have are often what you would see with deleting duplicates with a CTE. Try putting your first query in a CTE and then delete from the CTE where RowNumber > 1. You have the right idea, you just need to utilize the CTE for this.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply