There is a common problem in which, you have header data which contains
columns used to evaluate a delete statement. For example, suppose you have
an order table and an order lines table. You may have an operation that
needs to purge rows from the orders based upon a date in the order header
table. Deleting the header rows is easy, but since the detail rows do not
have the date column deleting them is not as straight forward. This could
be accomplished via a cascading delete trigger on the order header.
However, many times legacy systems are not architected to accommodate
cascading deletes.
Therefore, a delete statement will need to be written over the children
tables as well as the parent tables. Ideally, these tables would contain
appropriate foreign key constraints. Additionally, the deletes would be
wrapped in a transaction to prevent orphans. This article is intended to
illustrate the delete operations and focus only on those operations.
Let's assume we have the following orders table:
CREATE TABLE Orders ( OrdNo char (10) NOT NULL , OrdDate datetime NULL , OrdCustName varchar (50) NULL , CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED ( OrdNo ) )
The order details table looks like this:
CREATE TABLE OrderLines ( OrdNo char (10) NOT NULL , OrdSeq int NOT NULL , OrdItemNo char (10) NOT NULL , OrdQtyint NOT NULL CONSTRAINT PK_OrderLines PRIMARY KEY CLUSTERED ( OrdNo, OrdSeq ) CONSTRAINT FK_OrderLines_Orders FOREIGN KEY ( OrdNo ) REFERENCES dbo.Orders ( OrdNo ))
Here's some sample data:
-- Insert Header Data INSERT INTO Orders (OrdNo, OrdDate, OrdCustName) VALUES (N'1', '2003-02-01', N'Billy Smith') INSERT INTO Orders (OrdNo, OrdDate, OrdCustName) VALUES (N'2', '2003-02-02', N'Mike Bossy') INSERT INTO Orders (OrdNo, OrdDate, OrdCustName) VALUES (N'3', '2003-02-09', N'Denis Potvin') INSERT INTO Orders (OrdNo, OrdDate, OrdCustName) VALUES (N'4', '2003-03-01', N'Bobby Nystrom') -- Insert Detail Data INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'1', 1, N'1-1', 10) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'1', 2, N'1-2', 12) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'1', 3, N'1-3', 8) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'2', 1, N'2-1', 9) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'2', 2, N'2-2', 2) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'2', 3, N'2-3', 5) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'2', 4, N'2-4', 20) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'2', 5, N'2-5', 36) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'3', 1, N'3-1', 48) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'4', 1, N'4-1', 12) INSERT INTO OrderLines (OrdNo, OrdSeq, OrdItemNo, OrdQty) VALUES (N'4', 2, N'4-2', 15) If you look at the data you will see that the 4 orders have a total of 11 detail lines:
SELECT * FROM Orders O LEFT OUTER JOIN OrderLines OL ON O.OrdNo = OL.OrdNo OrdNo OrdDate OrdCustName OrdNo OrdSeq OrdItemNo Qty ------ ------------------------ --------------- ------ ------- ---------- --- 1 2003-02-01 00:00:00.000 Billy Smith 1 1 1-1 10 1 2003-02-01 00:00:00.000 Billy Smith 1 2 1-2 12 1 2003-02-01 00:00:00.000 Billy Smith 1 3 1-3 8 2 2003-02-02 00:00:00.000 Mike Bossy 2 1 2-1 9 2 2003-02-02 00:00:00.000 Mike Bossy 2 2 2-2 2 2 2003-02-02 00:00:00.000 Mike Bossy 2 3 2-3 5 2 2003-02-02 00:00:00.000 Mike Bossy 2 4 2-4 20 2 2003-02-02 00:00:00.000 Mike Bossy 2 5 2-5 36 3 2003-02-09 00:00:00.000 Denis Potvin 3 1 3-1 48 4 2003-03-01 00:00:00.000 Bobby Nystrom 4 1 4-1 12 4 2003-03-01 00:00:00.000 Bobby Nystrom 4 2 4-2 15 (11 row(s) affected)
Now suppose you wanted to delete the rows for the orders in February 2003.
You could identify them with the following select statement:
SELECT * FROM Orders O LEFT OUTER JOIN OrderLines OL ON O.OrdNo = OL.OrdNo WHERE OrdDate < '2003-03-01' OrdNo OrdDate OrdCustName OrdNo OrdSeq OrdItemNo Qty ------ ------------------------ --------------- ------ ------- ---------- --- 1 2003-02-01 00:00:00.000 Billy Smith 1 1 1-1 10 1 2003-02-01 00:00:00.000 Billy Smith 1 2 1-2 12 1 2003-02-01 00:00:00.000 Billy Smith 1 3 1-3 8 2 2003-02-02 00:00:00.000 Mike Bossy 2 1 2-1 9 2 2003-02-02 00:00:00.000 Mike Bossy 2 2 2-2 2 2 2003-02-02 00:00:00.000 Mike Bossy 2 3 2-3 5 2 2003-02-02 00:00:00.000 Mike Bossy 2 4 2-4 20 2 2003-02-02 00:00:00.000 Mike Bossy 2 5 2-5 36 3 2003-02-09 00:00:00.000 Denis Potvin 3 1 3-1 48 (9 row(s) affected)
We all know that you cannot add a JOIN clause to a DELETE statement, so the
following will not work:
DELETE FROM Orders O LEFT OUTER JOIN OrderLines OL ON O.OrdNo = OL.OrdNo WHERE OrdDate < '2003-03-01'
It looks like it will do what we are trying to accomplish but if you execute
this statement you will delete all rows in OrderLines. This is because the
EXISTS clause will always be true. There is no link from the sub query to the
outer query. The way this should be done is as follows:
DELETE FROM OrderLines WHERE EXISTS ( SELECT OrdNo FROM Orders WHERE OrdDate < '2003-03-01' AND Orders.OrdNo = OrderLines.OrdNo )
Notice that now we are linking the sub query to the outer query through the
WHERE clause. Specifically, we are linking the OrdNo columns via Orders.OrdNo =
OrderLines.OrdNo. This forces the DELETE statement to consider the values
returned from the sub query as they compare to the values in the table be
deleted from.
Finally, we would want to delete the row from the header table as well. The
whole transaction might look like this:
BEGIN TRANSACTION DELETE FROM OrderLines WHERE EXISTS ( SELECT OrdNo FROM Orders WHERE OrdDate < '2003-03-01' AND Orders.OrdNo = OrderLines.OrdNo ) DELETE FROM Orders WHERE OrdDate < '2003-03-01' COMMIT