I have a select statement below which gives the expected results of duplicate data. I want to convert this statement to a DELETE Statement to delete items leaving a single item but getting error. Below is my query. Sample data
SampleOne 2016-10-14 12:44:39.460
SampleOne 2016-10-14 12:44:39.460
SampleTwo 2016-10-14 12:44:39.460
SampleTwo 2016-10-14 12:44:39.460
SELECT t.*
FROM (
SELECT s.* , COUNT(*) OVER (PARTITION BY s.DocumentName, s.SubmitDateTime, s.DocumentType) AS DocCount
FROM Document s
LEFT OUTER JOIN
Staging b ON s.DocumentId = b.DocumentId
LEFT OUTER JOIN Rejections r ON s.DocumentId =r.DocID
WHERE b.DocumentId IS NULL
AND r.DocID IS NULL
AND s.SubmitDateTime IS NOT NULL
AND s.InsertDateTime IS NOT NULL
AND s.DocumentName IS NOT NULL
AND s.Description ='Employee Document'
) t
WHERE t.DocCount > 1
ORDER BY t.DocumentName, t.SubmitDateTime
June 27, 2023 at 7:22 pm
Replace top line "SELECT t.*" with "DELETE t" and remove ORDER BY clause at the end
June 27, 2023 at 7:53 pm
Hi
Thanks for your response.
Doing below as per your suggestion give me error
View or function 't' is not updatable because the modification affects multiple base tables.
DELETE t
FROM (
SELECT s.* , COUNT(*) OVER (PARTITION BY s.DocumentName, s.SubmitDateTime, s.DocumentType) AS DocCount
FROM Document s
LEFT OUTER JOIN
Staging b ON s.DocumentId = b.DocumentId
LEFT OUTER JOIN Rejections r ON s.DocumentId =r.DocID
WHERE b.DocumentId IS NULL
AND r.DocID IS NULL
AND s.SubmitDateTime IS NOT NULL
AND s.InsertDateTime IS NOT NULL
AND s.DocumentName IS NOT NULL
AND s.Description ='Employee Document'
) t
WHERE t.DocCount > 1
June 27, 2023 at 8:35 pm
FIRST, this will delete ALL records rather than "leaving a single item".
I would use ROW_NUMBER()
rather than COUNT()
, because ROW_NUMBER()
guarantees a unique way to identify each row, whereas COUNT()
does not (unless you really do want to delete ALL records).
Replace your LEFT OUTER JOIN
s and NULL checks with NOT EXISTS
.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
June 27, 2023 at 11:53 pm
I think you might want something like this:
WITH CTE AS
(
SELECT s.DocumentId,
ROW_NUMBER() OVER (PARTITION BY s.DocumentName, s.SubmitDateTime, s.DocumentType
ORDER BY s.DocumentName, s.SubmitDateTime) AS RowNum
FROM Document s
WHERE NOT EXISTS(SELECT 1
FROM Staging b
WHERE b.DocumentId = s.DocumentId)
AND NOT EXISTS(SELECT 1
FROM Rejections r
WHERE r.DocID = s.DocumentId)
AND s.SubmitDateTime IS NOT NULL
AND s.InsertDateTime IS NOT NULL
AND s.DocumentName IS NOT NULL
AND s.Description ='Employee Document'
)
DELETE CTE
WHERE RowNum > 1
;
June 28, 2023 at 2:34 am
Hi Bob,
Your query gives me an empty results.
Hi Allan,
I tried your suggestion and I have it working. I will appreciate if anyone can look over my query below for me.
WITH CTE AS (
SELECT s.* , ROW_NUMBER() OVER (PARTITION BY s.DocumentName, s.SubmitDateTime, s.DocumentType order by s.DocumentName, s.SubmitDateTime, s.DocumentType) AS DocCount
FROM Document s
WHERE s.SubmitDateTime IS NOT NULL
AND s.InsertDateTime IS NOT NULL
AND s.DocumentName IS NOT NULL
AND s.Description ='Employee Document'
AND s.DocumentId NOT IN(SELECT v.DocumentId from Staging v)
AND s.DocumentId NOT IN(SELECT r.DocID FROM Rejections r )
)
DELETE FROM CTE
WHERE DocCount > 1
July 1, 2023 at 5:14 am
>> I have a select statement below which gives the expected results of duplicate data. <<
I hate to tell you this, but in RDBMS "expected results of duplicate data" is a really bad thing. The whole goal RDBMS and therefore SQL, was to get rid of duplicates and redundancy. I also see you don't know the part about posting DDL is a convention on SQL forms for last 35 years. So we have no idea what the key is.
It also looks like you've given a single data element two different names same schema! This is a classic beginners error. Is DocumentId the same as DocID ?
If DocumentId really is an identifier, then how can you ever test for null because it will never be null. By the nature of identifiers they exist and have to be known. My guess is that this should not be a problem at all. Your DL should have constraints which will prevent duplication. Remembe we are writing SQL, not COBOL or some other traditional programming language file description..
>> I want to convert this statement to a DELETE Statement to delete items leaving a single item but getting error. Below is my query. Sample data <<
Then why did create redundant dups in the first place??
Here is a guess at what you might have meant, if you were writing SQL:
CREATE TABLE Foobar_Documnts
(document_id CHAR (15) NOT NULL PRIMARY KEY,
creation_timestamp DATETIME2(7) NOT NULL
DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (document_id, creation_timestamp));
You don't seem to understand the concept of the key, which is the foundation of the relational model. Nor, do you seem to understand how to do DDL at all.
Please post DDL and follow ANSI/ISO standards when asking for help.
July 1, 2023 at 4:03 pm
Hi
I have resolved the issue and doesn't need your comment or answer. Your comment is too late.
July 3, 2023 at 6:26 pm
Hi
I have resolved the issue and doesn't need your comment or answer. Your comment is too late.
You just got done saying "I tried your suggestion and I have it working. I will appreciate if anyone can look over my query below for me."
You got what you asked for. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply