May 5, 2017 at 12:17 pm
Hello all,
I'm supporting a database that I inherited so now having to handle some issues that are now arising. In particular, I have a stored proc which just started giving me fits. This proc builds a temp table with a bunch of records within a date range, then deletes records within that same date range from a physical table (Table A), then inserts the records from the temp table into that physical table -- essentially providing new updated data in that physical table.
The sp is now hanging on the DELETE FROM TABLE A WHERE DATE BETWEEN X AND X.
I don't know if there's a better way to delete and insert a large number of records but was reading something about a variation, using a derived table like so:
delete t1
from (select top (10000) *
from t1 order by a) t1
I was going to rearrange my DELETE statement to something like:
DELETE TABLE A
FROM
(SELECT *
FROM TABLE A
WHERE DATE BETWEEN X AND X)
But the syntax looked a little odd to me. Mind you, there's no primary key on TABLE A so I can't do a DELETE FROM TABLE A WHERE ID IN (SELECT ID FROM #TEMPTABLE)
Any thoughts out there on the best way to handle this?
(BTW, after the delete is a pretty typical insert statement: INSERT TABLE A (col1, col2...) SELECT X, Y, Z FROM #TEMPTABLE)
Any help would be greatly appreciated!
Thanks
May 5, 2017 at 12:29 pm
Tacy,
Assuming you are not in a position to add a PK I would suggest adding an Index on your Date Field and depending how "wide" you data is to cover the rest of the table in the include.
Regards,
Matt
May 5, 2017 at 12:33 pm
Thanks, Matt. But I'm not sure I'm following. There is an index on the date column. (Non Unique, Non Clustered). I'm looking to delete everything in the physical table in that date range.
What exactly are you suggesting?
May 5, 2017 at 12:58 pm
Can you give us an idea how many rows are in the table and how many, typically, are in one deletion batch?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 5, 2017 at 1:04 pm
There are around 1.5 million rows in the physical table. I'm trying to delete around 70K-100K rows on the delete statement.
May 5, 2017 at 1:21 pm
Wild idea:
Why don't you simply update the table instead of deleting and re-inserting?
May 5, 2017 at 2:20 pm
An update is not an option here.
Any other ideas? I'm getting desperate....
May 5, 2017 at 2:34 pm
tacy.highland - Friday, May 5, 2017 2:20 PMAn update is not an option here.
Any other ideas? I'm getting desperate....
This will require some code changes, but have you considered adding a new BIT column to your table called IsDeleted (or whatever).
Instead of deleting the rows in the proc, just set the IsDeleted flag to 1.
Have a separate process running periodically to batch delete the rows where IsDeleted = 1.
Any code accessing the table would have to be modified to exclude rows where IsDeleted = 1.
This would speed up your proc, yet achieve the same end result.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 5, 2017 at 3:35 pm
My original thought was that since there was no PK there where no indexes either. Sorry that won't work.
My only other thought is to do the opposite of what you are trying to do. Copy the Good Data into a new/temp table, truncate the old table, copy the good data back into you table or similarly do it with a series of renaming tables.
Hope this helps.
Regards,
Matt
May 5, 2017 at 4:11 pm
I can't truncate the table because I'm only updating/reinserting some records, not all. That's why I have to specify the date range in the delete statement.
May 6, 2017 at 9:41 am
Something like the following perhaps?
USE tempdb;
GO
-- Create and populate a 1.5M row TestTable...
IF OBJECT_ID('tempdb.dbo.TestData', 'U') IS NULL
BEGIN
CREATE TABLE dbo.TestData (
ID INT NOT NULL PRIMARY KEY CLUSTERED,
SomeDate DATE NOT NULL
);
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),
cte_Tally (n) AS (
SELECT TOP 1500000
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n3 a CROSS JOIN cte_n3 b
)
INSERT dbo.TestData (ID, SomeDate)
SELECT
ID = t.n,
SomeDate = DATEADD(dd, CHECKSUM(NEWID()) % 5000, '2010-01-01')
FROM
cte_Tally t;
-- Add a nonclustered index to the TestData table to speed up the following delete operation...
CREATE NONCLUSTERED INDEX ix_TestData_SomeDate ON dbo.TestData (SomeDate);
END;
--=====================================================================
-- Create a temp table to hold the deleted rows...
IF OBJECT_ID('tempdb..#Deleted', 'U') IS NOT NULL
DROP TABLE #Deleted;
CREATE TABLE #Deleted (
ID INT,
SomeDate DATE
);
-- Delete the desired range of rows and use the OUTPUT clause to add the deleted rows to the temp table...
DELETE td
OUTPUT Deleted.*
INTO #Deleted
FROM
dbo.TestData td
WHERE
td.SomeDate >= '2010-01-01'
AND td.SomeDate < '2012-01-01';
--=====================================================================
-- >>> Deleted rows are now in #Deleted and able to be "updated" as desired... <<<
--=====================================================================
-- Reinsert the previously deleted (and subsiquently updated) rows back to the original table...
INSERT dbo.TestData (ID, SomeDate)
SELECT
d.ID,
d.SomeDate
FROM
#Deleted d;
May 6, 2017 at 1:00 pm
tacy.highland - Friday, May 5, 2017 2:20 PMAn update is not an option here.
Any other ideas? I'm getting desperate....
Why is UPDATE not an option?
Your table doesn't have a PK, does it have a clustered index?
Is there an index on the date column?
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
May 6, 2017 at 8:59 pm
tacy.highland - Friday, May 5, 2017 12:17 PMThe sp is now hanging on the DELETE FROM TABLE A WHERE DATE BETWEEN X AND X.
If it was working fine before and is now suddenly "hanging", that's a pretty good indication of two things... 1) parameter sniffing and 2) statistics that haven't been rebuilt in a while.
The first can be fixed by adding an option to recompile and should probably be done at the statement level rather than the SP level. Even that won't help, though, if statistics are out of date. The fix for that, of course, is to rebuild the statistics. If the SP is used on a regular basis to delete and the insert a decent number of rows on a regular basis, the number of rows may still not be big enough to trigger an auto-update of stats. If you figure it out and do a little planning, you might want to update either index or column stats at the beginning of the proc every time it runs only for the stats that the SP uses and then rebuilt all stats that have a modified rowcount for the whole table.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2017 at 2:05 am
Jeff Moden - Saturday, May 6, 2017 8:59 PMtacy.highland - Friday, May 5, 2017 12:17 PMThe sp is now hanging on the DELETE FROM TABLE A WHERE DATE BETWEEN X AND X.If it was working fine before and is now suddenly "hanging", that's a pretty good indication of two things... 1) parameter sniffing and 2) statistics that haven't been rebuilt in a while.
The first can be fixed by adding an option to recompile and should probably be done at the statement level rather than the SP level. Even that won't help, though, if statistics are out of date. The fix for that, of course, is to rebuild the statistics. If the SP is used on a regular basis to delete and the insert a decent number of rows on a regular basis, the number of rows may still not be big enough to trigger an auto-update of stats. If you figure it out and do a little planning, you might want to update either index or column stats at the beginning of the proc every time it runs only for the stats that the SP uses and then rebuilt all stats that have a modified rowcount for the whole table.
Seems most likely to me. Less likely is that the table is a heap (we don't know yet) and repeated delete/insert cycles have increased the size of the table somewhat. There's a good article covering this here.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
May 13, 2017 at 6:55 pm
Are there other indexes on the table, apart of the non clustered on Date?
_____________
Code for TallyGenerator
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply