November 2, 2015 at 7:26 am
I have a situation where I have table with over a billion records and needs to be scrubbed. Table does have a field with date time timestamp. I have been deleting rows from the table using the script below which basically provides me delete statements by date for records older than 90 days.
But now on each day row count is over 30 million rows and it takes forever to delete by date and transaction log becomes humongous.
So I would like to scrub it in 5 minute intervals instead of daily for records older than 90 days. Even in 5 minute intervals the record count tends to be around a million. This will keep the delete slice small enough to help not a gigantic transaction log.
I was wondering if someone has any thoughts on how to accomplish this?
declare @startdate Datetime
declare @enddate Datetime
set @startdate = getdate()-480
set @enddate = getdate()-90
--set @vStart = select convert(varchar,@startdate, 102)
print @startdate
print @enddate
WHILE (@startdate < @enddate)
BEGIN
print 'delete from vending where DetectedDate < ''' + CONVERT(varchar(10), @startdate, 101) +''''
set @startdate = @startdate+1
END
I am hoping to modify the script above to produce a script with statements like this for a window between last 90 and 120 days:
delete from vending where DetectedDate <'6/15/2015 8:25:00 PM'
go
delete from vending where DetectedDate <'6/15/2015 8:30:00 PM'
go
delete from vending where DetectedDate <'6/15/2015 8:35:00 PM'
go
thanks in advance
November 2, 2015 at 2:34 pm
It's best to delete by the clustering key.
Therefore, what is the table clustered on? If it's clustered on identity -- gack! -- but you have an index on datetime, you can do the conversion yourself from datetime to id and then delete by id.
If possible, put the db in bulk-logged or even simple recovery model while you're doing the deletes.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 2, 2015 at 6:32 pm
sqlstar2011 (11/2/2015)
I have a situation where I have table with over a billion records and needs to be scrubbed. Table does have a field with date time timestamp. I have been deleting rows from the table using the script below which basically provides me delete statements by date for records older than 90 days.But now on each day row count is over 30 million rows and it takes forever to delete by date and transaction log becomes humongous.
So I would like to scrub it in 5 minute intervals instead of daily for records older than 90 days. Even in 5 minute intervals the record count tends to be around a million. This will keep the delete slice small enough to help not a gigantic transaction log.
I was wondering if someone has any thoughts on how to accomplish this?
declare @startdate Datetime
declare @enddate Datetime
set @startdate = getdate()-480
set @enddate = getdate()-90
--set @vStart = select convert(varchar,@startdate, 102)
print @startdate
print @enddate
WHILE (@startdate < @enddate)
BEGIN
print 'delete from vending where DetectedDate < ''' + CONVERT(varchar(10), @startdate, 101) +''''
set @startdate = @startdate+1
END
I am hoping to modify the script above to produce a script with statements like this for a window between last 90 and 120 days:
delete from vending where DetectedDate <'6/15/2015 8:25:00 PM'
go
delete from vending where DetectedDate <'6/15/2015 8:30:00 PM'
go
delete from vending where DetectedDate <'6/15/2015 8:35:00 PM'
go
thanks in advance
1. Which Edition of SQL Server do you have?
2. How much space does the table currently occupy?
3. How much disk free space do you have?
4. Can you provide the DDL for the table in question including all constraints, indexes, and FK's that may point to the table?
--Jeff Moden
Change is inevitable... Change for the better is not.
September 30, 2016 at 8:19 am
I deal with a lot of large data tables and have found this article by Aaron Bertrand one of the best I have seen. It shows the research and results clearly. I highly recommend his breaking up large deletes methodology.
https://sqlperformance.com/2013/03/io-subsystem/chunk-deletes
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply