September 4, 2015 at 2:32 am
Add an index on the date column and loop over that field instead; it will be quicker than using a view.
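One reading of that suggestion, as a rough sketch with hypothetical table and column names: index the date column and then step through it one day at a time rather than driving the delete through a view.
CREATE INDEX IX_EventLog_EventDate ON dbo.EventLog (EventDate);   -- hypothetical table/column
DECLARE @day date = '2015-01-01';                                  -- hypothetical starting point
WHILE @day < '2015-02-01'
BEGIN
    DELETE FROM dbo.EventLog
    WHERE EventDate >= @day AND EventDate < DATEADD(DAY, 1, @day); -- one day's rows per pass
    SET @day = DATEADD(DAY, 1, @day);
END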
September 4, 2015 at 3:10 am
without "while"...
Delete top xxx from yyy where zzzz
go 1000:-)
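For anyone unfamiliar with that trick: GO n is an SSMS/sqlcmd batch-separator option that re-runs the preceding batch n times, so a sketch (placeholder names, and the repeat count is just a guess you would tune) looks like this:
-- Delete up to 5000 rows per batch; GO 500 repeats the batch 500 times.
DELETE TOP (5000) FROM dbo.EventLog      -- hypothetical table
WHERE EventDate < '2015-01-01';          -- hypothetical filter
GO 500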
September 4, 2015 at 5:00 am
radekkokes (9/4/2015)
without "while"...Delete top xxx from yyy where zzzz
go 1000:-)
The problem is that 1000 iterations might be too many or too few.
September 4, 2015 at 6:06 am
At one of the places I worked a few years back, we had a similar situation.
Although we had an update/upgrade window from 12 AM to 6 AM once or twice per month, this was not entirely off-the-grid time. It was simply known that we had to designate a two-hour window from 12 AM to 2 AM as off time (meaning the system might be, or would be, unreachable during that period).
After that it would be slow with intermittent downtime, as in it might become unreachable for 5-10 minutes now and then.
We also had live tables that could not be taken down at all but had to be cleaned.
So we set up a job running at a 15-minute interval that would select several thousand record IDs at a time with a (NOLOCK) hint and put them into a cache table. Then we had a scheduled job that would run every hour at first and delete 100 records at a time, only at night when traffic to the table was much lower.
After the initial cleanup (it took about a week to clean up almost 50 million records out of 80 million), we switched the job to run every 2 hours from 12 AM to 8 AM. The table was clean and small after that; we only needed active records in there, as the database was backed up every 3 hours. Add to that regular DB maintenance, shrinking log and data files whenever possible, and we had the DB running lean and clean.
The technique in this article is OK to use as an ad hoc thing, but not on a regular basis and especially not on a live table. You might run into big trouble one day.
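A rough sketch of the cache-table pattern described above, with hypothetical object names (the NOLOCK hint carries the usual dirty-read caveats; it was an accepted trade-off in that environment):
-- Collector job (every 15 minutes): stage candidate record IDs.
INSERT INTO dbo.PendingDeletes (RecordId)
SELECT TOP (5000) e.RecordId
FROM dbo.LiveEvents AS e WITH (NOLOCK)
WHERE e.CreatedDate < DATEADD(MONTH, -6, GETDATE())
  AND NOT EXISTS (SELECT 1 FROM dbo.PendingDeletes AS p WHERE p.RecordId = e.RecordId);
-- Cleanup job (off-peak schedule): delete a small batch from the live table.
DELETE TOP (100) e
FROM dbo.LiveEvents AS e
INNER JOIN dbo.PendingDeletes AS p ON p.RecordId = e.RecordId;
-- (Processed IDs would also be removed from dbo.PendingDeletes.)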
September 4, 2015 at 6:14 am
Vlad-207446 (9/4/2015)
At one of the places I worked a few years back, we had a similar situation. Although we had an update/upgrade window from 12 AM to 6 AM once or twice per month, this was not entirely off-the-grid time. It was simply known that we had to designate a two-hour window from 12 AM to 2 AM as off time (meaning the system might be, or would be, unreachable during that period).
After that it would be slow with intermittent downtime, as in it might become unreachable for 5-10 minutes now and then.
We also had live tables that could not be taken down at all but had to be cleaned.
So we set up a job running at a 15-minute interval that would select several thousand record IDs at a time with a (NOLOCK) hint and put them into a cache table. Then we had a scheduled job that would run every hour at first and delete 100 records at a time, only at night when traffic to the table was much lower.
After the initial cleanup (it took about a week to clean up almost 50 million records out of 80 million), we switched the job to run every 2 hours from 12 AM to 8 AM. The table was clean and small after that; we only needed active records in there, as the database was backed up every 3 hours. Add to that regular DB maintenance, shrinking log and data files whenever possible, and we had the DB running lean and clean.
The technique in this article is OK to use as an ad hoc thing, but not on a regular basis and especially not on a live table. You might run into big trouble one day.
Actually, deleting rows in small batches in a loop, accessing the rows via an indexed column, is about the best way to delete rows from a table. If the batch size is small enough it won't affect current users, and it can keep running until all the required rows are deleted.
September 4, 2015 at 6:30 am
Jonathan AC Roberts >> I never said otherwise.
It's just that sometimes it is not optimal to run this kind of process manually on a LIVE database and assume that is the best option you have. When dealing with this range of records, and we are talking about millions of records here and a LIVE DB to boot, I am quite sure nobody expects you to make it lean and clean NOW. There is always a time period in which it has to be done, and if this is a 24/7/365 kind of environment, it has to be done on a regular basis after the initial cleaning. Hence it might be more prudent to come up with a process that will do all of that, keep doing it long after the initial cleanup is done, and do it reliably. That's all.
At the job in my example we went through many techniques, including one just like the one in the article, and for our case the one we picked was the most optimal and workable. As a matter of fact it is still there even after a five-year run, a bit streamlined and optimized, but still used. Also, since we developed and maintained the front end for the customer, we changed the application to not do the in-time deletes but simply insert the record ID into the job-monitored table, to be deleted on schedule. All SELECTs were adjusted to filter active records on IDs in the pending-deletes table as well.
September 4, 2015 at 6:40 am
P Jones (3/6/2013)
Itzik Ben-Gan teaches:
WHILE 1=1
BEGIN
    DELETE TOP (5000) FROM table WHERE .....
    IF @@ROWCOUNT < 5000 BREAK
END
and you can add a WAITFOR to delay between iterations.
Adjust the 5000 to suit your table.
+1
Gerald Britton, Pluralsight courses
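A minimal sketch of that loop with the suggested WAITFOR added (the table name and filter are placeholders, and the two-second delay is just an example value):
WHILE 1 = 1
BEGIN
    DELETE TOP (5000) FROM dbo.EventLog      -- hypothetical table
    WHERE EventDate < '2015-01-01';          -- hypothetical filter
    IF @@ROWCOUNT < 5000 BREAK;
    WAITFOR DELAY '00:00:02';                -- let waiting sessions get in between batches
END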
September 4, 2015 at 6:41 am
jims-723592 (3/6/2013)
Why keep counts at all? Let all the logic live in the view; then, while any entry remains in the view, remove it.
ALTER PROCEDURE [dbo].[usp_VTrickleDeleteEvent]
AS
SET NOCOUNT ON
SET QUOTED_IDENTIFIER OFF
BEGIN
    WHILE EXISTS (SELECT 1 FROM VEventCleanup)
    BEGIN
        BEGIN TRAN RemoveEvent
            DELETE FROM VEventCleanup -- delete from view with order by
        COMMIT TRAN -- COMMIT to release locks
    END
END
+1
Gerald Britton, Pluralsight courses
September 4, 2015 at 6:51 am
David Walker-278941 (3/6/2013)
But!!!!! ORDER BY IS IGNORED IN A VIEW, starting with SQL 2005, even with TOP 100 PERCENT specified (starting with, I believe, SQL 2008). See Books Online and many online posts. How does that affect this article?
(From one blog entry: when you save a VIEW definition, the ORDER BY is ignored on the result set. That is per the ANSI SQL standard.)
That's what I understand. If that's wrong, please let me know. Thanks.
That's wrong. Here's the quote:
The ORDER BY clause is used only to determine the rows that are returned by the TOP or OFFSET clause in the view definition. The ORDER BY clause does not guarantee ordered results when the view is queried, unless ORDER BY is also specified in the query itself.
Ref: CREATE VIEW (Transact-SQL)
Gerald Britton, Pluralsight courses
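To make that concrete, here is a small illustration with made-up object names: the ORDER BY inside the view only decides which rows the TOP keeps, so ordered output still requires an ORDER BY on the outer query.
-- Hypothetical view: ORDER BY here only qualifies the TOP (100).
CREATE VIEW dbo.vOldestEvents AS
SELECT TOP (100) EventId, EventDate
FROM dbo.EventLog
ORDER BY EventDate;
GO
-- No order is guaranteed here:
SELECT EventId, EventDate FROM dbo.vOldestEvents;
-- Ordered results require ORDER BY in the query itself:
SELECT EventId, EventDate FROM dbo.vOldestEvents ORDER BY EventDate;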
September 4, 2015 at 6:52 am
Edward.Polley 76944 (3/6/2013)
I am a big fan of the two-temp-table approach and wrote about it in a previous article published at SQL Central. The difference here is to pull the rows in primary key sequence using an ORDER BY, so I had to switch to this loop-through-the-view approach.
That doesn't work the way you think:
Important:
The ORDER BY clause is used only to determine the rows that are returned by the TOP or OFFSET clause in the view definition. The ORDER BY clause does not guarantee ordered results when the view is queried, unless ORDER BY is also specified in the query itself.
Gerald Britton, Pluralsight courses
September 4, 2015 at 6:53 am
Edward.Polley 76944 (3/6/2013)
Try this: CREATE VIEW dbo.test AS (SELECT TOP (100) * FROM table ORDER BY primary_key_column). Look at the results; they will be ordered.
Important
The ORDER BY clause is used only to determine the rows that are returned by the TOP or OFFSET clause in the view definition. The ORDER BY clause does not guarantee ordered results when the view is queried, unless ORDER BY is also specified in the query itself.
Gerald Britton, Pluralsight courses
September 4, 2015 at 7:54 am
I have been using this method for a while now, and I include a WAITFOR DELAY '00:00:02' between each loop iteration for the delete. This ensures that if there are any waiting processes, they can take the resource during the wait period and do what they need to do.
If you are not concerned about contention, disable the indexes on the table; the delete will happen a lot quicker!
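A rough sketch of that index-disable idea, assuming hypothetical table and index names. Note that only nonclustered indexes should be disabled (disabling the clustered index makes the table unusable), and each one must be rebuilt afterwards:
-- Disable a nonclustered index so the delete does not have to maintain it.
ALTER INDEX IX_EventLog_Status ON dbo.EventLog DISABLE;
DELETE FROM dbo.EventLog
WHERE EventDate < '2015-01-01';          -- hypothetical purge filter
-- Rebuild to bring the index back into use.
ALTER INDEX IX_EventLog_Status ON dbo.EventLog REBUILD;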
September 4, 2015 at 12:20 pm
g.britton (9/4/2015)
P Jones (3/6/2013)
Itzik Ben-Gan teaches:
WHILE 1=1
BEGIN
    DELETE TOP (5000) FROM table WHERE .....
    IF @@ROWCOUNT < 5000 BREAK
END
and you can add a WAITFOR to delay between iterations.
Adjust the 5000 to suit your table.
+1
+1
September 7, 2015 at 3:16 am
There is a theoretical possibility that this loop never ends if we don't predetermine the number of iterations in advance, for example if new qualifying rows arrive at least as fast as the loop deletes them.
September 7, 2015 at 5:42 am
carbogast (3/6/2013)
I did something very similar which is working great. The commit after each small batch is the real secret. I also put a WAITFOR DELAY '00:00:02' into the loop which seems to help keep the purge stored procedure from encroaching on the overall server performance.
I don't think there is any benefit to adding WAITFOR DELAY as any processes will be able to jump in as soon as each transaction finishes. A better option would be to reduce the batch size if there is any problem with contention.