July 1, 2011 at 8:23 am
I actually had to do this about five months ago. Initially I tried to delete them all in one operation. The log file got huge, filled up the hard drive, and the server hung. Then, on reboot, I had to wait a couple of hours for the work the query had done before the hang to roll back. Oops.
What I wound up doing was writing two stored procedures. One deleted a fairly small number of records - 10,000 or so if I recall. Enough to be sizeable, but not so much that it would slow things down even during periods of high usage. The other was the same, but with a larger batch size. That one was 150,000.
I scheduled these to run every fifteen minutes. The small one during business hours, and the large during low usage times.
We're on an older version of SQL Server so I couldn't use DELETE TOP as suggested above. I had to do something like
DELETE FROM records WHERE ID IN (
SELECT TOP 10000 ID
FROM records WHERE record_date < getdate() - 180
)
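For what it's worth, a rough sketch of the kind of procedure I mean is below. The names are just illustrative, and the literal TOP works even on older versions that don't support DELETE TOP; the second copy just used a larger batch size for the off-hours runs.
CREATE PROCEDURE dbo.PurgeOldRecords_Small
AS
BEGIN
    -- Delete one small batch of rows older than 180 days.
    -- A second copy with a larger TOP value handles the off-hours runs.
    DELETE FROM records
    WHERE ID IN (
        SELECT TOP 10000 ID
        FROM records
        WHERE record_date < GETDATE() - 180
    );
END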
July 1, 2011 at 8:44 am
Phil Melling (7/1/2011)
3) If possible, put the database into SIMPLE backup mode so that the deletes are not logged. (Make SURE to switch it back afterwards!)
No! Don't put it into SIMPLE! Just because the database is in SIMPLE mode does NOT mean that the deletes are not logged - they are. It's just that the txlog is truncated once a checkpoint occurs. So it won't save you on an expanding transaction log. Plus you will have to take a full database backup immediately afterwards because your log sequence will be out.
I'm sorry, but putting it into simple is bad advice.
I should have been more precise.
Putting the database into the SIMPLE recovery model does indeed pose a significant risk. I was working on the assumption that the database would not be in production while the delete is running, which is one of the reasons I prefaced the advice with "If possible." According to Microsoft's documentation on the SIMPLE recovery model:
Automatically reclaims log space to keep space requirements small, essentially eliminating the need to manage the transaction log space.
With the above caveats, it is quite reasonable to place the database into SIMPLE recovery mode, since the only information that would be lost in the event of a disaster would be the deletes themselves.
Now, if the database is in production during the delete operation, then yes, putting it into SIMPLE mode is definitely bad advice. Also, I (really, really, really) should have included a full backup afterwards as a requirement. No more pre-coffee advice from me!
July 1, 2011 at 8:53 am
I think it was the claim that deletes are not logged in simple mode (a major misconception about simple mode) that caused the most concern.
July 1, 2011 at 4:50 pm
dhearne 25776 (7/1/2011)
Phil Melling (7/1/2011)
3) If possible, put the database into SIMPLE backup mode so that the deletes are not logged. (Make SURE to switch it back afterwards!)
No! Don't put it into SIMPLE! Just because the database is in SIMPLE mode does NOT mean that the deletes are not logged - they are. It's just that the txlog is truncated once a checkpoint occurs. So it won't save you on an expanding transaction log. Plus you will have to take a full database backup immediately afterwards because your log sequence will be out.
I'm sorry, but putting it into simple is bad advice.
I should have been more precise.
Putting the database into the SIMPLE recovery model does indeed pose a significant risk. I was working on the assumption that the database would not be in production while the delete is running, which is one of the reasons I prefaced the advice with "If possible." According to Microsoft's documentation on the SIMPLE recovery model:
Automatically reclaims log space to keep space requirements small, essentially eliminating the need to manage the transaction log space.
With the above caveats, it is quite reasonable to place the database into SIMPLE recovery mode, since the only information that would be lost in the event of a disaster would be the deletes themselves.
Now, if the database is in production during the delete operation, then yes, putting it into SIMPLE mode is definitely bad advice. Also, I (really, really, really) should have included a full backup afterwards as a requirement. No more pre-coffee advice from me!
Thank you for taking the time to write such a detailed reply.
July 1, 2011 at 8:47 pm
Please bear with me, I'll do my best to be kind. This is more a clarification of the wording as I read it than a bash on the intent, which I did understand. The difference is that I knew how to interpret it, while I fear newbies may not have enough experience to know the difference.
dhearne 25776 (7/1/2011)
With the above caveats, it is quite reasonable to place the database into SIMPLE recovery mode, since the only information that would be lost in the event of a disaster would be the deletes themselves.
This is inaccurate because of one specific scenario, and that scenario is the log chain. If you place a database into simple mode, from that moment on you can no longer perform a point-in-time restore until you switch it back to the FULL or BULK_LOGGED recovery model and take a fresh full backup. Which leads us to point 2...
Now, if the database is in production during the delete operation, then yes, putting it into SIMPLE mode is definitely bad advice. Also, I (really, really, really) should have included a full backup afterwards as a requirement. No more pre-coffee advice from me!
Which covers point 1, but not with enough specifics as to why, which I feel people should know. In general, if you're going to play with recovery models, there are two paths you should strive for: swap between BULK_LOGGED and FULL, and know when and why, or go to SIMPLE and realize you're leaving yourself hung out to dry.
A large log is not your enemy, but there are ways to minimize the impact. In particular, look into minimally logged transactions.
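For example, one commonly suggested pattern - just a sketch here, borrowing the table and column names from the earlier posts - is to copy the rows you want to keep with SELECT INTO, which can be minimally logged under the SIMPLE or BULK_LOGGED recovery models, and then swap the tables. You would need to recreate indexes, constraints, and permissions on the new table.
-- Copy only the rows to keep; SELECT INTO can be minimally logged.
SELECT *
INTO dbo.records_keep
FROM dbo.records
WHERE record_date >= GETDATE() - 180;

-- After verifying the copy (and during a maintenance window),
-- drop the original and rename the copy into its place.
-- DROP TABLE dbo.records;
-- EXEC sp_rename 'dbo.records_keep', 'records';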
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
July 1, 2011 at 8:51 pm
July 3, 2011 at 2:18 am
Phil Melling (7/1/2011)
I'm sorry, but putting it into simple is bad advice.
I should probably add that it's not my intention to be rude or to put people off from making their comments or suggestions. It takes a brave person to stick their neck out on these forums and offer their advice when there are a lot of database professionals out there with a wealth of knowledge and experience.
But I do think if anyone is going to offer advice it should be with caution and with consideration to the potential consequences - which is why clarification is always important. After all, I'd hate to be the person who offered advice, that someone took, and it left their database in a state that was compromised.
July 3, 2011 at 8:54 am
Phil Melling (7/3/2011)
Phil Melling (7/1/2011)
I'm sorry, but putting it into simple is bad advice.
I should probably add that it's not my intention to be rude or to put people off from making their comments or suggestions. It takes a brave person to stick their neck out on these forums and offer their advice when there are a lot of database professionals out there with a wealth of knowledge and experience.
But I do think if anyone is going to offer advice it should be with caution and with consideration to the potential consequences - which is why clarification is always important. After all, I'd hate to be the person who offered advice, that someone took, and it left their database in a state that was compromised.
I agree with you Phil!
July 8, 2011 at 12:18 am
Syed Jahanzaib Bin hassan (6/30/2011)
create procedure DeleteOlddata_SP
as
Begin
Delete top (10000)
from SC_SampledNumericDataFact_Table
where DatetimeAdded < getdate() - 180
End
Schedule this stored procedure as a SQL Server Agent job with a minimum interval of 15 minutes. When you perform the DML, SQL Server takes locks against the target table, so break the work into batches of 10,000 or 1,000 rows; it will delete the records quickly without any significant performance impact on the production server.
For me: best answer, I have deleted almost 3 billion records using this approach without any problems.
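If you would rather run the purge as a single attended operation instead of a scheduled job, a rough equivalent is a loop that keeps going until a pass deletes nothing. This is only a sketch, reusing the table and column names from the quoted procedure; tune the batch size and consider a short WAITFOR DELAY between batches to let the log clear.
DECLARE @rows INT;
SET @rows = 1;
WHILE @rows > 0
BEGIN
    -- Delete one batch of old rows; stop once nothing is left to delete.
    DELETE TOP (10000)
    FROM SC_SampledNumericDataFact_Table
    WHERE DatetimeAdded < GETDATE() - 180;

    SET @rows = @@ROWCOUNT;
END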
July 9, 2011 at 8:27 am
This is inaccurate because of one specific scenario, and that scenario is the log chain. If you place a database into simple mode, from that moment on you can no longer perform a point-in-time restore until you switch it back to the FULL or BULK_LOGGED recovery model and take a fresh full backup. Which leads us to point 2...
Now, if the database is in production during the delete operation, then yes, putting it into SIMPLE mode is definitely bad advice. Also, I (really, really, really) should have included a full backup afterwards as a requirement. No more pre-coffee advice from me!
Which covers point 1, but not with enough specifics as to why, which I feel people should know. In general, if you're going to play with recovery models, there are two paths you should strive for: swap between BULK_LOGGED and FULL, and know when and why, or go to SIMPLE and realize you're leaving yourself hung out to dry.
A large log is not your enemy, but there are ways to minimize the impact. In particular, look into minimally logged transactions.
You are right. SIMPLE mode certainly breaks the log chain. However, there is no reason to log 10 million deletes. Let's say I have taken my database out of production and prevented user or application access. I then take a full backup, and place it into simple mode. I then initiate the delete procedures as appropriate to my situation. Part way through, the server crashes. So, I rebuild the server and restore my database using the last full backup. Now, I cannot roll through the transaction logs to redo the deletes I have lost because the database was in SIMPLE mode when it crashed. I fail to see the problem with that, since rolling through the transaction logs (if they were available) would be equivalent to simply re-running the delete command.
Now, let's say that the delete goes successfully, without a crash. After it is done I need to set the recovery model back to FULL and do a full backup again, restarting the log chain.
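Just to make that sequence concrete, the outline looks roughly like this. It is only a sketch: the database name and backup paths are placeholders, and users stay locked out for the whole window.
-- 1. Full backup while still in FULL recovery.
BACKUP DATABASE MyDb TO DISK = N'X:\Backups\MyDb_before_purge.bak';

-- 2. Switch to SIMPLE so the log is truncated at each checkpoint.
ALTER DATABASE MyDb SET RECOVERY SIMPLE;

-- 3. Run the batched delete procedure(s) here.

-- 4. Switch back and restart the log chain with a fresh full backup.
ALTER DATABASE MyDb SET RECOVERY FULL;
BACKUP DATABASE MyDb TO DISK = N'X:\Backups\MyDb_after_purge.bak';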
So, yes, the log chain is broken for the period when the database is in simple mode. However, the deletes in that time frame are a) not important to the business and b) can be easily reproduced in the event of a disaster. These two things make the logging of the operations unnecessary overhead. Especially given that the OP has space issues.
Now, does this mean this is something you should do on a regular basis? No. Fiddling with the transaction log is full of potential problems. Further, fiddling with the recovery model without an understanding of the consequences of what you are doing is even worse.
An educated and knowledgeable DBA is armed for that kind of activity specifically because s/he does know what the potential consequences are.
July 9, 2011 at 3:26 pm
dhearne 25776 (7/9/2011)
These two things make the logging of the operations unnecessary overhead.
deletes are still logged in simple mode, as are all the transactions.
July 9, 2011 at 4:53 pm
steveb. (7/9/2011)
dhearne 25776 (7/9/2011)
These two things make the logging of the operations unnecessary overhead.
deletes are still logged in simple mode, as are all the transactions.
The point is not whether or not something is logged, but whether you can recover to a specific point. If the database is in SIMPLE mode, you cannot.
July 9, 2011 at 5:24 pm
damen.mis (7/8/2011)
Syed Jahanzaib Bin hassan (6/30/2011)
create procedure DeleteOlddata_SP
as
Begin
Delete top (10000)
from SC_SampledNumericDataFact_Table
where DatetimeAdded < getdate() - 180
End
Schedule this stored procedure as a SQL Server Agent job with a minimum interval of 15 minutes. When you perform the DML, SQL Server takes locks against the target table, so break the work into batches of 10,000 or 1,000 rows; it will delete the records quickly without any significant performance impact on the production server.
For me: best answer, I have deleted almost 3 billion records using this approach without any problems.
Not with only 10,000 rows every 15 minutes you didn't... 😉
3,000,000,000   Rows claimed to have been deleted
       10,000   Divided by rows per deletion
      300,000   = Total deletion "episodes"
         0.25   Times hours per deletion "episode"
       75,000   = Total elapsed time for all deletion "episodes", in hours
        8,766   Divided by total hours in an average year
   8.55578371   = Total number of years to delete
--Jeff Moden
Change is inevitable... Change for the better is not.
July 9, 2011 at 5:39 pm
dhearne 25776 (7/9/2011)
You are right. SIMPLE mode certainly breaks the log chain. However, there is no reason to log 10 million deletes. Let's say I have taken my database out of production and prevented user or application access. I then take a full backup, and place it into simple mode.
Where in this discussion did the OP say the database was being taken out of production usage? My production DBs pretty much run 24/7. When I need to do mass deletes, I'm not necessarily worried about the deletes being logged, I'm worried about everything else that's going on being logged... particularly if my delete crashes the server.
Now, let's say that the delete goes successfully, without a crash. After it is done I need to set the recovery model back to FULL and do a full backup again, restarting the log chain.
Absolutely, IF you take the DB offline from users.
a) not important to the business
An unimportant delete? You must not run things with a lot of auditing. I rarely have unimportant deletes, unless a load went in badly... and even then I keep an audit of the bad load in case anything was run against the bad data in the meantime and needs either proof or the ability to research it to correct.
b) can be easily reproduced in the event of a disaster.
Again, I agree here, but ONLY if you make the database unavailable to the end user. In most circumstances maintenance items like this should not require database downtime.
An educated and knowledgeable DBA is armed for that kind of activity specifically because s/he does know what the potential consequences are.
I agree. The OP here is obviously NOT an educated and knowledgeable DBA, simply judging by the nature of the request, and neither may be other people who google this thread looking for a solution. With the explanation you just gave, with the caveat of taking the database completely out of any end user's hands, yes, this is a safe method. It's overkill but does not leave you open to losing records because of the simple mode of the transaction log... mostly because said transaction log is unimportant if no other users are in the system and only *1* set of activities is occurring, and you're doing full backups after each action.
However, at that point, you really need to get the DB into single-user mode. It's unlikely any DBA is going to be able to turn off every SQL Agent job across the company that may also be trying to touch the database at that time.
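For reference, one way to do that is below; MyDb is just a placeholder name.
-- Kick everyone out and roll back their open transactions immediately.
ALTER DATABASE MyDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- ... perform the purge and backups ...

-- Let users back in afterwards.
ALTER DATABASE MyDb SET MULTI_USER;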
So, my apologies if my intent to clarify your explanation annoyed you.
Edit: Removed possible flippant remark.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA