March 2, 2009 at 11:19 pm
We have several transactional tables in an OLTP database which accumulate 10-30 million rows each year, running on a SAN-connected SQL Server 2005 cluster.
We are required to keep data for 18 months.
Traditionally we have deleted anything older than 18 months each time we have had planned downtime on the system (once every 3-4 months). In an attempt to minimize downtime, we have now made a stored procedure that runs as an online purge job several times each day, deleting 5000 rows from each of the tables in question on each run.
My question is whether this is a bad idea performance-wise. I assume this continuous deletion of data may increase fragmentation and thus decrease performance, as opposed to deleting a lot of data every 3-4 months.
We run online defragmentation daily.
Note: Most tables have clustered index on date, so the deletion will occur at the end of the clustered index, but there are also other non-clustered indexes which I assume will get fragmented and need reindexing.
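For reference, each run is essentially a batched delete along these lines (a simplified sketch - the table and column names here are just placeholders for our real ones):

-- One iteration of the purge: remove at most 5000 expired rows from one table.
-- DELETE TOP (n) picks an arbitrary 5000 qualifying rows, which is fine here
-- since everything older than the cutoff will eventually be removed.
DELETE TOP (5000)
FROM dbo.TransactionLog
WHERE CreatedDate < DATEADD(MONTH, -18, GETDATE());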
Thanks in advance for any thoughts and comments 🙂
Thorbjorn Kvam,
Project manager and DBA (design) at Payex (http://www.payex.com)
March 3, 2009 at 6:10 am
I would say both approaches are valid; you need to do what your business requires. You're already on the right track: make sure you update the statistics and defrag the indexes, and do it in that order - you don't want to do an index rebuild and then have some sampled statistics mess up the perfect set that comes out of the rebuild. I'd say the same thing if you were doing it quarterly, except you would need to plan for it being more severe and taking longer.
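As a minimal sketch of that order (the table name is just a placeholder), the post-purge maintenance would run something like:

-- Update statistics first, then rebuild, so a sampled statistics update never
-- overwrites the fullscan statistics produced by the rebuild.
-- (ONLINE = ON requires Enterprise Edition; omit it otherwise.)
UPDATE STATISTICS dbo.TransactionLog;
ALTER INDEX ALL ON dbo.TransactionLog REBUILD WITH (ONLINE = ON);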
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 3, 2009 at 10:21 am
I would be more worried about the potential for blocking and locking that these intraday deletes may cause. My preference would always be to run the deletes during maintenance windows or off-peak times - e.g. late evening/weekend - but then I do not know the nature of your business.
A better option, if you have the time, would be to partition your big tables. Switching out partitions only takes a few seconds regardless of the number of rows. Partitioned Views are a further option if you don't have the luxury of Enterprise Edition. It would appear your data lends itself to a classic 'sliding window' scenario.
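To make the sliding-window idea concrete, a rough sketch (all names are placeholders, it assumes the clustered index is on the date column, and in practice you'd have one boundary per month covering the full 18-month window):

-- Monthly range partitioning on the date column.
CREATE PARTITION FUNCTION pfPurgeMonthly (datetime)
    AS RANGE RIGHT FOR VALUES ('2009-01-01', '2009-02-01', '2009-03-01');

CREATE PARTITION SCHEME psPurgeMonthly
    AS PARTITION pfPurgeMonthly ALL TO ([PRIMARY]);

-- The big table (and its clustered index) would be created on psPurgeMonthly.
-- Purging the oldest month is then a metadata-only switch into an empty
-- staging table of identical structure on the same filegroup, followed by
-- removing the now-empty boundary.
ALTER TABLE dbo.TransactionLog SWITCH PARTITION 1 TO dbo.TransactionLog_Staging;
ALTER PARTITION FUNCTION pfPurgeMonthly() MERGE RANGE ('2009-01-01');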
thanks
SQL_EXPAT
March 3, 2009 at 10:44 am
Hey, thanks for the replies.
We have the Enterprise edition - and run an online payment gateway, so we have traffic around the clock and no real "slow" period. The operation either has to be safe to run online with traffic, or we have to do it offline (which means 3-4 times each year).
As we are deleting 18-month-old data, I don't see how this could cause blocking as long as we don't delete such big chunks that we get a table lock? There is no activity on data that old.
Feel free to correct me if I'm mistaken on how this works. 🙂
Partitioning older data may be a good idea to prevent locking, if locking is indeed a possible issue. We looked into partitioning tables a year or two back, but it seemed a bit of a hassle to get it fully automated. I'll look into it some more. As we run mission-critical stuff where stability is the main issue in both application design and operations, we try to Keep It Simple(tm) 🙂
Thorbjorn Kvam,
Project manager and DBA (design) at Payex (http://www.payex.com)
March 3, 2009 at 10:50 am
I was at a client with a similar situation. For their application, constantly searching for data across tables such as this in the OLTP environment resulted in poor performance; they had to delete old data as well. This is where a cursor or a MIN-based WHILE loop can help you. Since you don't dare do a mass delete on it, create a process that fetches only a few rows to delete, start it, and let it go for however long it takes. (Normally I shudder at cursors or procedural-type routines, but this is where they come in really handy.) With a MIN-based WHILE loop, you can vary the number of rows - the "chunk size", so to speak - and delete from @start to @end. From there you can monitor any locks/blocks and increase the number of rows per delete if it finishes in a reasonable time without affecting users, usually 1-3 seconds depending on the number of transactions/sec in the database.
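Something along these lines, for what it's worth (just a sketch - the table, column, and chunk size are placeholders, and it assumes the clustered index is on CreatedDate):

-- Delete rows older than 18 months in fixed-size chunks, walking the
-- clustered index from the oldest row forward.
DECLARE @cutoff datetime, @start datetime, @end datetime, @chunk int;

SET @cutoff = DATEADD(MONTH, -18, GETDATE());
SET @chunk  = 5000;

SELECT @start = MIN(CreatedDate) FROM dbo.TransactionLog;

WHILE @start IS NOT NULL AND @start < @cutoff
BEGIN
    -- Upper bound of this chunk: the @chunk-th oldest qualifying date.
    SELECT @end = MAX(CreatedDate)
    FROM (SELECT TOP (@chunk) CreatedDate
          FROM dbo.TransactionLog
          WHERE CreatedDate < @cutoff
          ORDER BY CreatedDate) AS batch;

    -- Delete one chunk from @start to @end; monitor locks/blocks and adjust
    -- @chunk so each delete stays in the 1-3 second range.
    DELETE FROM dbo.TransactionLog
    WHERE CreatedDate >= @start
      AND CreatedDate <= @end;

    SELECT @start = MIN(CreatedDate) FROM dbo.TransactionLog;
END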
March 3, 2009 at 11:38 am
ab5sr (3/3/2009)
I was at a client with a similar situation. For their application, constantly searching for data across tables such as this in the OLTP environment resulted in poor performance; they had to delete old data as well. This is where a cursor or a MIN-based WHILE loop can help you. Since you don't dare do a mass delete on it, create a process that fetches only a few rows to delete, start it, and let it go for however long it takes. (Normally I shudder at cursors or procedural-type routines, but this is where they come in really handy.) With a MIN-based WHILE loop, you can vary the number of rows - the "chunk size", so to speak - and delete from @start to @end. From there you can monitor any locks/blocks and increase the number of rows per delete if it finishes in a reasonable time without affecting users, usually 1-3 seconds depending on the number of transactions/sec in the database.
Hm, well fair enough - but I'm still not convinced that I will ever get a locking issue as long as I'm only deleting old records (which are physically sorted on disk by date due to the clustered index on a "created" column)? We're talking about removing a few thousand rows from the end of a table that has several million records.
I'm deleting 5000 rows from several tables - this process executes 12 times a day, and each run lasts 15-30 seconds, so a few seconds per table. So my reasoning goes like this: if someone happens to list details or whatever from a transaction that's 18 months old (very unlikely) while the job is running, they may get slowed down by a few seconds.
This is entirely okay - the main focus is transaction throughput for NEW transactions, which shouldn't be affected by any locking from this process (?)
And if there is no locking issue, the only thing I will achieve with the cursor approach is slowing down the process?
Thorbjorn Kvam,
Project manager and DBA (design) at Payex (http://www.payex.com)
March 3, 2009 at 12:09 pm
The cursor approach is to delete only a row or a few rows at a time. In theory you would rarely, if ever, have a problem with this method given the number of rows in your table: it 1) decreases the chance of blocking, 2) minimizes key or index locks (or any other locking), and 3) is gentle on the log. Unfortunately, again, you'll have to run it and let it go much longer. We did it on the weekend - kicked it off Friday night and let it go until early Monday morning.
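A bare-bones version of that, assuming a single-column key (the TransactionID / TransactionLog / CreatedDate names are purely illustrative):

-- Delete expired rows one at a time; each delete holds its locks only
-- briefly and writes very little to the log per statement.
DECLARE @id int;

DECLARE purge_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT TransactionID
    FROM dbo.TransactionLog
    WHERE CreatedDate < DATEADD(MONTH, -18, GETDATE());

OPEN purge_cur;
FETCH NEXT FROM purge_cur INTO @id;

WHILE @@FETCH_STATUS = 0
BEGIN
    DELETE FROM dbo.TransactionLog WHERE TransactionID = @id;
    FETCH NEXT FROM purge_cur INTO @id;
END

CLOSE purge_cur;
DEALLOCATE purge_cur;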
March 3, 2009 at 12:25 pm
It would be better to run the deletes just before you do the daily defrag and statistics update.
If it only takes 30 seconds to delete each batch of 5000, run them one after another until done. If you are concerned about server impact, insert a WAITFOR DELAY in each iteration of the loop to spread the deletes out over a longer period of time. Delete 5000, wait 60 seconds, delete 5000, wait 60 seconds, and so on. Your 12 daily deletes would then only take about 18 minutes.
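Roughly like this (a sketch with placeholder names):

-- Delete in batches of 5000, pausing a minute between batches, until no
-- expired rows remain.
DECLARE @rows int;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (5000)
    FROM dbo.TransactionLog
    WHERE CreatedDate < DATEADD(MONTH, -18, GETDATE());

    SET @rows = @@ROWCOUNT;

    IF @rows > 0
        WAITFOR DELAY '00:01:00';
END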
March 3, 2009 at 12:29 pm
Ab5sr, thanks for your suggestion - I guess it might be worth a try to minimize the potential performance hit. I'll give this a try in our staging environment and see how it looks.
I'm not a big fan of cursors myself, but I know we've used a similar approach when we needed to delete or move data online without affecting the system in the past.
I'm afraid we don't have the luxury of reduced traffic during weekends. 🙂
Thorbjorn Kvam,
Project manager and DBA (design) at Payex (http://www.payex.com)
March 10, 2009 at 4:32 pm
Try running it once a week during a known period of low or no activity!
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
March 10, 2009 at 4:46 pm
[font="Verdana"]No need to use cursors for batches: top works nicely.
I would think that moving to partitioning would be the best approach. It would increase performance accessing your data, and you could just drop the oldest partition once a month.
Yes, it's a bit of fiddling to get it working propertly and fully automated. But I think the gain in performance would be worth it.
[/font]
March 13, 2009 at 12:25 pm
Partitioning older data may be a good idea to prevent locking, if locking is indeed a possible issue. We looked into partitioning tables a year or two back, but it seemed a bit of a hassle to get it fully automated. I'll look into it some more. As we run mission-critical stuff where stability is the main issue in both application design and operations, we try to Keep It Simple(tm) 🙂
Locking should be a minor issue here, if it's an issue at all; the real issues are slower operations on one huge table, unnecessary transaction log growth from the deletes, and server slowdown during huge deletes.
Partitioning is designed precisely for such problems.
Just as Bruce says, it's a bit of fiddling, but it's the proper path to go down.
It's well explained: http://msdn.microsoft.com/en-us/library/ms188730.aspx
Besides, you can archive old partitions and, in case of need, put them back online; if you delete the data, you'll have a lot of work to regain access to it.
March 13, 2009 at 12:40 pm
I have to agree that partitioning is designed for this.
My biggest concern with the size of the deletes being done would be the transaction log. Too much at once might slam that and cause growth problems.
Generally, when I have to figure out what works best with this kind of thing, I just guess at first, and then refine the data with further iterations.
Try deleting 5k rows at a time. See how long it takes to get caught up.
Then try deleting 10k at a time. See what the total time is for that.
Then try 1k at a time.
If 5k is the fastest, 10k is slightly slower, and 1k is too slow, try 6k or 7k. If they're better than 5k, try 8k; if that's slower, then 6k or 7k (whichever was fastest) is probably the way to go. Run it a few times and see if it's still fast the next time you run the process.
That approach has worked for me pretty well. It's based on how the ancient Greeks built mathematics, which seems to have worked out okay for that subject.
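If it helps, a crude way to time one iteration while experimenting with a batch size (placeholder names; I'd run this against a staging copy first):

DECLARE @batch int, @t0 datetime;
SET @batch = 5000;   -- try 1000, 5000, 10000, ... and compare
SET @t0 = GETDATE();

DELETE TOP (@batch)
FROM dbo.TransactionLog
WHERE CreatedDate < DATEADD(MONTH, -18, GETDATE());

-- @@ROWCOUNT here still reflects the DELETE above.
SELECT @batch AS batch_size,
       @@ROWCOUNT AS rows_deleted,
       DATEDIFF(ms, @t0, GETDATE()) AS elapsed_ms;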
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 13, 2009 at 1:17 pm
Thanks for all your thoughts and comments. Currently it's working out okay performance-wise when we delete 5000 rows, and there don't appear to be any locking issues. I assume the biggest issue with this method (as some of you have mentioned) is the increased transaction log size.
I'm very pressed for time at the moment, but we'll absolutely look into the partitioning alternative when we get the time - hopefully I'll find someone else on our team who can look into it for me. 😉
Thorbjorn Kvam,
Project manager and DBA (design) at Payex (http://www.payex.com)