March 22, 2010 at 4:31 am
I have inherited a fairly large data warehouse implemented on SQL Server 2005 Enterprise Edition. The reason I am posting in the SQL 2008 forum is that we are working to upgrade it to 2008 Enterprise.
The database is partitioned and we use one filegroup per month, plus separate filegroups for other purposes - over 200 filegroups in total. The filegroups are created every month with a little script that creates all the new files for the coming month and adds them to the database.
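Roughly, the monthly script does something like this for each new filegroup (the database, filegroup and file names and the sizes below are just made up for illustration, not our real values):

-- sketch of the monthly script, all names and sizes invented
ALTER DATABASE DW ADD FILEGROUP FG_2010_04;

ALTER DATABASE DW
ADD FILE
(
    NAME = DW_2010_04,
    FILENAME = 'E:\Data\DW_2010_04.ndf',
    SIZE = 5GB,
    FILEGROWTH = 512MB
)
TO FILEGROUP FG_2010_04;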
The files are created at a fairly large initial size, intended to be big enough that they never need to grow.
My problem is that there is now a lot of empty space in all the files. Most files still have their original created size - they have never grown at all.
The total size of all files is over 1100 GB. The total amount of allocated data in the files is 820 GB, so I might be able to reclaim about 300 GB.
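For reference, those figures come from a query along these lines (just a sketch of how I measured size vs. used space per file):

-- sketch: total size vs. space actually used, per file (values in MB)
SELECT  name,
        size / 128.0                                     AS size_mb,
        FILEPROPERTY(name, 'SpaceUsed') / 128.0          AS used_mb,
        (size - FILEPROPERTY(name, 'SpaceUsed')) / 128.0 AS free_mb
FROM    sys.database_files;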
I would like to recover this unused disk space by shrinking all the files. My plan is to first shrink the files and then copy them to a new machine running SQL Server 2008. I want to shrink the files both to reclaim disk space and to reduce the amount of data we need to copy over the fairly slow network between the machines.
According to this article: http://www.sqlskills.com/BLOGS/PAUL/post/Why-you-should-not-shrink-your-data-files.aspx, shrinking files causes massive index fragmentation, which I would like to avoid.
What I would really like to do is avoid moving the allocated pages in the files at all and just cut off the unused parts of the files - basically use something like DBCC SHRINKFILE with TRUNCATEONLY.
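In other words, something roughly like this, which only releases the free space at the end of a file without moving any pages (the file name is just an example):

-- sketch only: release trailing free space without moving any pages
DBCC SHRINKFILE (N'DW_2009_01', TRUNCATEONLY);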
However, as far as I understand, TRUNCATEONLY will never shrink a file below the size it had when it was created, so that will not really work.
Does anyone have any suggestions for how to tackle this problem?
/SG
March 22, 2010 at 6:14 am
You could create new partitions in filegroups of appropriate sizes, switch the partitions to the new ones, then drop the old filegroups...
Having said that, can data for previous months ever be updated? If so, is it really worth completely trimming down all of the unused space?
Could you speed up the migration across the network by using a 3rd party backup compression tool, and further reduce downtime by log shipping the changes across after the initial full backup?
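The sequence would be roughly along these lines - a sketch only, the database name and paths are invented, and in practice the backup commands would come from whatever 3rd party compression tool you use on the 2005 side; native BACKUP/RESTORE is shown here just to illustrate the flow:

-- on the 2005 server: full backup (compressed by the 3rd party tool)
BACKUP DATABASE DW TO DISK = N'\\newserver\stage\DW_full.bak';

-- on the 2008 server: restore it, leaving the database ready for log restores
RESTORE DATABASE DW FROM DISK = N'\\newserver\stage\DW_full.bak' WITH NORECOVERY;

-- repeat while the old server stays in use:
BACKUP LOG DW TO DISK = N'\\newserver\stage\DW_log1.trn';
RESTORE LOG DW FROM DISK = N'\\newserver\stage\DW_log1.trn' WITH NORECOVERY;

-- at cutover: take the final log backup, restore it, then bring the database online
RESTORE LOG DW FROM DISK = N'\\newserver\stage\DW_tail.trn' WITH RECOVERY;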
March 23, 2010 at 6:24 am
There is no direct way to do exactly what you want, since TRUNCATEONLY never shrinks the file below its creation size, as you said.
Your best choice depends on the data. One option is to create a new file of the correct size in the existing file group, then transfer the data between files using the EMPTYFILE option. If all goes well, you can then drop the old file. Check fragmentation levels afterward, and rebuild indexes if necessary. I seem to remember that EMPTYFILE tries to avoid fragmentation, but you should test that.
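The mechanics would be roughly as follows (the database, filegroup and file names are invented for the sketch):

-- sketch: add a right-sized file, empty the old one into it, then drop the old file
ALTER DATABASE DW
ADD FILE (NAME = DW_2009_01_new,
          FILENAME = 'E:\Data\DW_2009_01_new.ndf',
          SIZE = 3GB)
TO FILEGROUP FG_2009_01;

-- moves all allocated pages out of the old file into other files in the same filegroup
DBCC SHRINKFILE (N'DW_2009_01', EMPTYFILE);

-- once empty, the old file can be removed
ALTER DATABASE DW REMOVE FILE DW_2009_01;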
If the data in the filegroup is spread across relatively small objects, you could shrink the file and then rebuild the indexes afterward. Depending on the data, you may or may not end up saving much space this way.
Finally, remember that once you upgrade to 2008 Enterprise, you will have the option to use ROW or PAGE level compression. I would be tempted to install a Developer or Trial Edition, and experiment to see what impact this will have on your file group sizes and design.
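For example, something along these lines on a test restore (the table and index names are invented) - you can estimate the savings before rebuilding anything, then rebuild with compression:

-- sketch: estimate PAGE compression savings, then rebuild an index compressed
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'FactSales',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

ALTER INDEX IX_FactSales_Date ON dbo.FactSales
REBUILD WITH (DATA_COMPRESSION = PAGE);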
March 25, 2010 at 5:01 pm
Paul, thank you for the idea of using EMPTYFILE to move all the data to a new file.
My current plan is to use backup/restore to move the data to the new server, enable PAGE compression, create new files with the correct sizes, use EMPTYFILE to move the data to the new files and then delete the old (now empty) files.
It seems (according to my tests) that just enabling PAGE compression will shrink the database by about 70% with our data.
So, after this is finished, it seems I will only need about 30% of 800 GB ≈ 240 GB instead of today's 1100 GB.
Hopefully performance will also be improved because there is less data to read from disk.
Not a bad result for a fairly simple upgrade.
/SG
March 25, 2010 at 10:42 pm
Just thought of a much better option than using EMPTYFILE: use the CREATE INDEX ... WITH (DROP_EXISTING = ON) ... ON {new filegroup} syntax. This will move the data and defragment it at the same time. EMPTYFILE may be bad for fragmentation - I forget the details of how it works internally, but it is probably the same algorithm as the shrink file routines.
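Something along these lines (the names are invented for the sketch) - the rebuild writes the index onto the new filegroup or partition scheme and leaves it defragmented in one pass:

-- sketch: move and rebuild a clustered index in one operation
CREATE CLUSTERED INDEX IX_FactSales_Date
ON dbo.FactSales (SaleDate)
WITH (DROP_EXISTING = ON)
ON NewPartitionScheme (SaleDate);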
Thanks for the feedback though - it is appreciated.
March 26, 2010 at 2:39 am
Hmm. I see your point.
The problem is that if I do this with CREATE INDEX, I will probably have to create new filegroups and new partition schemes that use them, and then all partitions for a specific index would be moved at the same time.
I think this will be more work for me (creating all the scripts), and each CREATE INDEX will probably take a lot of time (there are many millions of rows in the tables).
There are currently about 200 different filegroups in this database and about 50 partitioned tables with about 5 indexes each, so I need a method that is easy to script and as efficient as possible.
I think I will try the EMPTYFILE option first. It seems simpler and I can do it one file at a time.
I will verify that there are no problems with index fragmentation before doing this in production.
If EMPTYFILE creates any fragmentation it should be possible to fix it efficiently with ALTER INDEX REORGANIZE.
Note that immediately before using EMPTYFILE I will have completely rebuilt all indexes with PAGE compression, so the indexes will be nicely defragmented going in.
If I understand correctly, REORGANIZE will only move complete pages around in their compressed form, so it should be perfect for fixing any problems that might be caused by EMPTYFILE.
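For the verification I will check fragmentation with something like this after each EMPTYFILE run, and reorganize anything that needs it (the object names below are just examples, not our real tables):

-- sketch: check post-EMPTYFILE fragmentation, then reorganize where needed
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name                     AS index_name,
       ips.partition_number,
       ips.avg_fragmentation_in_percent
FROM   sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN   sys.indexes AS i
       ON i.object_id = ips.object_id
      AND i.index_id  = ips.index_id
WHERE  ips.avg_fragmentation_in_percent > 10;

ALTER INDEX IX_FactSales_Date ON dbo.FactSales REORGANIZE;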
March 26, 2010 at 3:08 am
Ah, yes of course. It's been a couple of days and I had forgotten the context! Maybe in some future release ALTER INDEX will be expanded (again) to allow individual partitions to be rebuilt and moved at the same time. Looks like EMPTYFILE and REORGANIZE will work best for you, yes.
Stefan_G (3/26/2010)
...If I understand correctly, REORGANIZE will only move complete pages around in their compressed form, so it should be perfect for fixing any problems that might be caused by EMPTYFILE.
Yep. Fully logged though, of course.