I saw a question recently where a user was attempting to rebalance the data in files in a filegroup with DBCC SHRINKFILE(). That felt like a bad idea, but I thought I'd conduct a little experiment and see if the technique would work. This is a short writeup of what happened.
Scenario
The scenario was one that many people might find themselves in. A database has multiple files in a filegroup, but the files are different sizes, with different autogrowths. Over time, one file might be much larger than others. Eventually a DBA looks to clean this up and try to ensure more event growth and distribution of data.
While you can move data among filegroups, we can't control how data moves in files. SQL Server uses a proportional fill algorithm to allocate pages. This should result in files that are more empty receiving more pages than files that are more full.
DBCC SHRINKFILE with the EMPTYFILE parameter will move all data out of a file into other files in the filegroup. Once this is complete, the empty file is essentially marked as read-only, preventing new data from being added. If the DBCC operation is cancelled, all work is retained, and this article is an experiment to see if the flag is set.
Let's Build a Database
I started by creating a small database and then adding some data to it. Here's the initial code. Remember that your paths might be different than mine.
CREATE DATABASE [EmptyFileTest] CONTAINMENT = NONE ON PRIMARY ( NAME = N'EmptyFileTest', FILENAME = N'E:\SQLServer\MSSQL13.SQL2016\MSSQL\Data\EmptyFileTest.mdf' , SIZE = 8192KB , FILEGROWTH = 4096KB ) LOG ON ( NAME = N'EmptyFileTest_log', FILENAME = N'E:\SQLServer\MSSQL13.SQL2016\MSSQL\Data\EmptyFileTest_log.ldf' , SIZE = 8192KB , FILEGROWTH = 65536KB ) GO USE EmptyFileTest GO CREATE TABLE BucketForStuff ( MyID INT IDENTITY(1,1) NOT NULL , MyChar1 VARCHAR(8000) , CONSTRAINT BucketForStuffPK PRIMARY KEY (MyID) )
I wanted to get some sizes of files, and decided to use this short query. This looks at the FILEPROPERTY to get the number of pages and then divides by 128 to get the MB value.
SELECT [Segment Name] = RTRIM(name) , AllocatedMB = CAST(size / 128.0 AS DECIMAL(10, 2)) , UsedMB = CAST(FILEPROPERTY(name, 'SpaceUsed') / 128.0 AS DECIMAL(10, 2)) , AvailableMB = CAST(size / 128.0 - (FILEPROPERTY(name, 'SpaceUsed') / 128.0) AS DECIMAL(10, 2)) , PercentUsed = CAST((CAST(FILEPROPERTY(name, 'SpaceUsed') / 128.0 AS DECIMAL(10, 2)) / CAST(size / 128.0 AS DECIMAL(10, 2)) ) * 100 AS DECIMAL(10, 2)) FROM sysfiles ORDER BY groupid DESC
My inital set of data shows this for my database.
Segment Name | AllocatedMB | UsedMB | AvailableMB | PercentUsed |
EmptyFileTest | 8.00 | 2.65 | 5.44 | 32.00 |
Let's now add some data. I used a simple tally table to create 500 rows in my table and fill out a portion of the file. I'll show the code and then the new sizes. There was an autogrowth here, a single 4MB growth.
WITH myTally(n) AS (SELECT n = ROW_NUMBER() OVER (ORDER BY (SELECT null)) FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) a(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) b(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5)) c(n) ) INSERT dbo.BucketForStuff (MyChar1) SELECT REPLICATE(CHAR((n % 26)+ + 64), 8000) FROM myTally
Segment Name | AllocatedMB | UsedMB | AvailableMB | PercentUsed |
EmptyFileTest | 12.00 | 6.63 | 5.38 | 55.25 |
The next step is to add add a new file. Let's add a new, slightly smaller file so that we can get different sizes. I'll add a 4MB file with a 2MB autogrowth.
ALTER DATABASE EmptyFileTest ADD FILE ( NAME = N'SecondFile' , FILENAME = N'E:\SQLServer\MSSQL13.SQL2016\MSSQL\Data\SecondFile.ndf' , SIZE = 4096KB , FILEGROWTH = 2048KB ) TO FILEGROUP [PRIMARY];
We will add more data, and then once again check the space. I'll use the same query, but add a few items to the tally table so that we add 1000 rows this time.
WITH myTally(n) AS (SELECT n = ROW_NUMBER() OVER (ORDER BY (SELECT null)) FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) a(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) b(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) c(n) ) INSERT dbo.BucketForStuff (MyChar1) SELECT REPLICATE(CHAR((n % 26)+ + 64), 8000) FROM myTally
This has added about 8MB to the database, which is split among two tables. As we can see, the second file is full, and another autogrowth occurred on the first file.
Segment Name | AllocatedMB | UsedMB | AvailableMB | PercentUsed |
EmptyFileTest | 16.00 | 10.88 | 5.13 | 68.00 |
SecondFile | 4.00 | 4.00 | 0.00 | 100.00 |
Let's repeat this one more time, with a third file. This time I'll make this one smaller, so that we have multiple different file sizes.This time we'll add a 2MB file with 1MB autogrow.
ALTER DATABASE EmptyFileTest ADD FILE ( NAME = N'ThirdFile' , FILENAME = N'E:\SQLServer\MSSQL13.SQL2016\MSSQL\Data\ThirdFile.ndf' , SIZE = 2048KB , FILEGROWTH = 1024KB ) TO FILEGROUP [PRIMARY];
The space is now uneven in the files. If I check the sizes, I see this:
Segment Name | AllocatedMB | UsedMB | AvailableMB | PercentUsed |
EmptyFileTest | 16.00 | 10.88 | 5.13 | 68.00 |
SecondFile | 4.00 | 4.00 | 0.00 | 100.00 |
ThirdFile | 2.00 | 0.06 | 1.94 | 3.00 |
I'll add 10,000 rows this time, which should cause some autogrowth to occur.
WITH myTally(n) AS (SELECT n = ROW_NUMBER() OVER (ORDER BY (SELECT null)) FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) a(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) b(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) c(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) d(n) ) INSERT dbo.BucketForStuff (MyChar1) SELECT REPLICATE(CHAR((n % 26)+ + 64), 8000) FROM myTally GO
Once I do this, I'll see that I've mostly filled my database. My first file didn't autogrow, but the second one did. The proportional fill algorithm is slowly trying to balance out the files.
Segment Name | AllocatedMB | UsedMB | AvailableMB | PercentUsed |
EmptyFileTest | 56.00 | 56.00 | 0.00 | 100.00 |
SecondFile | 24.00 | 24.00 | 0.00 | 100.00 |
ThirdFile | 13.00 | 12.94 | 0.06 | 99.54 |
At this point, I've got different sized files, one of which is much larger than the others. Let's rebalance.
Using DBCC SHRINKFILE
I want to run this command now
DBCC SHRINKFILE(EmptyFileTest, EMPTYFILE)
Before I do this, I need to resize the files to be the same size. Ultimately I want to ensure that moving forward we have a more event proportional fill. I can use ALTER DATABASE to reset the sizes and the autogrow settings.
USE [master] GO USE [master] GO ALTER DATABASE [EmptyFileTest] MODIFY FILE ( NAME = N'SecondFile', SIZE = 57344KB , FILEGROWTH = 4096KB ) GO ALTER DATABASE [EmptyFileTest] MODIFY FILE ( NAME = N'ThirdFile', SIZE = 57344KB , FILEGROWTH = 4096KB ) GO USE EmptyFileTest GO
Now I have this space query:
Segment Name | AllocatedMB | UsedMB | AvailableMB | PercentUsed |
EmptyFileTest | 56.00 | 56.00 | 0.00 | 100.00 |
SecondFile | 56.00 | 24.00 | 32.00 | 42.86 |
ThirdFile | 56.00 | 12.94 | 43.06 | 23.11 |
The next step is to run DBCC. I'll run the command above by highlighting the code and clicking "Execute" in SSMS. A second later, I'll click the cancel button that's adjacent to Execute. This should cancel the command. This is a small database, so you don't want to wait too long to cancel.
Once I do that, I see this in my space query. Note, depending on how long to wait to cancel and the speed of your system, you might see different results.
Segment Name | AllocatedMB | UsedMB | AvailableMB | PercentUsed |
EmptyFileTest | 56.00 | 52.06 | 3.94 | 92.94 |
SecondFile | 56.00 | 24.06 | 31.94 | 42.96 |
ThirdFile | 56.00 | 16.88 | 39.13 | 30.14 |
Now the big question. Can I add data to EmptyFileTest? If I had let the DBCC complete, I'd have found the other files with data, and EmptyFileTest mostly clear. However, when I added data, no data would be added to this file. This is to be sure I have time to actually remove the file from the filegroup.
Let's add more data.
WITH myTally(n) AS (SELECT n = ROW_NUMBER() OVER (ORDER BY (SELECT null)) FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) a(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) b(n) CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) c(n) ) INSERT dbo.BucketForStuff (MyChar1) SELECT REPLICATE(CHAR((n % 26)+ + 64), 8000) FROM myTally GO
And check the space.
Segment Name | AllocatedMB | UsedMB | AvailableMB | PercentUsed |
EmptyFileTest | 56.00 | 55.94 | 0.06 | 99.89 |
SecondFile | 56.00 | 24.19 | 31.81 | 43.20 |
ThirdFile | 56.00 | 20.75 | 35.25 | 37.05 |
Data has been added to all files.
Conclusion
The read only flag for a file isn't set until somewhere near the end of the DBCC operation. If you cancel this, not only is data moved, but you can still add data to the file that was specified in the DBCC SHRINKFILE() command.
I wouldn't recommend this as a technique to balance out the data in a file, but I can't deny that it can work. Certainly you don't have control over how quickly the data moves, and if you wait too long, you might end up with an empty file that you need to remove from your database.
Reference
Proportional Fill - https://docs.microsoft.com/en-us/sql/relational-databases/pages-and-extents-architecture-guide#ProportionalFill
DBCC SHRINKFILE - https://docs.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-shrinkfile-transact-sql
FILEPROPERTY() - https://docs.microsoft.com/en-us/sql/t-sql/functions/fileproperty-transact-sql