Multiple files in filegroup data distribution

Question

Multiple files in filegroup data distribution

Pam Brisjar

SSChampion

Points: 12094
More actions
June 7, 2011 at 12:00 pm

#251459

Just looking for a sanity check / confirmation of a theory...
Within a DB I have a filegroup with multiple files. For some reason the files were set up with disproportionate initial sizes. I suspect this is due to the fact that the additional files were added some time after the initial DB creation. (Inherited DB that's been in production use for several years.)
It looks like the data is being distributed via %age, so that at this point all the files are at roughly 64% full/used. (They are all set to unrestricted autogrow by MB).
My untested, not really researched thought had always been that the distribution would be evenly split by size (so that each file would get roughly the same amount of data added) but that doesn't appear to be so.
This means that for more even distribution amongst the files I would need to have the files of the same size. Yes? So if I bump up the small files (and decrease the large file) so that they're the same size, would the distribution even out as data is added or would it stay skewed?
Experienced opinions appreciated. I'd love to run some tests / proofs of concept but don't really have the resources to do so.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

eccentricDBA SSCarpal Tunnel Points: 4844 More actions · Answer 1

I believe this is what you are looking for.

"SQL Server 2000 files can grow automatically from their originally specified size. When you define a file, you can specify a growth increment. Each time the file fills, it increases its size by the growth increment. If there are multiple files in a filegroup, they do not autogrow until all the files are full. Growth then occurs using a round-robin algorithm."

source: http://msdn.microsoft.com/en-us/library/aa174545(SQL.80).aspx

Pam Brisjar SSChampion Points: 12094 More actions · Answer 2

Nope.

As a minor point, I'm in SQL Server 2005, not 2000. (Though I'm fairly certain this behavior remains the same.)

But that's also not the question I am asking. What I am wondering about is how SQL Server decides to split the actual data stored amongst the various files in a filegroup.

ricer Right there with Babe Points: 722 More actions · Answer 3

SQL 2005 BOL doesn't specify but this is from SQL 2000 so I believe is has not changed.

Filegroups use a proportional fill strategy across all the files within

each filegroup. As data is written to the filegroup, the SQL Server Database

Engine writes an amount proportional to the free space in the file to each

file within the filegroup, instead of writing all the data to the first file

until full. It then writes to the next file. For example, if file f1 has 100

MB free and file f2 has 200 MB free, one extent is allocated from file f1,

two extents from file f2, and so on. In this way, both files become full at

about the same time, and simple striping is achieved"

-------------------------------------------------
Will C,
MCITP 2008 Database Admin, Developer

Steve Jones - SSC Editor SSC Guru Points: 736173 More actions · Answer 4

SQL has, and still does, use proportional fill, meaning that it attempts to distribute data so that the total data in the files is equal, not the data you add. So if you add new data, more of it goes to the file(s) with the least amount of data in them.

eccentricDBA SSCarpal Tunnel Points: 4844 More actions · Answer 5

Ok...I think this is a little closer to the answer you are looking for.

Q1) I have a filegroup with two files. I add a third file, use some of the space, and then do a rebuild of the index that takes up most of the space in the original two files. Why doesn't the newly rebuilt index get spread evenly across the three files? I.e. why doesn't SQL Server rebalance the data across the files?
A1) There are two things to consider here. The first is the way that SQL Server allocates space from multiple files in a filegroup. It uses a mechanism called proportional-fill that will allocates space from files in round-robin fashion, but weights the allocations towards files that are larger and have more free space. In the example above, the space in the newly-added third file will be used before the first and second files are grown to add more space. The second thing to consider is that the process of rebuilding an index requires building a new copy of the index before dropping the old on - so in the example above, the existing allocated space can't be reused until after the index rebuild operation completes.
The concept of adding a file and having SQL Server rebalance the data across the files doesn't exist. It was something I proposed during SQL Server 2005 development but we (seriously) didn't have time to do it. The solution I recommend is to create a new filegroup with as many files as you need, rebuild the index into the new filegroup using the CREATE INDEX WITH DROP_EXISTING command, and then drop the old filegroup.

source: http://www.sqlskills.com/BLOGS/PAUL/post/Conference-Questions-Pot-Pourri-1-Indexes-stats-corruption-and-Enterprise-only-features.aspx

Pam Brisjar SSChampion Points: 12094 More actions · Answer 6

Steve,

This is the explanation I've always heard and understood to mean that the file size / space used would eventually even out, so that the smaller files with less data would get the bulk of the new data until they grew to meet the size of the larger files. That's not what is happening, though. "Proportional fill" appears to mean that they get new data in proportion to their current size / space used.

Pam Brisjar SSChampion Points: 12094 More actions · Answer 7

eccentricDBA:

This makes sense and fits with what I am seeing. Solution is bad, though (and what I'm trying to avoid) as the files are very large and getting down time is near impossible.

Steve Jones - SSC Editor SSC Guru Points: 736173 More actions · Answer 8

I think I've phrased it wrong. The weighting has to do with free space, so if you have different size files (I've never done this), it can result in some weird balancing out. It's designed to get an even amount of free space in all files.

ricer Right there with Babe Points: 722 More actions · Answer 9

Proportional fill has to do with current free disk space remaining on each file. So if file1 is 10 GB in size but only 1 GB of free space remaining and file2 is 4GB in size and 2 GB of free space remaining, the SQL Server will try to use up double the space on file2 compared to file1 to fill up both files at about the same time.

-------------------------------------------------
Will C,
MCITP 2008 Database Admin, Developer