Every since the first file was written to a floppy disk drive using DOS, physical file fragmentation has been a problem. Essentially, when the OS writes a file to a disk subsystem, and if contiguous clusters are not available, they are written elsewhere on disk. So when a file is stored on disk in a non-contiguous manner on a disk subsystem, the file is considered to be physically fragmented. Physical file fragmentation can contribute to an additional load on your I/O subsystem and reduce performance because the disks have to work harder to read and write data. This is because the heads on disk drive have to randomly jump around to different physical locations on the disk to find all of the data, instead of reading and writing the data sequentially, as when the data is contiguous.
Physical file fragmentation can be slight, say when a single file is divided into only a handful of locations on a subsystem. In other cases, a single physical file might be divided into thousands of fragments, resulting in heavy physical file fragmentation. The greater the physical file fragmentation, the harder disks have to work, and the greater overhead that is incurred. Ideally, for best I/O performance, files should be written as contiguously as practical.
How physical file fragmentation affects SQL Server’s performance varies. For example, if a single random read or write needs to be performed, the degree of physical file fragmentation will have little or no effect on I/O performance, as the amount of work to perform a single read or write is not significantly affected by physical file fragmentation. On the other hand, if SQL Server needs to read or write a large quantity of sequential data to or from disk, and the file is heavily fragmented, then the disks will have to randomly read or write the data, even though the data is logically sequenced, resulting in extra work for the disk drives, hurting I/O performance.
Another factor that can affect how SQL Server and the disk subsystem interact is if there is a large cache sitting between SQL Server and this subsystem, or if the disk subsystem in made up of SSD or similar drives. Both of these can significantly mask the negative effects of physical file fragmentation, although it is not eliminated.
So what does all of this mean to DBAs? Because of the complexities of measuring the negative impact of physical file fragmentation on SQL Server’s performance, I make the assumption that it is always bad and do my best to prevent it in the first place. As a DBA, I always assume the worse and plan for it.
Here’s what I like to do. Before I create any new database (MDF, NDF, or LDF files), I first use the Windows defrag.exe tool to see if there is any physical file fragmentation on the disk array where I want to create my database files. If there is, then I use defrag.exe, or a third-party tool, to defrag the physical files before creating my new database. Then, I pre-size my database files to a size that I think will suffice for the next year or so. For example, if I estimate that my MDF file will be 500 GB in size in the next 12 months or so, then that is the size I create the database now. The same with the other database files. By pre-sizing my database files, auto growth doesn’t have to kick in, which is one of the biggest causes of disk fragmentation. In other words, by creating a large file on a disk array that is defragmented, the file is created contiguously and I can prevent physical disk fragmentation from occurring in the first place.
Now, you may have two questions. First, you may be wondering about how you go about defragging a disk array that is currently being used with production databases. The problem is, is that the defrag.exe program will only defrag closed files, not open files, such as any SQL Server MDF, NDF, or LDF files that are in production. So the dilemma is, you have identified physical file fragmentation on an existing disk array where you want to create a new database, and you want to defragment it first before creating the new database to eliminate any physical file fragmentation. Assuming that you are using defrag.exe, your only option is to shut down the production databases during a maintenance window, perform the defrag, then restart SQL Server. Yes, this is a pain, but this one time pain will not only remove any existing physical file fragmentation in your current databases, it will help prevent it from future databases. As the DBA, you have to decide if the trade-off in taking the server down for a while, and potential performance gains it can provide, is worth it.
The second question you may have is how do you know what size you should pre-size your database files when you first create them? This requires experience and an educated guess. The odds are that your guess will not be exact. You will either underestimate or overestimate the size of the database 12 months in the future. There is no way to know for sure. But if you make an attempt using your best guess, your guess will be much better than letting auto growth grow the database for you, which will most likely result in physical file fragmentation. As time passes, and you realize that your guess was wrong, you can always manually grow the database as needed to make more room. Or, if you estimated too much, I would leave the extra room (and not shrinking the file sizes), unless you are in the immediate need of additional disk space. But won’t this also lead to physical fragmentation? Yes, to a limited degree. But most likely, a slight amount of physical file fragmentation will never be noticed.
The times where I have seen physical file fragmentation become a real problem is when an initial database is created with the default settings, where the MDB is set to a 1MB auto growth and the LDF is set to the 10% auto growth. If such a database is not pre-sized, but grows to multiple gigabytes over time, physical file fragmentation can become a substantial problem, especially if there are multiple databases, all using the default setting, on the same SQL Server instance.
In other words, a little physical file fragmentation is nothing to worry about. But a large amount of physical file fragmentation, often due to poor database management, can end up hurting your SQL Server’s I/O performance more than you might expect.