Capacity Planning: Figuring Out How Long Until You Need More Space
The vast majority of databases out there are growing, not shrinking. Some DBAs have generous companies that provide all of the storage they could ever use. The rest of us have to be prepared to run out of space.
Why do we care? Do we provision the storage ourselves? Not usually, but if we ignore growth, we could be running to the CIO with an emergency out-of-space situation, rather than giving the CIO a six-month (or, better, one-year) space usage warning. The former could put you on the unemployment line; the latter could put you behind the wheel of that new Tesla.
In order to know when a given database may run out of space, you need to know a few facts. For each filegroup of the database, you need these values:
- The present size of its files.
- The amount of free space within its files.
- The growth increment of each file, and its maximum file size (if any).
- The free space on disk to grow those files.
- How rapidly each filegroup grows, best expressed in megabytes per day.
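Put together, these values feed a simple days-until-full estimate. Here is a minimal sketch of the arithmetic in T-SQL, with hypothetical values plugged in; capturing the real numbers is covered below:

-- Hypothetical values for one filegroup; substitute your captured metrics.
DECLARE @FileFreeMB       float = 1200;  -- free space inside the files
DECLARE @GrowthMB         float = 1000;  -- fixed growth increment
DECLARE @GrowthsRemaining int   = 482;   -- growths possible before disk or max size is hit
DECLARE @GrowthRateMBDay  float = 250;   -- observed growth in MB per day

SELECT (@FileFreeMB + @GrowthsRemaining * @GrowthMB) / @GrowthRateMBDay AS DaysUntilFull;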
Capturing Usage Metrics
It is quite easy to capture database usage metrics with some tools. At my shop, we have an excellent monitoring tool with a statistics repository. I’m able to get the data I need by querying that repository. With dozens of live SQL instances and many terabytes of data, this query saves a great deal of time. I need not name what tool I use; suffice it to say that most of the major monitoring tool vendors capture the necessary data. That makes it quite easy since the data are already captured, and one can assemble a table and capture daily size information. For those who don’t have a monitoring tool (please let me know where to send the sympathy card), it is still possible for the DBA to capture file usage stats in a SQL job.
First of all, you need to have a database and table where you can collect the information you need. Here is a cleaned-up version of my table’s CREATE:
CREATE TABLE [dbo].[DBGrowthHistory](
    [TimeCollected]      [datetime]      NOT NULL,
    [Monitored_object]   [nvarchar](100) NOT NULL,
    [DBName]             [nvarchar](100) NOT NULL,
    [FGName]             [nvarchar](100) NOT NULL,
    [FileLogicalName]    [nvarchar](100) NOT NULL,
    [totalmb]            [float]         NOT NULL,
    [freemb]             [float]         NOT NULL,
    [growthsremaining]   [int]           NOT NULL,
    [DBGrowthHistory_ID] [bigint] IDENTITY(1,1) NOT NULL,
    CONSTRAINT [PK_DBGrowthHistory] PRIMARY KEY CLUSTERED
    (
        [DBGrowthHistory_ID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
) ON [PRIMARY];
Our monitoring tool captures the statistics for each filegroup and file name in the database, every single day. We just extract that information from the tool and populate the table. For those of us without that luxury, I have devised a query to capture these statistics for every single database on a server. It is based on a query I found online from Satya Jayanty, to whom I am thankful. Let's have a look at what I was able to muster:
SELECT CONVERT(datetime, CONVERT(date, GETDATE())) AS [TimeCollected]
      ,@@SERVERNAME AS [monitored_object]
      ,DB_NAME() AS [DBName]
      ,b.groupname AS [fgName]
      ,a.Name AS [FileLogicalName]
      ,a.[Filename] AS [OSFileName]
      ,CONVERT(decimal(15,2), ROUND(a.Size / 128.000, 2)) AS [totalMB]
      ,CONVERT(decimal(15,2), ROUND((a.Size - FILEPROPERTY(a.Name, 'SpaceUsed')) / 128.000, 2)) AS [freeMB]
FROM dbo.sysfiles a (NOLOCK)
JOIN sysfilegroups b (NOLOCK) ON a.groupid = b.groupid
ORDER BY b.groupname;
The above query runs in the database for which you desire to collect statistics. You can easily prepend it with an INSERT and wrap it in sp_msforeachdb in order to sweep the stats for every database on a SQL Server instance into the table, as sketched below.
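Here is a minimal sketch of that approach. It assumes the DBGrowthHistory table above lives in a DBA utility database I am calling DBAUtility (a hypothetical name); note that sp_MSforeachdb is undocumented, so test it before trusting it:

-- Sweep file stats for every database into the history table.
EXEC sp_MSforeachdb '
USE [?];
INSERT INTO DBAUtility.dbo.DBGrowthHistory
    (TimeCollected, Monitored_object, DBName, FGName, FileLogicalName, totalmb, freemb, growthsremaining)
SELECT CONVERT(datetime, CONVERT(date, GETDATE()))
      ,@@SERVERNAME
      ,DB_NAME()
      ,b.groupname
      ,a.Name
      ,CONVERT(decimal(15,2), ROUND(a.Size / 128.000, 2))
      ,CONVERT(decimal(15,2), ROUND((a.Size - FILEPROPERTY(a.Name, ''SpaceUsed'')) / 128.000, 2))
      ,0  -- growthsremaining is computed later, once disk free space is known
FROM dbo.sysfiles a (NOLOCK)
JOIN sysfilegroups b (NOLOCK) ON a.groupid = b.groupid;';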
Remaining Growth
So we have file sizes and free space in those files. What we don't yet have is remaining growth: the number of times a database file can grow before it (a) fills up the disk, or (b) reaches the maximum size set for the file. This metric is vital; without it, an accurate days-remaining prediction is impossible.
Example: Your database has a file A on disk E. The file is 30000MB with 1MB free, and grows by 1000MB (Never use percentage growth factors! See below.), with no limit. The disk is 500GB, which is 512000MB. The only file on that disk is your database file. Do you have 1000MB to go before you fill up, or do you have 482000MB? The latter number is closer to reality: the disk has 512000 - 30000 = 482000MB left for the file to grow into. Put the emergency defibrillator back into its wall kit and relax - or maybe not.
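To make the arithmetic concrete, here is a minimal sketch of the growths-remaining calculation for that example (all values hypothetical):

DECLARE @DiskFreeMB float = 512000 - 30000; -- 482000MB free on the volume
DECLARE @GrowthMB   float = 1000;           -- fixed growth increment
DECLARE @FileFreeMB float = 1;              -- free space inside the file

SELECT FLOOR(@DiskFreeMB / @GrowthMB) AS GrowthsRemaining                      -- 482
      ,FLOOR(@DiskFreeMB / @GrowthMB) * @GrowthMB + @FileFreeMB AS HeadroomMB; -- 482001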
Multiple files on a drive: The chances that you have a 500GB volume with just one 30000MB (29.29GB) file are, in a normal production environment, more remote than winning both the PowerBall and Mega Millions jackpots in the same week. You are likely to have multiple database files on that drive. In that event, you have to look at two options:
- Using a calculation to split the space between the files; or
- Setting a maximum file size for each database file.
I prefer the second option. Some people are allergic to setting hard maximums, but with that setting you can get actionable metrics for growth life. Moreover, since you would be assigning the max file size precisely so that you can confidently monitor growth and predict days-till-full, you would be less likely to miss an impending-doom scenario - if you report the metric and pay attention to it. Setting the cap is a one-line change, sketched below.
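A minimal sketch of the second option, with hypothetical database and file names; the MAXSIZE figure is an example, not a recommendation:

ALTER DATABASE [SalesDB]              -- hypothetical database
MODIFY FILE (NAME = N'SalesDB_Data',  -- logical file name
             MAXSIZE = 100000MB);     -- hard cap on the file's growth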
Mixed files on a drive: This is the real problem scenario. If the drive is used for non-SQL items (e.g., backups, or [ugh] file shares), the availability of space is unpredictable. What solution, then? A wild guess? I think not. This one is a tough pill to swallow, but I would grow the file to the planned max file size now, and then set the max file size so that it cannot grow further (max file size equal to the current file size, or greater by an amount that is less than one growth increment). That way, other space hogs cannot grab the disk out from under you, and your database growth predictions will be all the more accurate.
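A sketch of that pre-grow-and-cap approach, again with hypothetical names and sizes:

-- Grow the file to its planned maximum now, then cap it there,
-- so other consumers of the drive cannot take the space later.
ALTER DATABASE [SalesDB]
MODIFY FILE (NAME = N'SalesDB_Data',
             SIZE = 100000MB,         -- pre-grow to the planned maximum
             MAXSIZE = 100000MB);     -- cap at the same size: no further growth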
How to Capture Available Disk Space
Now, if you don't have a monitoring tool that captures free disk space on all volumes and mount points (a mount point is a separate logical drive or LUN that is mounted as a folder name under an existing drive letter; you see these most often in larger SAN implementations that also involve failover clustering), you will need to grab these metrics yourself. Here is a script that uses PowerShell and xp_cmdshell to retrieve what you need:
sp_configure 'xp_cmdshell', 1
go
reconfigure
go

declare @svrName varchar(255)
declare @sql varchar(400)

-- By default it will take the current server name; we can set the server name as well.
set @svrName = @@SERVERNAME
set @sql = 'powershell.exe -c "Get-WmiObject -ComputerName ' + QUOTENAME(@svrName,'''')
         + ' -Class Win32_Volume -Filter ''DriveType = 3'' | select name,capacity,freespace'
         + ' | foreach{$_.name+''|''+$_.capacity/1048576+''%''+$_.freespace/1048576+''*''}"'

-- Create a temporary table to hold the raw PowerShell output.
CREATE TABLE #output (line varchar(255))

-- Insert drive name, total space, and free space into the temporary table.
insert #output
EXEC xp_cmdshell @sql

-- Retrieve the values in MB from the PowerShell script output.
select @svrName as [ServerName]
      ,convert(datetime, convert(date, getdate())) as TimeCollected
      ,rtrim(ltrim(SUBSTRING(line, 1, CHARINDEX('|', line) - 1))) as drivename
      ,round(cast(rtrim(ltrim(SUBSTRING(line, CHARINDEX('|', line) + 1,
              (CHARINDEX('%', line) - 1) - CHARINDEX('|', line)))) as Float), 0) as 'capacity(MB)'
      ,round(cast(rtrim(ltrim(SUBSTRING(line, CHARINDEX('%', line) + 1,
              (CHARINDEX('*', line) - 1) - CHARINDEX('%', line)))) as Float), 0) as 'freespace(MB)'
from #output
where line like '[A-Z][:]%'
order by drivename

drop table #output
go

sp_configure 'xp_cmdshell', 0
go
reconfigure
go
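Note that the script enables xp_cmdshell only for the duration of the run and turns it back off. As with the file statistics, these results are worth persisting so you can trend free disk space over time. A minimal sketch of a history table you could create yourself (the name DiskSpaceHistory is hypothetical):

CREATE TABLE [dbo].[DiskSpaceHistory](
    [ServerName]    [nvarchar](100) NOT NULL,
    [TimeCollected] [datetime]      NOT NULL,
    [DriveName]     [nvarchar](100) NOT NULL,
    [CapacityMB]    [float]         NOT NULL,
    [FreeSpaceMB]   [float]         NOT NULL
);
-- In the script above, change the final SELECT into an
-- INSERT INTO dbo.DiskSpaceHistory SELECT ... (same column list),
-- placed before the temporary table is dropped.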
Things That Will Mess You Up
Conclusion
----
John F. Tamburo is the Chief Database Administrator for Landauer, Inc., the world's leading authority on radiation measurement, physics and education. John can be found at @SQLBlimp on Twitter. John also blogs at www.sqlblimp.com.