Database size estimation

  • Hi All,

    Would you please direct me to a good link for Database/Table size estimation?

    Regards,

    Shaiju CK

    _____________________________________________
    One ounce of practice is more important than tonnes of dreams

  • Estimating the Size of a Database

    http://msdn.microsoft.com/en-us/library/ms187445.aspx

    Estimating the Size of a Table

    http://msdn.microsoft.com/en-us/library/ms175991.aspx

    Thanks
    Parthi

  • boy that MS document is really scarce on details and guidelines... "calculate it yourself based on details" is my take on it.

    I can't say whether this is right or not, but for me, I like to assume 6x the current database size for regular operations, and when i backup space is included, I estimate 23x the amount of a current database size, based on our shops business model.

    A lot of this is just WAG's and guestimates, but I haven't run out of disk space so far.

    I'm trying to take into account the following items:

    Space for Growth.

    my rule of [currentsize] * [Planned years of Support]

    For example, someone comes to me with an existing database, say it's 2 gig in size.

    they tell me the database is three years old.

    I fudge factor that and assume that the data was all created in the last year, so that's my assumed yearly growth margin.

    so assuming i will maintain the database for 5 years, that 2 gig [CurrentDatabaseSize] x 5 years = 10 gig.

    Space for Indexing and maintenance.

    my rule of thumb is 3 * [value from above] so that i have space to add additional indexes, as well as space for reindexing operations.

    when you need to reindex, you need 1.5x times as much space as the database actually occupies, right? so in my scenario, a current 2 gig database needs 4.5 gigs to reindex. the differentce on the space is for future indexes. so that's an additional 30 gig for my example, to accommodate whatever may happen up to 5 years down the road.

    Copy for Data Recovery.

    I need to be able to restore one or maybe two copies of the database at a minimum. so i need to be able to restore a 10 gig database once or twice on the same drive.

    Space for Backups.

    This is the space hog; I need to keep full backups, and be able to restore at some point in the history in case data gets dropped or updated wrong.

    Transactional backups might be factored in differently but here I assume you are only doing lazy full backups once a day. You are going to keep 7 days of full backups, and then one weekly/monthly, with a history of 6 months or so. that adds up to be 7 [daily backups] plus 4 [weekly backups] plus 6 [monthly backups] = 17x the current database size.

    Note I'm ignoring compressing or zipping backups for now, let alone archiving them off to tape or another drive or SAN. So assuming my maxed out 10 gig database, x 17 full backups = 170 more gig;

    add the 6x for db and 17x for backups, and I'm at 460 gig to provide for a db that's 2 gig today. I'd say that's way more space than i really need, but disk space is cheap nowadays.

    I'd like to hear how anyone else may calculating growth projections.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Hi Parthi / Lowell, Thank you very much for your replies.

    Space for Growth.

    my rule of [currentsize] * [Planned years of Support]

    For example, someone comes to me with an existing database, say it's 2 gig in size.

    they tell me the database is three years old.

    I fudge factor that and assume that the data was all created in the last year, so that's my assumed yearly growth margin.

    so assuming i will maintain the database for 5 years, that 2 gig [CurrentDatabaseSize] x 5 years = 10 gig

    Lowell, I have a doubt in the calculation in Space for Growth part. The db has accumulated 2gig within 3 years. But in formula it is

    2 gig [CurrentDatabaseSize] x 5 years = 10 gig

    . Are you assuming db will grow 2gig every year? Or whether the formula should be ([CurrentDatabaseSize] / [3 years]) x 5 years or not? Do we need to find the one year db size from the 3 years old 2gig db?

    _____________________________________________
    One ounce of practice is more important than tonnes of dreams

  • Lowell (4/11/2011)


    add the 6x for db and 17x for backups, and I'm at 460 gig to provide for a db that's 2 gig today. I'd say that's way more space than i really need, but disk space is cheap nowadays.

    I'd like to hear how anyone else may calculating growth projections.

    Holycow ...

    Disk space is cheap if you go for crappy one, here its around 24€/GB/Year for Tier1 data/log space (3x cheaper for backup).

    When someone comes with a new DB we plan based on the expected growth (rounded up a bit depending on who is asking), 50% of data space for log and 50% of data space for backup.

    So if someone comes with a 2GB DB with no special growth, we would plan for 3GB Data, 1.5GB log and 1.5GB backup.

    Each server has a "natural" disk growth of about 15% a year, if the expected growth of the DB is higher then we plan for that.

    We only plan for re-index space on a case by case basis if its a special case or if the DB is 100GB+.

    For the other the natural free space on the disk will be enough.

    For temporary restore space we either have the backup drive or the other environments.

    All disks and DB files are monitored all the time, so if we are getting close to a disk full or if a file is growing too fast we plan the growth/cleanup/migration with the owner.

    From experience it's pointless to plan several years ahead since most DB owner don't event know how big their DB will be after 1 year.

  • heck i'm throwing all sorts of numbers out there; i think i made my numbers extra huge to kick up some dirt so people would say "no! that's too much"

    ok call me a troll on this one....

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (4/12/2011)


    heck i'm throwing all sorts of numbers out there; i think i made my numbers extra huge to kick up some dirt so people would say "no! that's too much"

    ok call me a troll on this one....

    Ok so what do you actually use?

  • All this depends on how huge your data is going to come every day and how you are going to maintain.Lowell has given a approximate or near by value.Keeping a week's value we can able to say accurate value some what rough values can be given thats what Lowell has also given.No one can guess how much data will come in next 1 month or 3 month and so,it may be more or it may be less,its all dependent on the data going to come.We must be good enough to have data's stored so that we will not loose any data.No body can give you the accurate value since it is not static.

    Thanks
    Parthi

  • Talking about good enough. Guessing is 1 thing, traking it is another. I know this script has been made before, but I just build this this week for me.

    Tracks db growth via backup size history and also against free space if data file and HD. So if you stop cleaning the history in msdb then you can have a fairly accurate guess or growth (now withstanding one off imports / merges).

    --Logs what I consider errors

    USE [DBA]

    GO

    /****** Objet : Table [dbo].[RPT_Space_Warnings] Date de génération du script : 04/12/2011 11:04:34 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[RPT_Space_Warnings](

    [DateWarning] [datetime] NOT NULL CONSTRAINT [DF_RPT_Space_Warnings_DateWarning] DEFAULT (dateadd(day,(0),datediff(day,(0),getdate()))),

    [Warning_Description] [varchar](50) NOT NULL,

    [DaysUntillDBGrowth] [int] NULL,

    [database_name] [nvarchar](128) NOT NULL,

    [DriveLetter] [char](1) NULL,

    [FreeGBs_Drive] [decimal](18, 2) NULL,

    [FreeGBs_Drive_InMonths_WithExpected_DB_Growth] [decimal](18, 1) NULL,

    [BackupDate_First] [datetime] NULL,

    [BackupDate_Last] [datetime] NULL,

    [DaysPeriod] [int] NULL,

    [MinutesForBackup_First] [decimal](18, 1) NULL,

    [MinutesForBackup_Last] [decimal](18, 1) NULL,

    [MinutesForBackup_Delta] [decimal](19, 1) NULL,

    [GB_backup_size_First] [decimal](18, 3) NULL,

    [GB_backup_size_Last] [decimal](18, 3) NULL,

    [GB_BackupGrowth] [decimal](19, 3) NULL,

    [GB_ExpectedDailyGrowth] [decimal](18, 3) NULL,

    [GB_ExpectedAnnualGrowth] [decimal](18, 3) NULL,

    [Total Size in GB] [decimal](18, 3) NULL,

    [Available Space In GB] [decimal](18, 3) NULL,

    CONSTRAINT [PK_RPT_Space_Warnings] PRIMARY KEY CLUSTERED

    (

    [DateWarning] ASC,

    [Warning_Description] ASC,

    [database_name] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [Data Filegroup 1]

    ) ON [Data Filegroup 1]

    GO

    SET ANSI_PADDING OFF

    --Script in job

    IF OBJECT_ID('tempdb..#dbs') > 0

    DROP TABLE #dbs

    IF OBJECT_ID('tempdb..#Drives') > 0

    DROP TABLE #Drives

    IF OBJECT_ID('tempdb..#Results') > 0

    DROP TABLE #Results

    CREATE TABLE #dbs

    (

    DBNAME sysname

    , DBID INT

    , [Total Size in MB] INT

    , [Available Space In MB] INT

    , DriveLetter CHAR(1)

    )

    INSERT INTO

    #dbs

    (

    DBNAME

    , DBID

    , [Total Size in MB]

    , [Available Space In MB]

    , DriveLetter

    )

    EXEC sp_MSforeachdb '

    USE [?];

    SELECT

    DB_NAME() As DBNAME

    , DB_ID() AS DBID

    , SUM(size / 128) AS ''Total Size in MB''

    , SUM(size / 128 - CAST(FILEPROPERTY(name , ''SpaceUsed'') AS int) / 128) AS ''Available Space In MB''

    , LEFT(physical_name, 1) AS DriveLetter

    FROM

    [?].sys.database_files

    WHERE

    type_desc = ''ROWS''

    GROUP BY LEFT(physical_name, 1) '

    CREATE TABLE #Drives

    (

    DriverLetter CHAR(1) PRIMARY KEY CLUSTERED

    , FreeMBs INT NOT NULL

    , FreeGBs AS CONVERT(DECIMAL(18 , 2) , FreeMBs / 1024.0)

    )

    INSERT INTO

    #Drives ( DriverLetter , FreeMBs )

    EXEC xp_fixeddrives

    --

    --SELECT

    -- DB_NAME() As DBNAME

    -- , DB_ID() AS DBID

    -- , SUM(size / 128) AS 'Total Size in MB'

    -- , SUM(size / 128 - CAST(FILEPROPERTY(name , 'SpaceUsed') AS int) / 128) AS 'Available Space In MB'

    --FROM

    -- sys.database_files

    --WHERE

    -- type_desc = 'ROWS'

    --Rémi : I deleted 4 logging tables I had build on March 25th, hence the ±350 MB drop.

    ;

    WITH CTE_Backups ( database_name, BackupDate, MinutesForBackup, GB_backup_size, seqFirst, seqLast )

    AS (

    SELECT

    bs.database_name

    , DATEADD(D , 0 , DATEDIFF(D , 0 , bs.backup_start_date)) AS BackupDate

    , CONVERT(DECIMAL(18 , 1) , DATEDIFF(s , bs.backup_start_date ,

    bs.backup_finish_date)

    / 60.0) AS MinutesForBackup

    , CONVERT(DECIMAL(18 , 3) , bs.backup_size / 1024 / 1024 / 1024) AS GB_backup_size

    , ROW_NUMBER() OVER ( PARTITION BY bs.database_name ORDER BY bs.backup_start_date ) AS seqFirst

    , ROW_NUMBER() OVER ( PARTITION BY bs.database_name ORDER BY bs.backup_start_date DESC ) AS seqLast

    FROM

    msdb.dbo.backupset bs

    WHERE

    name IS NULL

    AND bs.[type] = 'D'

    )

    SELECT

    CONVERT(INT , dtBackups.[Available Space In GB]

    / CASE WHEN dtBackups.GB_ExpectedDailyGrowth <> 0

    THEN dtBackups.GB_ExpectedDailyGrowth

    ELSE 0.0001

    END) AS DaysUntillDBGrowth

    , *

    INTO

    #Results

    FROM

    (

    SELECT

    a.database_name

    , dbs.DriveLetter

    , drv.FreeGBs AS FreeGBs_Drive

    , CONVERT(DECIMAL(18 , 1) , ( drv.FreeGBs * 0.85 )

    / CONVERT(DECIMAL(18 , 3) , ( b.GB_backup_size - a.GB_backup_size )

    / DATEDIFF(dd , a.BackupDate , b.BackupDate) * 30.468)) AS FreeGBs_Drive_InMonths_WithExpected_DB_Growth

    , a.BackupDate AS BackupDate_First

    , b.BackupDate AS BackupDate_Last

    , DATEDIFF(dd , a.BackupDate , b.BackupDate) AS DaysPeriod

    , a.MinutesForBackup AS MinutesForBackup_First

    , b.MinutesForBackup AS MinutesForBackup_Last

    , b.MinutesForBackup - a.MinutesForBackup AS MinutesForBackup_Delta

    , a.GB_backup_size AS GB_backup_size_First

    , b.GB_backup_size AS GB_backup_size_Last

    , b.GB_backup_size - a.GB_backup_size AS GB_BackupGrowth

    --, a.seqLast - a.seqFirst AS QtyofBackups

    , CONVERT(DECIMAL(18 , 3) , ( b.GB_backup_size - a.GB_backup_size )

    / DATEDIFF(dd , a.BackupDate , b.BackupDate)) AS GB_ExpectedDailyGrowth

    , CONVERT(DECIMAL(18 , 3) , ( b.GB_backup_size - a.GB_backup_size )

    / DATEDIFF(dd , a.BackupDate , b.BackupDate) * 365.256) AS GB_ExpectedAnnualGrowth

    , CONVERT(DECIMAL(18 , 3) , dbs.[Total Size in MB] / 1024.0) AS [Total Size in GB]

    , CONVERT(DECIMAL(18 , 3) , dbs.[Available Space In MB] / 1024.0) AS [Available Space In GB]

    FROM

    CTE_Backups a

    INNER JOIN CTE_Backups b

    ON a.seqFirst = b.seqLast

    AND a.seqLast = b.seqFirst

    AND a.database_name = b.database_name

    INNER JOIN #dbs dbs

    ON b.database_name = dbs.DBNAME

    INNER JOIN #Drives drv

    ON dbs.DriveLetter = drv.DriverLetter

    WHERE

    a.seqFirst = 1

    ) dtBackups

    ORDER BY

    database_name

    IF EXISTS ( SELECT

    *

    FROM

    #Results R

    WHERE

    R.FreeGBs_Drive_InMonths_WithExpected_DB_Growth < 1

    OR R.DaysUntillDBGrowth < 30 )

    BEGIN

    INSERT INTO

    dbo.RPT_Space_Warnings

    (

    [DaysUntillDBGrowth]

    , [Warning_Description]

    , [database_name]

    , [DriveLetter]

    , [FreeGBs_Drive]

    , [FreeGBs_Drive_InMonths_WithExpected_DB_Growth]

    , [BackupDate_First]

    , [BackupDate_Last]

    , [DaysPeriod]

    , [MinutesForBackup_First]

    , [MinutesForBackup_Last]

    , [MinutesForBackup_Delta]

    , [GB_backup_size_First]

    , [GB_backup_size_Last]

    , [GB_BackupGrowth]

    , [GB_ExpectedDailyGrowth]

    , [GB_ExpectedAnnualGrowth]

    , [Total Size in GB]

    , [Available Space In GB]

    )

    SELECT

    [DaysUntillDBGrowth]

    , CASE WHEN R.FreeGBs_Drive_InMonths_WithExpected_DB_Growth < 1 THEN 'HD IS FULL' ELSE 'AUTOGROWTH WARNING' END AS Warning_Description

    , [database_name]

    , [DriveLetter]

    , [FreeGBs_Drive]

    , [FreeGBs_Drive_InMonths_WithExpected_DB_Growth]

    , [BackupDate_First]

    , [BackupDate_Last]

    , [DaysPeriod]

    , [MinutesForBackup_First]

    , [MinutesForBackup_Last]

    , [MinutesForBackup_Delta]

    , [GB_backup_size_First]

    , [GB_backup_size_Last]

    , [GB_BackupGrowth]

    , [GB_ExpectedDailyGrowth]

    , [GB_ExpectedAnnualGrowth]

    , [Total Size in GB]

    , [Available Space In GB]

    FROM

    #Results R

    WHERE

    R.DaysUntillDBGrowth < 30

    UNION ALL -- I want to see 2 warnings in the same day when it's the case... those are the really critical events.

    SELECT

    [DaysUntillDBGrowth]

    , CASE WHEN R.FreeGBs_Drive_InMonths_WithExpected_DB_Growth < 1 THEN 'HD IS FULL' ELSE 'AUTOGROWTH WARNING' END AS Warning_Description

    , [database_name]

    , [DriveLetter]

    , [FreeGBs_Drive]

    , [FreeGBs_Drive_InMonths_WithExpected_DB_Growth]

    , [BackupDate_First]

    , [BackupDate_Last]

    , [DaysPeriod]

    , [MinutesForBackup_First]

    , [MinutesForBackup_Last]

    , [MinutesForBackup_Delta]

    , [GB_backup_size_First]

    , [GB_backup_size_Last]

    , [GB_BackupGrowth]

    , [GB_ExpectedDailyGrowth]

    , [GB_ExpectedAnnualGrowth]

    , [Total Size in GB]

    , [Available Space In GB]

    FROM

    #Results R

    WHERE

    R.FreeGBs_Drive_InMonths_WithExpected_DB_Growth < 1

    END

    IF OBJECT_ID('tempdb..#dbs') > 0

    DROP TABLE #dbs

    IF OBJECT_ID('tempdb..#Drives') > 0

    DROP TABLE #Drives

    IF OBJECT_ID('tempdb..#Results') > 0

    DROP TABLE #Results

    --In step 2 of job... I use perm table because I want to e-mail the results. Tho it might look better with a SSRS report.

    IF EXISTS ( SELECT

    *

    FROM

    dbo.RPT_Space_Warnings

    WHERE

    DateWarning = DATEADD(D , 0 , DATEDIFF(D , 0 , GETDATE())) )

    BEGIN

    EXEC msdb.dbo.sp_send_dbmail @profile_name = 'JOB SQL' ,

    @recipients = 'it@it.it' ,

    @query = 'SET NOCOUNT ON; SELECT * FROM [DBA].dbo.RPT_Space_Warnings WHERE DateWarning = DATEADD(D , 0 , DATEDIFF(D , 0 , GETDATE()))' ,

    @subject = 'AVERTISSMENT FORDIASQL' ,

    @attach_query_result_as_file = 1 ,

    @body = 'Le fichier data de la db ou le disque dur est en train de manquer de place.

    Voir les détails de la requête pour plus d''infos.

    Ou rouler cette requêre : SELECT * FROM [DBA].dbo.RPT_Space_Warnings WHERE DateWarning = DATEADD(D , 0 , DATEDIFF(D , 0 , GETDATE()))' ,

    @importance = 'HIGH' , @query_result_separator = '|'

    END

  • I tend to do what the ninja does. Track database full backup size history. That will tell you what the growth is and what your data size, at least close enough for estimation. Comparing this to db size, as a percentage, also allows you to manage your data file sizes.

    All the other stuff about table size is a little silly to me. Transaction rates are hard to track, and not worth it. You're not trying to manage bytes here. Typically you manage MB, or preferably, GB of space.

  • Just by curiosity,

    why do you track backup size instead of data file size?

  • File size includes free space in the DB. Backup size does not.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (4/13/2011)


    File size includes free space in the DB. Backup size does not.

    It's trivial to get the file size without free space.

    You do need the free space to plan for growth anyway, so you might as well get it all at the same time.

  • Oliiii (4/13/2011)


    GilaMonster (4/13/2011)


    File size includes free space in the DB. Backup size does not.

    It's trivial to get the file size without free space.

    You do need the free space to plan for growth anyway, so you might as well get it all at the same time.

    Totally agree. Did ya run the script or at least read the columns name before telling us what I don't do :-D?

    There's a reason why I send out 2 warnings, 1 for autogrowth imminent and 1 for hd running out of space.

  • Not telling anyone what to do 🙂

    Just wondering why some folks used backup size, maybe the file size is harder to get in SQL 2000 or some other reason I'm not aware of.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply