Estimating table size - problems with MS fromulas

  • Hi,

    Recently I was requested to estimate the future size of table in SQL Server 2008. Currently this table is very large ca. 500mln rows and every mistake in calculation may influence the result. Hence, to calculate the size of the table I did not just multiply possible future number of rows with row size based on column length but I used MS fromulas which are published on MSDN

    http://msdn.microsoft.com/en-us/library/ms175991.aspx

    Unfortunately the numbers which I got absolutely did not match with current table size and even in one equation I got divide by zero result :w00t:

    I also try to use current statistics from sys.sysindex table, where I can find used pages and rows numbers for clustered index. But after multiplying it with page size number doesn’t fit with the sp_spaceused data result.

    Have you ever used those calculations and it matched with current table size or maybe I made some mistake which I am not aware of?

    Cheers,

    Bartek

  • You shouldn't be getting divide by zero errors when estimating the size of a row. If you are it's down in the estimating how much data per page calculations, which largely aren't that useful, per se, until you're down to the seriously grotty details of trying to squeeze that last 3-5 ms of performance out of the system.

    Are you simply trying to estimate how much disk space you need? The general calculations of adding the columns data types together and taking into account indexes and their sizes out to do the job.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you for your answer.

    Yeap, indeed I had input wrong fill factor number which was 0 and I should have input 100. Providing 0 to the formula breeds divide by zero error.

    Regarding your easy calculation I still do not get what I wished to have. Let me provide some example. After executing sp_spaceused on table_A I got:

    Rows: 537495176

    Reserved: 62933528 KB

    Data: 34555608 KB

    Index_size: 28373120 KB

    Unused: 4800 KB

    Total column length is 122B

    Hence if I multiply column length with rows number I got ca. 64037434 KB which differ with Reserved and Data number. I assumed that if we take total column length with rows we will know much data takes only then we have to add Indexes to get total table size.

    How should I interpret those sp_spaceused numbers?

  • The only thing you can do without additional information is to find the "total cost" in bytes per row using (DataSize+IndexSize)/Rows.

    The next step would be to look at the data and see how many rows were added per month for the last 6 months to establish a trend and then extrapolate that trend out to your target date.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply