Estimate Table size

  • How do I estimate table size where the row length is more than 8096 bytes?

  • Please be more specific.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • This article may be of help.

    http://jasonbrimhall.info/2010/05/19/tablespace-update/

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Here are two different ways I can think of:

    You have to look at the column sizes for anything that could potentially be greater than ~8000 characters.

    By looking at the column definitions, you can get the maximum defined size of a row, right?

    --just worry about rows with lots of varchars?
    SELECT OBJECT_NAME(object_id), SUM(max_length)
    FROM sys.columns
    WHERE max_length > 0 --ignore -1 (max) columns?
    GROUP BY object_id;

    --or worry about all row sizes
    CREATE TABLE ##tmp (TableName varchar(40), DefinedRowSize int);

    EXEC sp_msforeachtable 'INSERT INTO ##tmp
        SELECT ''?'' AS TableName, SUM(C.max_length) AS DefinedRowSize
        FROM sys.columns C
        WHERE C.object_id = OBJECT_ID(''?'')
          AND C.max_length > 0 --ignore -1 (max) columns?
    ';

    SELECT * FROM ##tmp ORDER BY DefinedRowSize DESC;

    Lowell


    --Help us help you! If you post a question, make sure you include a CREATE TABLE... statement and an INSERT INTO... statement for that table to give the volunteers here representative data. With your description of the problem, we can provide a tested, verifiable solution to your question! Asking the question the right way gets you a tested answer the fastest way possible!

  • Jason, Lowell, note the OP said "estimate". Your solutions are geared towards already existing tables.

    DIB IN, please clarify what you're looking for.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (7/25/2011)


    Jason, Lowell, note the OP said "estimate". Your solutions are geared towards already existing tables.

    DIB IN, please clarify what you're looking for.

    True. Also, if this is not an existing table, then the estimated number of records in the table would also be required.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • If you insist on an estimation, there's a section in Books Online called "estimating table size" in the index. Personally, I think that's a huge PITA, although you could certainly build a nice spreadsheet from the formulas they give you.
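
    For reference, here's a minimal sketch of that Books Online arithmetic for in-row data. All of the column counts and byte sizes below are hypothetical placeholders; substitute the numbers from your own table definition:

    -- Sketch of the Books Online row/page arithmetic for in-row data.
    -- All inputs are hypothetical; plug in your own table's numbers.
    DECLARE @Num_Rows      bigint = 1000000; -- expected row count (assumption)
    DECLARE @Num_Cols      int    = 7;       -- total number of columns
    DECLARE @Fixed_Size    int    = 29;      -- total bytes of fixed-width columns
    DECLARE @Num_Var_Cols  int    = 2;       -- number of variable-width columns
    DECLARE @Max_Var_Size  int    = 22;      -- max total bytes of variable-width data

    -- Null bitmap: 2 + ((number of columns + 7) / 8), integer division
    DECLARE @Null_Bitmap   int = 2 + ((@Num_Cols + 7) / 8);
    -- Variable-data overhead: 2 + (number of variable columns * 2)
    DECLARE @Var_Data_Size int = 2 + (@Num_Var_Cols * 2) + @Max_Var_Size;
    -- Row size = fixed data + variable data + null bitmap + 4-byte row header
    DECLARE @Row_Size      int = @Fixed_Size + @Var_Data_Size + @Null_Bitmap + 4;
    -- 8096 bytes of usable space per page; 2 bytes per row for the slot array
    DECLARE @Rows_Per_Page int = 8096 / (@Row_Size + 2);
    DECLARE @Num_Pages     bigint = CEILING(1.0 * @Num_Rows / @Rows_Per_Page);

    SELECT @Row_Size      AS EstRowBytes,
           @Rows_Per_Page AS RowsPerPage,
           @Num_Pages     AS EstPages,
           @Num_Pages * 8 AS EstKB;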

    Me? I never estimate such a thing. I generate a million rows of data to match the table structure, add all the required indexes, constraints, etc, etc, and then I use sp_SpaceUsed to tell me what all the gazinta's are.

    It's not so very difficult to generate a million rows of data if you understand what a SELECT/INTO with a Cross Join will do. For example, here's how I generate my typical million row test table... and it only takes minutes to write and seconds to run, but gives you a very accurate answer with sp_SpaceUsed. And, no... your test doesn't have to create the "sophisticated" data that's in my test table. You just need to generate rows of data that come close to what you expect to store in the table, even if it's all the same data for every row. 😉 sp_SpaceUsed will show how much room the "pristine" table (no page splits in the data, no extent splits in the NC indexes) takes for a million rows. After that, it's easy with a little simple math. Add 10 to 20% on the indexes for expected extent splitting and Bob's your Uncle. 🙂

    /**********************************************************************************************************************
     Purpose:
     Create a voluminous test table with various types of highly randomized data.

     --Jeff Moden
    **********************************************************************************************************************/
    --===== Conditionally drop the test table to make reruns easier
         IF OBJECT_ID('dbo.JBMTest','U') IS NOT NULL
            DROP TABLE dbo.JBMTest
    ;
    --===== Create and populate a 1,000,000 row test table.
         -- "SomeID" has a range of 1 to 1,000,000 unique numbers
         -- "SomeInt" has a range of 1 to 50,000 non-unique numbers
         -- "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
         -- "SomeMoney" has a range of 0.00 to 100.00 non-unique numbers
         -- "SomeDateTime" has a range of >=01/01/2000 and <01/01/2020 non-unique date/times
         -- "SomeDate" has a range of >=01/01/2000 and <01/01/2020 non-unique "whole dates"
         -- "SomeName" contains random characters at random lengths from 2 to 20 characters
     SELECT TOP 1000000
            SomeID       = IDENTITY(INT,1,1),
            SomeInt      = ABS(CHECKSUM(NEWID())) % 50000 + 1,
            SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID())) % (ASCII('Z')-ASCII('A')+1) + ASCII('A'))
                         + CHAR(ABS(CHECKSUM(NEWID())) % (ASCII('Z')-ASCII('A')+1) + ASCII('A')),
            SomeMoney    = CAST(RAND(CHECKSUM(NEWID())) * 100 AS DECIMAL(9,2)), --Note rounding
            SomeDateTime = RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2020') + CAST('2000' AS DATETIME),
            SomeDate     = ABS(CHECKSUM(NEWID())) % DATEDIFF(dd,'2000','2020') + CAST('2000' AS DATETIME),
            SomeName     = RIGHT(NEWID(),ABS(CHECKSUM(NEWID())) % 19 + 2)
       INTO dbo.JBMTest
       FROM sys.all_columns ac1
      CROSS JOIN sys.all_columns ac2
    ;
     ALTER TABLE dbo.JBMTest
       ADD CONSTRAINT PK_JBMTest PRIMARY KEY CLUSTERED (SomeID) WITH FILLFACTOR = 90
    ;
    --Jeff Moden
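
    Once the table is built, the check and the extrapolation might look like this (the 50,000 KB figure below is made up for illustration; use whatever sp_SpaceUsed actually reports):

    -- Measure the pristine million-row table...
    EXEC sp_SpaceUsed 'dbo.JBMTest';
    -- ...then extrapolate with simple math. If sp_SpaceUsed reported, say,
    -- 50000 KB reserved for 1,000,000 rows, then 10,000,000 rows would be
    -- roughly 50000 KB * 10, plus 10 to 20% on the index space for the
    -- expected extent splits.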


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • http://msdn.microsoft.com/en-us/library/ms189124%28v=SQL.90%29.aspx

    The article above describes how to estimate table size based on the table structure and the estimated number of records in the table.

    While estimating a table size, I found that the record size exceeds 8060 bytes. The extra bytes are stored in a ROW_OVERFLOW_DATA page.

    How do I calculate the size of the ROW_OVERFLOW_DATA for the excess bytes?

    Can we validate the table size using the procedure below?

    sp_spaceused <Table name>

    Regards

  • DIB IN (7/26/2011)


    http://msdn.microsoft.com/en-us/library/ms189124%28v=SQL.90%29.aspx

    The article above describes how to estimate table size based on the table structure and the estimated number of records in the table.

    While estimating a table size, I found that the record size exceeds 8060 bytes. The extra bytes are stored in a ROW_OVERFLOW_DATA page.

    How do I calculate the size of the ROW_OVERFLOW_DATA for the excess bytes?

    Books Online does cover this topic:

    Estimating the Size of a Table (contains links for heaps, clustered and non-clustered indexes)

    Row-Overflow Data Exceeding 8 KB (contains links for Large-Value Data Types (i.e. MAX), text, image and XML data)

    Can we validate the table size using the procedure below?

    sp_spaceused <Table name>

    Yes, sp_SpaceUsed does count LOB and row overflow data.
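
    If you want to see that breakdown for yourself, a query along these lines against the standard catalog views shows in-row, row-overflow, and LOB pages separately (dbo.MyTable is a placeholder; substitute your own table name):

    -- Page counts per allocation unit type for one table.
    -- IN_ROW_DATA / ROW_OVERFLOW_DATA / LOB_DATA are reported on separate rows.
    SELECT au.type_desc,
           SUM(au.total_pages)     AS TotalPages,
           SUM(au.total_pages) * 8 AS TotalKB
    FROM sys.partitions p
    JOIN sys.allocation_units au
      ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)       -- in-row, row-overflow
      OR (au.type = 2       AND au.container_id = p.partition_id)  -- LOB
    WHERE p.object_id = OBJECT_ID('dbo.MyTable')
    GROUP BY au.type_desc;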

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (7/26/2011)


    DIB IN (7/26/2011)


    http://msdn.microsoft.com/en-us/library/ms189124%28v=SQL.90%29.aspx

    The article above describes how to estimate table size based on the table structure and the estimated number of records in the table.

    While estimating a table size, I found that the record size exceeds 8060 bytes. The extra bytes are stored in a ROW_OVERFLOW_DATA page.

    How do I calculate the size of the ROW_OVERFLOW_DATA for the excess bytes?

    Books Online does cover this topic:

    Estimating the Size of a Table (contains links for heaps, clustered and non-clustered indexes)

    Row-Overflow Data Exceeding 8 KB (contains links for Large-Value Data Types (i.e. MAX), text, image and XML data)

    Can we validate the table size using the procedure below?

    sp_spaceused <Table name>

    Yes, sp_SpaceUsed does count LOB and row overflow data.

    As does the script for which I provided a link.:-D

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • SQLRNNR (7/26/2011)


    opc.three (7/26/2011)


    DIB IN (7/26/2011)


    http://msdn.microsoft.com/en-us/library/ms189124%28v=SQL.90%29.aspx

    The article above describes how to estimate table size based on the table structure and the estimated number of records in the table.

    While estimating a table size, I found that the record size exceeds 8060 bytes. The extra bytes are stored in a ROW_OVERFLOW_DATA page.

    How do I calculate the size of the ROW_OVERFLOW_DATA for the excess bytes?

    Books Online does cover this topic:

    Estimating the Size of a Table (contains links for heaps, clustered and non-clustered indexes)

    Row-Overflow Data Exceeding 8 KB (contains links for Large-Value Data Types (i.e. MAX), text, image and XML data)

    Can we validate the table size using the procedure below?

    sp_spaceused <Table name>

    Yes, sp_SpaceUsed does count LOB and row overflow data.

    As does the script for which I provided a link.:-D

    No question about it, Jason. Your script and sp_spaceused both use the same two system views to check data pages, sys.partitions and sys.allocation_units, so I would expect them to be interchangeable from that perspective. That said, your script is much more useful IMHO than sp_spaceused. I like the breakdown it gives and have added it to my toolkit. Thanks for sharing it.

    A request for "estimating" table sizes suggests something other than measuring existing tables, e.g. migrations from other platforms, loading terabytes' worth of data from files, and the like, so I was curious about the use case.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (7/26/2011)


    SQLRNNR (7/26/2011)


    opc.three (7/26/2011)


    DIB IN (7/26/2011)


    http://msdn.microsoft.com/en-us/library/ms189124%28v=SQL.90%29.aspx

    The article above describes how to estimate table size based on the table structure and the estimated number of records in the table.

    While estimating a table size, I found that the record size exceeds 8060 bytes. The extra bytes are stored in a ROW_OVERFLOW_DATA page.

    How do I calculate the size of the ROW_OVERFLOW_DATA for the excess bytes?

    Books Online does cover this topic:

    Estimating the Size of a Table (contains links for heaps, clustered and non-clustered indexes)

    Row-Overflow Data Exceeding 8 KB (contains links for Large-Value Data Types (i.e. MAX), text, image and XML data)

    Can we validate the table size using the procedure below?

    sp_spaceused <Table name>

    Yes, sp_SpaceUsed does count LOB and row overflow data.

    As does the script for which I provided a link.:-D

    No question about it, Jason. Your script and sp_spaceused both use the same two system views to check data pages, sys.partitions and sys.allocation_units, so I would expect them to be interchangeable from that perspective. That said, your script is much more useful IMHO than sp_spaceused. I like the breakdown it gives and have added it to my toolkit. Thanks for sharing it.

    A request for "estimating" table sizes suggests something other than measuring existing tables, e.g. migrations from other platforms, loading terabytes' worth of data from files, and the like, so I was curious about the use case.

    That's cool. For the record - I was just teasing.:hehe::hehe::hehe:

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Very nice script Jason... going to my vault.

  • SQLRNNR (7/26/2011)


    That's cool. For the record - I was just teasing.:hehe::hehe::hehe:

    😎 Cool. Just giving you due credit; I was not trying to bypass it in my earlier post, just answering the OP's question 🙂

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • DIB IN (7/25/2011)


    How do I estimate table size where the row length is more than 8096 bytes?

    Clustered vs. heap, fill factor, fragmentation, and several other factors may influence the actual size of your table with X number of rows. In the Dev environment, insert a round number of test rows into an empty copy of the table; maybe 10 or 100. If you're still in the conceptual design phase, then just create a table based on what columns and datatypes you think it will contain at this point. Once done, use sp_spaceused to get an actual size, and then use that size to extrapolate the table size for 100,000 rows, 1,000,000 rows, or whatever you're interested in.

    DECLARE @x int;
    SELECT @x = 1;
    WHILE @x <= 100
    BEGIN
        INSERT INTO MyTable ( ... ) SELECT ... ;
        SELECT @x = @x + 1;
    END;

    EXEC sp_spaceused 'MyTable';
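
    The extrapolation itself is simple math. As a hypothetical example, if sp_spaceused reported 64 KB reserved for the 100 test rows (a made-up figure; use your actual output):

    -- Hypothetical extrapolation from a small test sample to production volume.
    DECLARE @ReservedKB int    = 64;      -- reserved KB reported for the test rows (assumed)
    DECLARE @TestRows   int    = 100;     -- rows inserted above
    DECLARE @TargetRows bigint = 1000000; -- rows you expect in production

    SELECT CAST(@ReservedKB AS float) / @TestRows * @TargetRows AS EstimatedKB;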

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
