October 7, 2004 at 2:56 pm
Hi All,
This is my first thread on this forum. As a newbie to databases and SQL Server 2000, I need to know how to calculate the disk space required for ONE record, assuming ALL defined data fields (columns) use the maximum space allocated (worst case).
I have six tables with a wide variety of data types defined for each column. (A side question: say I've defined a column as varchar of length 8 - is that equal to 8 bytes? How does the declared length of a field correlate to the number of bytes it actually uses?)
As I said above, I'm a newbie (electronics background) to databases and would appreciate any help please.
Many thanks in advance... Ash
October 7, 2004 at 3:46 pm
The easy one first. For char and varchar fields, the length = number of bytes (almost). For nchar and nvarchar, the length = 1/2 the number of bytes, because each character takes 2 bytes to store. varchar fields only store what is actually in use, so if you store '12345' in a varchar(50) field, only 5 bytes are used. That said, varchar, nvarchar, and varbinary columns also store their length, which consumes 2 bytes per column. Each row also carries a null bitmap with one bit per column, so 8 columns share a byte of overhead. For most other column types, the size property reported by Enterprise Manager is the basic storage for the column in bytes.
Text fields are stored as a 16-byte pointer in the data row, pointing to separate pages that hold the actual data. One caveat: if the 'text in row' table option is enabled, values up to a threshold (256 bytes by default) are stored directly in the data row. Those are the basics for approximating row size.
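A quick way to see this for yourself (a minimal sketch; run it in Query Analyzer) is DATALENGTH, which returns the bytes actually stored rather than the declared length:
-- DATALENGTH returns the bytes actually used, not the declared length
DECLARE @v varchar(50), @n nvarchar(50)
SET @v = '12345'
SET @n = N'12345'
SELECT DATALENGTH(@v) AS varchar_bytes,  -- 5 (one byte per character)
       DATALENGTH(@n) AS nvarchar_bytes  -- 10 (two bytes per character)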
Any indexed column will take up twice the storage: once in the data row and again in the index. This gets more complicated with clustered indexes because of how the leaf level is stored. You also have page-management overhead for indexes.
All rows are stored in 8K pages. If rows are a constant 1.7K in size, then you can fit 4 rows to a page; the remaining ~1.2K is wasted.
Then there is fragmentation, which means pages are only partially used because a row was inserted into the middle of a full page (when clustered), causing that one page to split into two partially filled pages to make room for the new row.
So, for variable-width columns, determine what your average size will be. Use that to determine an approximate row size (ignoring indexes). Then determine how many rows can, on average, fit into an 8K page. This can then be used to estimate the core table size. Multiply by some factor such as 1.3 or 1.5 to account for clustering freespace, deleted rows, and page splits. Then calculate how long a key is, add a few bytes of overhead (8 or so), and divide that into 8K, truncating any remainder. That is how many index rows fit per page. You can then estimate the index storage size and multiply by some factor to account for non-filled pages.
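To make that concrete, here is a back-of-the-envelope sketch. The 1,700-byte row size and 1,000,000 rows are made-up numbers, 8060 is the usable space on an 8K page, and 1.3 is just an assumed fudge factor:
-- Back-of-the-envelope table size estimate (made-up numbers)
DECLARE @avg_row_bytes int, @row_count int, @rows_per_page int, @page_count int
SET @avg_row_bytes = 1700                               -- assumed average row size
SET @row_count     = 1000000                            -- assumed number of rows
SET @rows_per_page = 8060 / @avg_row_bytes              -- usable bytes per page; integer division truncates
SET @page_count    = CEILING(1.0 * @row_count / @rows_per_page)
SELECT @page_count * 8 / 1024.0 * 1.3 AS estimated_MB   -- 8 KB per page, times the 1.3 fudge factor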
Those are the basics. Perhaps someone else has actually written a script to query the catalog and determine row size for you. If your tables already have representative data in them, try DBCC SHOWCONTIG:
DBCC SHOWCONTIG ( ATableName ) WITH TABLERESULTS, ALL_INDEXES, ALL_LEVELS
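Failing that, here is a rough sketch of the catalog approach: summing the declared physical lengths from syscolumns gives a worst-case per-row figure ('YourTableName' is a placeholder; note this ignores row headers, the null bitmap, and variable-length bookkeeping, and text columns report only their 16-byte in-row pointer):
-- Worst-case bytes per row from declared column lengths (row overhead not included)
SELECT SUM(length) AS max_row_bytes
FROM syscolumns
WHERE id = OBJECT_ID('YourTableName')   -- replace with one of your six tables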
October 8, 2004 at 7:48 am
Hello
Books Online: "Estimating the Size of a Table" and "Estimating the Size of a Table with a Clustered Index"