May 18, 2005 at 9:57 am
If a table has a column which is varchar(1000), yet all of the text within that column is only 20 characters long, is space being wasted in the database?
In other words, are only the 20 characters, (or # bytes) actually taking up space or does the declaration of varchar(1000) actually add to the space (or # bytes) being used?
Also, what utilities would need to be run to shrink any extra space taken up?
I need a good 'technical' reply and if possible, a URL to some microsoft explanation. This is beyond my realm of expertise to even know how to ask this correctly.
I will appreciate your compassion with my ignorance.
I wasn't born stupid - I had to study.
May 18, 2005 at 10:10 am
From the SQL Server Books on Line: Read the sentances regarding "Storage size"
To navigate directly to this subject, open BOL and from the menu select "go" then "url" and enter
mk:@MSITStore:C:\Program%20Files\Microsoft%20SQL%20Server\80\Tools\Books\tsqlref.chm::/ts_ca-co_7tpu.htm
char and varchar
Fixed-length (char) or variable-length (varchar) character data types.
char[(n)]
Fixed-length non-Unicode character data with length of n bytes. n must be a value from 1 through 8,000. Storage size is n bytes. The SQL-92 synonym for char is character.
varchar[(n)]
Variable-length non-Unicode character data with length of n bytes. n must be a value from 1 through 8,000. Storage size is the actual length in bytes of the data entered, not n bytes. The data entered can be 0 characters in length. The SQL-92 synonyms for varchar are char varying or character varying.
SQL = Scarcely Qualifies as a Language
May 18, 2005 at 10:11 am
Varchar doesn't pad the entries with extra data like spaces. Char datatypes do. You shouldn't have a problem with 'unused' space with varchar columns.
Search for "Using char and varchar Data" in Books Online for more/better info.
May 18, 2005 at 11:44 am
I read that in Books Online too. But its not quite accurate. In Kalen Delaney's book, Inside SQL Server 2000 (pp231-232), she states -
"A row with variable-length columns requires special offset entries in order to be internally maintained."
I didn't find a definition for the offset entry, but I found this in an example she gives -
"... And a 2-byte overhead exists for each of nine varchar columns..."
So, if your data is exactly 20 characters, a char(20) would be cheaper. If the data varies up to 20 characters, your varchar(1000) MIGHT be cheaper, depending on how many blank bytes would be stored in a char(20).
Steve
May 18, 2005 at 9:53 pm
A Variable length character string uses only enough bytes to actually hold the data. You can use the DataLength fuction to retrun the actual number of bytes used. Len() can be used to determine the number of characters in a string.
DECLARE @VarChar varchar(1000)
DECLARE @CharStr(20) --fixed length
Set @VARChar='123'
--DATALENGTH RETURNS THE NUMBER OF BYTES USED
Select DATALENGTH (@VARChar)
SELECT DATALENGTH(@CharStr) --always 20 Bytes with ASCII code, If UNICODE is used will require 40 bytes
If you are using an ASCII character set rather than UNICODE TSQL limits you to a string of 8000 characters. VARChar removes all trailing spaces from a string. Leading spaces and embedded spaces are included and count towards you 8000 character limit.
HTH Mike
Returns 3
May 19, 2005 at 4:42 am
the storage requirements of a varchar are the length of the data + some space to store info to work out how long/ where that data is (an offset/length - a number ie)
a char doesn't need this since it's fixed length - so that length is your offset/length
ie. if the data isn't fixed length, the db needs to be able to work out how long it is
May 19, 2005 at 5:03 am
Argh! I lost a really long post explaining this in detail. Anyway, the gist of the post was that like hoo-t says, rows containing varying size columns are stored with an offset array containing 2-bytes for every varying size column, which specifies the offset on the row where the data of each of these columns start. Here is some cleaned up data from DBCC PAGE that shows one row from a table foo (id int, text char(20)) and one from table bar (id int, text varchar(1000)):
Row from table foo
001c0010 00000001 61616161 61616161
61616161 61616161 61616161 000002
Row from table bar
00080030 00000001 01000002 61002300
61616161 61616161 61616161 61616161
616161
It is not really necessary to understand what all this means exactly. But you can see that the row from foo requires 31 bytes storage (there are some extra bytes for the row header, including a null bitmap) and the one from bar requires 35 bytes. The extra 4 bytes are the 2 bytes specifying the offset for the varying size column, and 2 bytes specifying where the offset array itself is placed.
So, the extra storage needed for a table with variying size columns is 2 bytes per column, plus another 2 bytes total.
May 19, 2005 at 6:27 am
Farrell, There are no stupid questions, only dumb terminals
But seriously, knowing where to find the answer to a question is half the battle and this forum always provides the answers, as demonstrated above.
I'll stop now.....
May 19, 2005 at 6:45 am
I forgot to mention (in my repost of the lost post) that the rows in each table are inserted as id=1 and text='aaaaaaaaaaaaaaaaaaaa'
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply