using varchar and char

  • hey guys,

    I'm just confused on this data types (varchar,char)

    If a value shorter than the length of the column is inserted into a char datatype, right padding was added unless I set the ANSI_PADDING OFF

    But how about in varchar? is it also doing the same thing as char datatype? And is there any disadvantage of setting ANSI_PADDING OFF?

    Thanks in advance

  • Iin char it is paddedup with space but in case of varchar it will not.

    varchar is a variable length datatype so it will always hold as many bytes at there is data

    char is fixed length datatype and will pad shorter data with spaces to fill it's capacity

    its depend on your requirement to choose the data type,suppose if its a gender column and you know the data is either M/F then you go for char(1)

    which is of fixed size and where you are not sure of the size of data you will go for varchar datatype.

    NOTE: there are possibalties to use datatype bit on replace of char(1)/varchar(1).

  • thanks for the reply, so meaning, it's all about knowing the expected length of the value. But is there an impact if I always set the ANSI_PADDING to OFF?

  • First of all , we should admit that no question about that Varchar (n) + nvarcahr (n) is much better than char (n) +nchar(n) respectively since :

    •ANSI_PADDING off is going to be a deprecated feature by next SQL Server version

    •Irrespective that , it applies only for any further columns created ,,but still the old Char columns follow up the same regulations of static column size

    [font="Arial Black"]Performance Guard[/font]

    [font="Courier New"]Just ask me to get 0 sec for all queries[/font]

  • ANSI_PADDING is deprecated so just don't use it. http://msdn.microsoft.com/en-us/library/ms187403.aspx

    What he was trying to explain is that the system will right pad with spaces any values into a char field that is shorter than the column width.

    declare @CharField char(10)

    set @CharField = 'a'

    select @CharField --will return 'a '

    declare @VarCharField varchar(10)

    set @VarCharField = 'a'

    select @VarCharField -- will return 'a'

    Which one you choose comes down to your situation. Personally I tend to not use char fields much (some exceptions of course) due to the varying length of most text type data.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • thanks guys for all that info. 🙂

  • ANSI_PADDING, in older versions, referred to whether trailing blanks shoud, or should not, be retained in varchar columns. If 'ON', then they would be retained. If 'OFF', they would be automatically removed. That's like the difference between the LEN() and DATALENTH() functions.

    The setting had no impact on char columns.

    This also had an impact on varbinary columns, in that trailing binary zeroes would be removed when the setting was 'OFF'.

  • Another thing to keep in mind is that varchar storage takes more bytes than char storage, since it has to also store the length. Plus, because of the way data pages are stored in indexes and heaps, char storage can be more efficient and won't result in page splits (variable length datatypes do that, fixed don't).

    So, as a general rule of thumb, don't use varchar for 10 or less characters. That's not as important as the differences between ASCII and UNICODE, or any number of other database rules, and 10 is a semi-arbitrary number. But it can make a difference in busy/large databases.

    For exampe, if a column will contain ISO-2 and ISO-3 language abbreviations, you could use Varchar(3). That will take 4 bytes of storage for ISO-2 and 5 for ISO-3. Char(3) will take 3 for either one. (There's more to total storage size, but they'll be same in other regards for both data types.)

    Picking which to use depends a lot on data length distributions.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (10/13/2011)


    Another thing to keep in mind is that varchar storage takes more bytes than char storage, since it has to also store the length. Plus, because of the way data pages are stored in indexes and heaps, char storage can be more efficient and won't result in page splits (variable length datatypes do that, fixed don't).

    Varchar uses 2 bytes to store length

    Thus, my general rule is: Purely from a space perspective, if the average length of your data is at least 3 bytes less than the maximum length of the field, use VARCHAR. If the average is only 0 or 1 bytes less than the maximum, use CHAR. If it's 2 bytes exactly, it's a tossup.

    GSquared, please chime in with more detail, but the page split case is, I believe, based on storing a length N value, and then when you UPDATE that to a length N+M value (where M is at least 1), if the page was full, the page will need to be split, which is, of course, fragmentation.

    If your values often are updated to a larger size, and you use a 100% fillfactor, and you have poor/no/expensive index maintenance, CHAR may be superior for those particular fields. If you use a lower fillfactor and have good index maintenance, I'd lean towards the general rule above.

  • Nadrek (10/14/2011)


    GSquared (10/13/2011)


    Another thing to keep in mind is that varchar storage takes more bytes than char storage, since it has to also store the length. Plus, because of the way data pages are stored in indexes and heaps, char storage can be more efficient and won't result in page splits (variable length datatypes do that, fixed don't).

    Varchar uses 2 bytes to store length

    Thus, my general rule is: Purely from a space perspective, if the average length of your data is at least 3 bytes less than the maximum length of the field, use VARCHAR. If the average is only 0 or 1 bytes less than the maximum, use CHAR. If it's 2 bytes exactly, it's a tossup.

    GSquared, please chime in with more detail, but the page split case is, I believe, based on storing a length N value, and then when you UPDATE that to a length N+M value (where M is at least 1), if the page was full, the page will need to be split, which is, of course, fragmentation.

    If your values often are updated to a larger size, and you use a 100% fillfactor, and you have poor/no/expensive index maintenance, CHAR may be superior for those particular fields. If you use a lower fillfactor and have good index maintenance, I'd lean towards the general rule above.

    The page-split data is based on the fact that fixed-length data is stored first in the row-data, and variable-length is stored last, and if the variable length-data grows, it can result in the page being split.

    This is partially dependent on fill factor. But fill factor is only used in initial allocation of the page, and not in updates to it. So, you could have 80% (for example), and if a variable-length datum grows so that the page is over-full, it will still split. You could even have a situation like initially starting with 10 rows of data in a page, and 80% fill factor, and if 8 of the rows grow, you could end up with a page split on the 8th one.

    So, it's not just fill factor and average size, it's also about volatility of the data.

    Also keep in mind, at a mechanical level, updating a variable length value requires updating both the column-value and the length-value, while updating a fixed-length column requires updating only the column-value. Hardly likely to matter, but it is a consideration on high-volatility data.

    I'd be more inclined to go by average deviation in size, not just by average size, but average size is probably good enough. I'm lazy about it and usually don't monitor either, and I've found that <= 10 datalength often works better as char, even for variable data.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (10/17/2011)

    I'd be more inclined to go by average deviation in size, not just by average size, but average size is probably good enough. I'm lazy about it and usually don't monitor either, and I've found that <= 10 datalength often works better as char, even for variable data.

    I've seen too many columns which commonly/mostly/(almost) entirely hold empty strings to use a purely length based split.

  • Nadrek (10/18/2011)


    GSquared (10/17/2011)

    I'd be more inclined to go by average deviation in size, not just by average size, but average size is probably good enough. I'm lazy about it and usually don't monitor either, and I've found that <= 10 datalength often works better as char, even for variable data.

    I've seen too many columns which commonly/mostly/(almost) entirely hold empty strings to use a purely length based split.

    Yep. That's the thing about "rules of thumb". There are always exceptions.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply