Does it make a difference: nvarchar(50) or nvarchar(500)?

  • Dear friends,

    I have a large table with more than 10 million records, and all fields are "string". I import data from a CSV file once a month, and this table is only read by clients (a web application).

    I used nvarchar(500) to be sure the field length would be enough in the future. Then I tried an experiment: what happens if I reduce the field size to nvarchar(50)?

    I shrank the database & files, changed 5 column types (among 50 columns) to nvarchar(50), shrank again, and the mdf file size didn't change. That makes sense, because nvarchar is not fixed length.

    My question is: should I decrease the field sizes? Does it make any sense? Does it do any harm if I use "more than necessary" nvarchar lengths?

    Any suggestions?

    Thanks...

  • There's no performance or data size difference as they're variable length data types, so they'll only use the space they need.

    Think of the size parameter as more of a useful constraint. For example, if you have a surname field, you can reasonably expect 50 characters to be a sensible maximum, and you'd rather have a mistake (misuse of the field, incorrect data capture, etc.) throw an error than add nonsense to the database that needs future cleansing.

    So, my general rule of thumb is make them as large as the business requirements demand, but no larger. It's trivial to amend variable data sizes to a larger length in the future if requirements change.
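
    A minimal sketch of that kind of amendment, assuming a hypothetical dbo.Customer table with a Surname column. Widening a variable-length column is a quick metadata-only change, because every existing value already fits:

        -- Widening nvarchar(50) to nvarchar(100) later is cheap.
        -- Restate the nullability, or SQL Server applies its default.
        ALTER TABLE dbo.Customer
            ALTER COLUMN Surname nvarchar(100) NOT NULL;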

  • It can ABSOLUTELY make a difference, depending on the ANSI_PADDING setting for the columns when you created the table. If it is ON and you insert data that has spaces at the end, then they are KEPT, bloating the stored data.

    You can use the DATALENGTH or LEN functions to see how long field values actually are.
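
    A quick sketch of that check, using a made-up table (note that nvarchar columns always behave as if ANSI_PADDING is ON, so trailing spaces are always kept for them):

        -- Trailing spaces really are stored in the nvarchar column.
        CREATE TABLE dbo.PadTest (Val nvarchar(50));
        INSERT INTO dbo.PadTest (Val) VALUES (N'abc       ');

        SELECT LEN(Val)        AS CharsNoTrailingSpaces, -- 3
               DATALENGTH(Val) AS BytesStored            -- 20 (10 chars x 2 bytes)
        FROM dbo.PadTest;

    Comparing LEN (which ignores trailing spaces) with DATALENGTH (the bytes actually stored) exposes padded data.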

    I also question your use of nvarchar(..). 99% of the time I see it used, it is UNNECESSARY because the client will NEVER store characters requiring 2 bytes each, so that is a DOUBLING of storage requirements for nothing!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Only use nvarchar if you really need Unicode; otherwise use varchar. I make my varchars as small as possible, because reports are generally formatted to handle the largest data that can be stored. Another gotcha is defining rows larger than 8K, which are fine until the data you insert actually exceeds the size.
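
    A sketch of that gotcha, with a made-up table. Two nvarchar(4000) columns alone can hold up to 16,000 bytes, well past the 8,060-byte in-row page limit:

        -- The CREATE succeeds even though the maximum declared row
        -- size exceeds a data page.
        CREATE TABLE dbo.WideRow
        (
            Notes1 nvarchar(4000),
            Notes2 nvarchar(4000)
        );
        -- On SQL Server 2005 and later, rows whose actual data
        -- exceeds the in-row limit get pushed to row-overflow pages,
        -- which costs extra I/O every time those rows are read.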

    As with everything, clarity leads to success.

  • Dear friends,

    I'm just confused about the varchar and nvarchar types. I learned that nvarchar holds Unicode character sets and varchar cannot. varchar doubles the size. That's fine.

    What I'm confused about is Unicode characters. I need all European characters: French, German, English, Turkish, etc., but not Chinese or Japanese; only Latin characters. Should I use nvarchar or varchar?

  • aykut canturk (8/21/2013)


    Dear friends,

    I'm just confused about the varchar and nvarchar types. I learned that nvarchar holds Unicode character sets and varchar cannot. varchar doubles the size. That's fine.

    What I'm confused about is Unicode characters. I need all European characters: French, German, English, Turkish, etc., but not Chinese or Japanese; only Latin characters. Should I use nvarchar or varchar?

    Wikipedia or some other web search should be able to tell you whether the languages you need to store require 2-byte character sets.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • aykut canturk (8/21/2013)


    What I'm confused about is Unicode characters. I need all European characters: French, German, English, Turkish, etc., but not Chinese or Japanese; only Latin characters. Should I use nvarchar or varchar?

    From the languages you list, you should be able to get away with varchar if you use a Turkish collation.

    But don't do that. Suddenly you will find that you need to support Icelandic. Or Bulgarian or Greek, to take something closer by. If you start with varchar and need to change later, that will be very expensive. Do it right from the start.
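
    To illustrate the collation point with a made-up table: the column-level collation decides which code page a varchar column stores.

        -- Code page 1254 (Turkish collations) covers Turkish plus
        -- most Western European Latin characters.
        CREATE TABLE dbo.Person
        (
            Name varchar(50) COLLATE Turkish_CI_AS
        );
        -- Characters outside that code page (Greek, Cyrillic, ...)
        -- are silently replaced with '?' on insert, which is why
        -- starting with nvarchar is the safer choice.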

    As for the original question: yes, nvarchar(50) and nvarchar(500) can affect performance. One way is, as Kevin says, trailing spaces that get stored; then again, if you trim the data before you store it, this is not an issue. There is another area, though, and that is the optimizer's estimates. The optimizer will assume that an nvarchar(500) column returns more bytes than an nvarchar(50) column, and this can affect the choice of query plans. Whether for better or worse is difficult to say, but generally, if you have exaggerated the size, the optimizer will work from an incorrect estimate.
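
    A minimal sketch of how one might observe this, with made-up tables. The optimizer estimates a variable-length column at roughly half its declared maximum, so the same sort over identical data asks for a much larger memory grant when the column is declared wider (load a realistic number of rows to see a meaningful difference):

        -- Same data, different declared sizes.
        CREATE TABLE dbo.Narrow (Val nvarchar(50));
        CREATE TABLE dbo.Wide   (Val nvarchar(500));

        INSERT INTO dbo.Narrow VALUES (N'same short value');
        INSERT INTO dbo.Wide   VALUES (N'same short value');

        -- Run with "Include Actual Execution Plan" and compare the
        -- Memory Grant on each Sort: the Wide query reserves far more
        -- memory even though the stored data is identical.
        SELECT Val FROM dbo.Narrow ORDER BY Val;
        SELECT Val FROM dbo.Wide   ORDER BY Val;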

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • aykut canturk (8/21/2013)


    Dear friends,

    I have a large table with more than 10 million records, and all fields are "string". I import data from a CSV file once a month, and this table is only read by clients (a web application).

    I used nvarchar(500) to be sure the field length would be enough in the future. Then I tried an experiment: what happens if I reduce the field size to nvarchar(50)?

    I shrank the database & files, changed 5 column types (among 50 columns) to nvarchar(50), shrank again, and the mdf file size didn't change. That makes sense, because nvarchar is not fixed length.

    My question is: should I decrease the field sizes? Does it make any sense? Does it do any harm if I use "more than necessary" nvarchar lengths?

    Any suggestions?

    Thanks...

    Actually, it could make a difference in size. For example, do you have any columns that represent only integer or date values? Take the lowly number 100. In a VARCHAR column, it takes 1 byte for each digit plus 2 bytes to remember the size, for a total of 5 bytes. Stored as an INT, it would take only 4 bytes; as a SMALLINT, only 2; and as a TINYINT, only 1.

    If you have any "code" columns, such as state abbreviations, that will never have more than 2 characters, storing them as VARCHAR requires 2 bytes for the letters plus 2 bytes to remember the length, for a total of 4 bytes. If you use CHAR(2) instead, you'll use only 2 bytes. Since you never want the data in such a column to be more than 2 characters, VARCHAR(anything) is extreme overkill. Storing numbers in right-sized columns also gives you a bit of free data validation.
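
    A sketch of those storage numbers, with a hypothetical table (DATALENGTH on each column confirms the byte counts above; the 2-byte length overhead of variable columns lives in the row header, so DATALENGTH doesn't report it):

        CREATE TABLE dbo.SizeDemo
        (
            QtyAsString varchar(500), -- '100' = 3 bytes + 2-byte length = 5
            QtyAsInt    int,          -- always 4 bytes
            QtyAsTiny   tinyint,      -- always 1 byte
            StateVar    varchar(500), -- 'TX'  = 2 bytes + 2-byte length = 4
            StateFixed  char(2)       -- always 2 bytes
        );

        INSERT INTO dbo.SizeDemo VALUES ('100', 100, 100, 'TX', 'TX');

        SELECT DATALENGTH(QtyAsString) AS s,  -- 3
               DATALENGTH(QtyAsInt)    AS i,  -- 4
               DATALENGTH(QtyAsTiny)   AS t,  -- 1
               DATALENGTH(StateVar)    AS sv, -- 2
               DATALENGTH(StateFixed)  AS sf  -- 2
        FROM dbo.SizeDemo;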

    Also, indexing can become an issue if everything is VARCHAR. At the very least, you'll get warnings about an index key being too wide just by including two of the 500-character columns.
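
    A sketch of that warning, assuming a hypothetical dbo.BigTable whose Col1 and Col2 are nvarchar(500):

        -- Two nvarchar(500) key columns can total 2,000 bytes, past
        -- the index key limit (900 bytes, or 1,700 for nonclustered
        -- keys on SQL Server 2016+), so this warns at creation...
        CREATE INDEX IX_Wide ON dbo.BigTable (Col1, Col2);
        -- ...and an INSERT or UPDATE whose actual key data exceeds
        -- the limit will fail at run time.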

    Of course, all of those problems double when using NVARCHAR.

    And, to your last.... no... VARCHAR doesn't double the storage requirements. NVARCHAR does.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

