Choosing the "right" datatype for a field

  • We are developing a DB that may contain millions of records. I'm often puzzled when selecting the "right" data-type for a field. What data types would you select in the following cases:?

    1. A field always has 7 characters. I'd use char(7) over varchar(7) for this field. That's seems clear.

    2. A field contains 2 character (always integers), e.g. "23". This field is an ID that points to the string "Marin County" in another table. Would you use char(2) or tinyint for this field?

    3. A minimum-score field (numeric) ranges from 700 to 1000. Would you use smallint or char(4) for this field?

    Any guiding ideas much appreciated.

    TIA,

    Bill

  • i would not use char for numeric values,

    for the second,third fields i would use tinyint,smallint because

    1) it would be half the size of char

    2) you would have to put in some constraint to check for integer values

    if it was char

    3) you might want to do computations on the third field , in which case you would have to use conversion functions if it was char

    Hope this helps!

  • A good discussion question!

    My vote:

    1) Char 7. Though Im bad about using varchar anyway.

    2) Depends on whether I'd be translating it or not..or how often. In other words if the 2 char code was for state, I think I'd code with the 2 char. If the code was meaningless/unhelpful without translating, I'd say tinyint just to have some growing room.

    3)Small int I think. I guess either way you've got to do additional checking to enforce the range. Typically I try to always treat numbers as numbers and dates as dates.

    Andy

  • 1. I'd use CHAR for < 20 characters. I think I saw something about > 20 characers not really providing any benefits.

    2. Use integers. They join quicker. I'd use tinyint if you are very sure this will be less than 255. Otherwise use smallint.

    3. Use an integer. Math will be quicker.

    Steve Jones

    steve@dkranch.net

  • 1. Char(7)

    2. tinyint Because it is a smaller storage size and joining on int and such is easier for SQL to do that character data.

    3. smallint Same as 2

    Basically I subscride to the belief char is for fields that do no vary greatly in lenght. Int should be used for any storage of whole number and the smallest int type that can handle all of your values is best to streamline table and storage size. Numeric, decimal, float, money paying special care to provide the largest minimum amount of precision and scale needed as to take up the least space in storage but still handle. UserDefined for special needs.

    The key is to use the least amount of space to cover all the possible values and to stay in a values natural type as possible. Also special is char and varchar types. Char should always be used for cases on fixed lenght or not varying by 1 or 2 characters. Varchar is especially better in largly varying lenght values. I did hear about the char <20 >20 thing but have never seen a benchmark that has proven this, and the agreed upon belief is that SQL server does use more resources in scaling a varchars length in memory than dealing with a fixed length (again as long as the varying in length is not too great).

  • when designing your data base, you have to think about next year in addition to tomorrow. Consider how your well-known and small list of keys might change over time. what if your system goes global?

    For Internal keys that are not displayed to the user, I prefer to use INT fields. 4 bytes gives you 2 billion possible values and plenty of room for expansion. a 2-char key is somewhat simited in this situation.

    Consider how the data in the particular column will be used and what it represents, then choose the closest native type to that.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply