int vs smallint and tinyint

  • Hi,

    I had a conversation with one of my friends about using int data types in a DB.

    I believe that in normal databases smaller than 1 GB, it doesn't matter whether you use int instead of tinyint or smallint. On the other hand, my friend claims that this can affect memory size, log size, and so on.

    Is it really important or not?

    Best Regards,
    Ashkan

  • I tend to agree with your friend. I think that the data type used should be the smallest one that can hold all possible values. If tinyint is the smallest data type that fits, then I don't see any reason to use a different, bigger data type. While I understand your point that in small databases it is not important, I think that this is just a habit that should be kept regardless of the database's size. The main reason is that if I do it only on some databases and don't get used to doing it all the time, I'll find myself using too big a data type on large databases as well.
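
    For reference, here is a quick way to confirm the storage sizes involved (a minimal sketch; the literal values are only there to give DATALENGTH something to measure):

    SELECT DATALENGTH(CAST(1 AS tinyint))  AS tinyint_bytes,  -- 1 byte  (0 to 255)
           DATALENGTH(CAST(1 AS smallint)) AS smallint_bytes, -- 2 bytes (-32,768 to 32,767)
           DATALENGTH(CAST(1 AS int))      AS int_bytes,      -- 4 bytes (about +/- 2.1 billion)
           DATALENGTH(CAST(1 AS bigint))   AS bigint_bytes;   -- 8 bytes (about +/- 9.2 quintillion)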

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting answers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I sincerely try to use the correct data type for the situation. Could I just plug in int or bigint everywhere rather than think about whether a tinyint or smallint can do the job? Sure, but that's not putting your best effort into achieving an appropriate level of design and intent in the system. If a number is never going to be bigger than 32, don't make it a bigint.
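
    As a sketch of that idea (the table and column names here are hypothetical):

    -- A value that will never exceed 32 fits comfortably in tinyint (0 to 255):
    CREATE TABLE dbo.ConferenceRoom
    (
        RoomId   int     NOT NULL,
        Capacity tinyint NOT NULL  -- never larger than 32 in this domain
    );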

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • We generally should try to use the best possible data type. If tinyint can serve the purpose, smallint or int should not be used. It affects the total row size: the smaller the row, the more records can be brought into the cache, and the fewer cache misses occur, improving SQL Server performance.
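
    A rough illustration of that page-density effect (a sketch; 8,060 bytes is SQL Server's approximate usable space per 8 KB data page, and the row sizes are hypothetical):

    DECLARE @usable_page_bytes int = 8060;
    -- A hypothetical 40-byte row (several int columns plus row overhead)
    -- versus the same row at 28 bytes after narrowing some columns to tinyint:
    SELECT @usable_page_bytes / 40 AS rows_per_page_wide,   -- ~201 rows per page
           @usable_page_bytes / 28 AS rows_per_page_narrow; -- ~287 rows per page
    -- More rows per page means fewer pages in the buffer pool for the same data.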

  • Well, I know it is better in general, but what I want to say is that I can't find any major difference between them when the maximum number of rows in the table is 100k. When you have at least 8 GB of RAM installed on your server, which is the lowest config of current servers, why do you need to use 2 bytes instead of 4 bytes?

    It will change 100 MB of data to 100 MB + 200 KB, which is not important at all.

    By the way, generally I absolutely agree with making the DB as small as we can, but this change won't make me happy :D

    I really need a real difference that can cause me any trouble: a speed difference, etc.
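
    A quick way to measure that difference empirically (a sketch; the table names are illustrative):

    CREATE TABLE dbo.T_int     (c1 int     NOT NULL);
    CREATE TABLE dbo.T_tinyint (c1 tinyint NOT NULL);

    -- Load 100,000 rows into each table.
    INSERT INTO dbo.T_int (c1)
    SELECT TOP (100000) 1 FROM sys.all_objects a CROSS JOIN sys.all_objects b;

    INSERT INTO dbo.T_tinyint (c1)
    SELECT TOP (100000) 1 FROM sys.all_objects a CROSS JOIN sys.all_objects b;

    -- Compare the reported data sizes.
    EXEC sp_spaceused 'dbo.T_int';
    EXEC sp_spaceused 'dbo.T_tinyint';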

    Best Regards,
    Ashkan

  • Back in my VB6 days, I had to debug someone else's program that used the VB6 Integer (32,767 max value) in a program that had been in place for years and suddenly had trouble.

    Runtime errors were occurring in the application; their database table's PK, which was an int and could support 2+ billion values, was breaking the application because of the programming language's data size limitation.

    I had to change the code to use the VB6 Long instead to address the issue.

    It doesn't happen a lot, but some stuff will grow beyond the expected size years down the road. Smart sizing is good, but balance it with a hefty bit of planning for the unexpected.
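
    The same 16-bit ceiling can be demonstrated on the SQL Server side with smallint (a sketch, standing in for the VB6 Integer):

    SELECT CAST(32767 AS smallint); -- fine: the upper limit of a 16-bit signed integer
    SELECT CAST(32768 AS smallint); -- fails: arithmetic overflow error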

    Lowell


    --Help us help you! If you post a question, make sure you include a CREATE TABLE... statement and an INSERT INTO... statement for that table to give the volunteers here representative data. With your description of the problem, we can provide a tested, verifiable solution to your question! Asking the question the right way gets you a tested answer the fastest way possible!

  • ashkan siroos (5/8/2012)


    Well, I know it is better in general, but what I want to say is that I can't find any major difference between them when the maximum number of rows in the table is 100k. When you have at least 8 GB of RAM installed on your server, which is the lowest config of current servers, why do you need to use 2 bytes instead of 4 bytes?

    It will change 100 MB of data to 100 MB + 200 KB, which is not important at all.

    By the way, generally I absolutely agree with making the DB as small as we can, but this change won't make me happy :D

    I really need a real difference that can cause me any trouble: a speed difference, etc.

    The problem is, pointing to one row and one column and saying "See, that 2-byte difference doesn't matter a whit" isn't the complete answer. It's cumulative. You're moving extra information on & off disk, in & out of memory, in & out of compression in the CPU, etc. And again, the difference is small, but cumulative.

    Further, defining something as a particular type of value also adds implications. For example, if a column has an int data type, is it a good candidate for an index? Yeah, possibly. If a column is a tinyint, is that a good candidate for an index? No, probably not. So what happens when you only ever have ints? You have to take an extra step, in all situations, to investigate the data stored (you need to understand your data when creating indexes anyway).

    And this doesn't just apply to int data types. Why not make all character columns VARCHAR(MAX)? After all, SQL Server will manage them appropriately for storage and everything else if the data isn't actually LOB storage. Shouldn't we also use datetime2 so that all dates automatically go from 1/1/0001 to 12/31/9999? That way we avoid issues with having to modify data structures in the year 2079 if we used smalldatetime. Again, this is only 4 bytes we're discussing. But it all accumulates, so that's why I try to design the structure to the need, rather than take shortcuts.
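
    To make the smalldatetime boundary concrete (a minimal sketch; June 6, 2079 is the documented upper end of the type's range):

    SELECT CAST('2079-06-06' AS smalldatetime); -- fine: within the smalldatetime range
    SELECT CAST('2079-06-07' AS smalldatetime); -- fails: conversion out-of-range error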

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • ashkan siroos (5/8/2012)


    It will change 100 MB of data to 100 MB + 200 KB, which is not important at all.

    That's assuming you have only one case where you use a different data type, and that you have only 100 MB of data.

    Let's say I have a large table with 10 smallint columns (not that unusual) and that table has 100 million rows (not that large).

    That's a difference of 2 GB in one table compared to using int, and 6 GB for bigint. Then think about all the other tables in the database. That's 2 GB extra memory usage, 2 GB extra in every single full backup, extra space in the transaction log, extra space in the log backups, extra time for maintenance, extra time to query, more cache misses on the processor, and more data going across the network.
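
    The arithmetic behind those figures, worked through as a sketch:

    -- 10 columns x 100 million rows x the per-column byte difference
    SELECT CAST(10 AS bigint) * 100000000 * (4 - 2) / 1073741824.0 AS extra_gb_int_vs_smallint,    -- ~1.86 GB
           CAST(10 AS bigint) * 100000000 * (8 - 2) / 1073741824.0 AS extra_gb_bigint_vs_smallint; -- ~5.59 GB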

    On second thoughts, use the biggest data type you can, then call me when things are slow. I need the work.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It will change 100 MB of data to 100 MB + 200 KB, which is not important at all.

    So how many times a day is this extra 200 KB being read from disk? Once? A thousand times? Ten thousand? It's not just about storage; it's also about optimising the flow of data in and out of your system.

    MVDBA

  • GilaMonster (5/8/2012)


    ashkan siroos (5/8/2012)


    It will change 100 MB of data to 100 MB + 200 KB, which is not important at all.

    That's assuming you have only one case where you use a different data type, and that you have only 100 MB of data.

    Let's say I have a large table with 10 smallint columns (not that unusual) and that table has 100 million rows (not that large).

    That's a difference of 2 GB in one table compared to using int, and 6 GB for bigint. Then think about all the other tables in the database. That's 2 GB extra memory usage, 2 GB extra in every single full backup, extra space in the transaction log, extra space in the log backups, extra time for maintenance, extra time to query, more cache misses on the processor, and more data going across the network.

    On second thoughts, use the biggest data type you can, then call me when things are slow. I need the work.

  • I gotta agree with Gail: use the largest data type and call me - I have received a few of these sorts of calls, and it is these sorts of things that make it so easy to look good 🙂
