How to convert all PK columns in database from numeric(12,0) to int

  • I am wondering if there is a way to convert a huge number of numeric PK and FK columns to int in one big action.

    We have a database that has a PK numeric(12,0) column as a primary key for each table. All foreign key relations are referring to these columns. All foreign keys have been indexed. All tables are clustered on the primary key (this is not optimal but this is a different issue). I know that since int is machine native and numeric has to be converted all the time, int is quicker. Moreover int takes less storage. From small tests I have estimated the storage would decrease by at least 10%. I would not know what the performance gain is or how to estimate it.

    Besides the FK indexes and the PK clustered indexes there are more indexes also including the FK columns. These will also be affected by the conversion.

    Does someone have experience with such conversions? I can imagine that the performance improvement would not be worthwile. The database serves a multi user planning application where the most load is data selection.

    Any advice is greatly appreciated.

    Regards,

    Bouke

  • That would be a huge amount of work. You would need to:

    Drop all indexes

    Drop all foreign keys

    Drop all primary keys

    Convert all columns in all tables, one at a time (ALTER TABLE ... ALTER COLUMN) making sure you don't fill the tran log in the process

    Recreate all primary keys

    Recreate all foreign keys

    Recreate all indexes

    Personally I'd never consider doing that. There's little to no performance difference between int and numeric, if both sides of any joins are numeric then there are no conversions taking place, they're roughly the same size (5 bytes vs 4). It would be a massive amount of work for little to no gain

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I agree with Gail. There's probably not enough ROI to merit all the work or to offset how much disk space is being used. How big is your database now and how big do you think it will be a year from now?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GilaMonster (6/9/2012)


    That would be a huge amount of work. You would need to:

    Drop all indexes

    Drop all foreign keys

    Drop all primary keys

    Convert all columns in all tables, one at a time (ALTER TABLE ... ALTER COLUMN) making sure you don't fill the tran log in the process

    Recreate all primary keys

    Recreate all foreign keys

    Recreate all indexes

    Personally I'd never consider doing that. There's little to no performance difference between int and numeric, if both sides of any joins are numeric then there are no conversions taking place, they're roughly the same size (5 bytes vs 4). It would be a massive amount of work for little to no gain

    The NUMERIC(12,0) datetype uses 9 bytes, not 5. NUMERIC 1 through 9 is 5 bytes.

    http://msdn.microsoft.com/en-us/library/ms187746(v=sql.100).aspx

    However, it's probably not worth the effort for any performance gain.

    This sounds like an application that was originally designed for ORACLE. I've seen NUMERIC(12,0) used in a number of ORACLE databases.

  • Thanks for your answers and indeed this application was designed on Oracle originally. There are numerous instances of this database around ranging from 10G to 50G (but still growing) so the storage decrease only would not be a valid argument to change the types. If performance would improve notably we would consider the conversion.

    The actual conversion might be less work than indicated by Gila. The database definitions are stored in an xml model. We have an update tool that automatically detects any changes between the database and the model and changes the database accordingly. Dropping indexes, FKs and PKs might have to be done separately but the update tool will then correctly re-create all objects. Changing the type in the xml is a textual find replace.

    By the way I thought that since numeric is not native the CPU has to convert it before knowing how to deal with it. A join on 2 numeric indexed columns would required conversion from numeric to the native format for both columns. Int is native and requires no conversion. Together with the fact that int requires less bytes and comparison is thus quicker, this lead me to think there might be performance improvement.

  • Bouke Bruinsma (6/10/2012)


    By the way I thought that since numeric is not native the CPU has to convert it before knowing how to deal with it.

    Heh????

    If you're talking about the CPU word, that's 64-bit wide on 64 bit, 32-bit wide on 32-bit CPUs and not particularly relevant to SQL.

    XML model or no XML model, this is way more work than I'd ever consider for minimal to no gains.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Bouke Bruinsma (6/10/2012)


    Together with the fact that int requires less bytes and comparison is thus quicker, this lead me to think there might be performance improvement.

    Have you actually done a test in that area? You may be grossly disappointed if you do.

    The databases that you're talking about (10GB to 50GB) are relatively small. According to your own estimate of 10% savings, that means that your databases MIGHT drop in size to 9GB and 45GB. Even with your magic XML tool (I'd like to know what that is, by the way... appreciate it), what would be the ROI? Slightly faster automated backup and maintenance code? Slightly (almost imperceptable) faster queries? Smaller footprint on relatively inexpensive and readily available media? Even If you had ten such databases or the savings were 20%, I'd say to carefully examine the ROI and really determine if it's going to be worth it.

    Just to confirm what the others have said, although using Numeric(12,0) isn't normally what's done on new databases, it is a frequent result from Oracle conversions and it is a standard datatype in SQL Server. And, yes, it takes 9 bytes instead of 5. That may change your 10% estimate quite a bit. In either case, you should really make sure the ROI of such a change is really going to be worth it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My goal was to find out if such a conversion would benefit performance enough to consider it. But given the comments my view is that there is not enough reason to consider it.

    I have not done any tests for performance but some for storage as that is quite easy. These tests show that at least 10% will be saved but probably more. I had come across some recomendations for database design that all advise to keep index columns as narrow as possible.

    The tool is a company internal tool that cannot be shared unfortunately. It is quite useful as it can extract a database script and apply it for upgrades.

  • Yes, clustered index columns should be as narrow as possible, 9 bytes is not wide. Even with a nice tool to do the work, you still have the time it will take (and on non-trivial databases that can easily be hours or days) and you still have to make sure that the log doesn't grow too large and that nothing fails and that you have a large enough maintenance window to do the change.

    You'd also have to change any procedures that have parameters that are used with the pk/fk columns and/or any front end apps that do direct queries otherwise you will get implicit conversions in queries and those will hurt performance badly.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply