Changing type from INT to BIGINT why so slow?

  • I was hoping SQL Server 2016 could handle a change in types.
    With this table of numbers it takes an age to change the type of N to BIGINT.  Since BIGINT is wider than INT, why isn't it just a metadata change?  I guess more is happening under the hood.
    I had to change the type on a table with billions of rows and it took 3 hours!

    As an example:

    INSERT INTO [Numbers] ( N )
        SELECT TOP 10000000 ROW_NUMBER() OVER ( ORDER BY t1.number ) AS N
        FROM master..spt_values t1
           CROSS JOIN master..spt_values t2;

    ALTER TABLE [dbo].[numbers]
    ALTER COLUMN [n] BIGINT NULL;
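
The INSERT example above assumes a Numbers table already exists; its definition isn't shown anywhere in the thread. A minimal guess that keeps the later ALTER valid (a single INT column, no indexes) would be:

    CREATE TABLE [dbo].[Numbers]
    (
        N INT NULL   -- hypothetical definition; only the column name and starting INT type are implied by the thread
    );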

  • leehbi - Monday, July 31, 2017 8:17 AM

    It needs to allocate 4 more bytes per row. That can cause a lot of page splits and blocking.
    You're basically rewriting the whole table to make space for those additional 4 bytes per row (see the sketch after this post for one way to watch the row size grow).

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
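
A minimal sketch of one way to watch that happen, comparing the average row size and page count of dbo.Numbers before and after the ALTER with sys.dm_db_index_physical_stats (the 'SAMPLED' mode keeps the scan reasonably cheap):

    -- Snapshot of row size and page count (index_id 0 = heap, 1 = clustered index).
    SELECT ips.index_id,
           ips.avg_record_size_in_bytes,   -- should grow by roughly 4 bytes after INT -> BIGINT
           ips.page_count                  -- should grow as rows no longer fit on their pages
    FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Numbers'), NULL, NULL, 'SAMPLED') AS ips;

    -- Take the snapshot, widen the column, then take the snapshot again.
    ALTER TABLE [dbo].[Numbers] ALTER COLUMN [N] BIGINT NULL;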
  • Right.  So to avoid one row's value overwriting the next, the data has to move.  I wonder if there's any benefit in doing the moving ourselves, i.e. clear out the table, change the schema, re-load the data (a rough sketch of that idea follows below).
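
A minimal sketch of that clear-out/alter/re-load idea, assuming the single-column dbo.Numbers heap from the first post and a hypothetical dbo.Numbers_Stage holding table. Under the SIMPLE or BULK_LOGGED recovery model the re-load can be minimally logged thanks to the TABLOCK hint:

    -- 1. Park the existing rows in a holding table (table name is hypothetical).
    SELECT N
    INTO   dbo.Numbers_Stage
    FROM   dbo.Numbers;

    -- 2. Empty the original table so the ALTER touches zero rows.
    TRUNCATE TABLE dbo.Numbers;

    -- 3. Change the type while the table is empty.
    ALTER TABLE dbo.Numbers ALTER COLUMN N BIGINT NULL;

    -- 4. Re-load. TABLOCK allows minimal logging into a heap under SIMPLE/BULK_LOGGED recovery.
    INSERT INTO dbo.Numbers WITH (TABLOCK) (N)
    SELECT N FROM dbo.Numbers_Stage;

    DROP TABLE dbo.Numbers_Stage;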

  • I'm not sure if this would work (I've never tried it), but you might want to test it.
    Rebuild the clustered index with a fill factor that leaves enough free space per page for the additional bytes. Since rebuilding indexes should be part of normal maintenance anyway, this shouldn't be a problem. :Whistling:
    After the index rebuild, the table should be alterable without having to split pages.
    That's just what my logic says, but I don't have tests to back it up (a sketch of the idea follows after this post).

    Luis C.
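
A minimal sketch of that fill-factor idea, using hypothetical names (dbo.BigTable with a clustered index CIX_BigTable and a non-key INT column SomeCol being widened; the Numbers example table is actually a heap, and a column that is itself an index key can't be re-typed without dropping the index first). FILLFACTOR = 90 is an arbitrary illustrative value:

    -- Rebuild the clustered index leaving ~10% free space per leaf page,
    -- so each row has room to grow by 4 bytes without page splits (names are hypothetical).
    ALTER INDEX CIX_BigTable ON dbo.BigTable
    REBUILD WITH (FILLFACTOR = 90);

    -- Widen the column while the pages still have free space.
    ALTER TABLE dbo.BigTable ALTER COLUMN SomeCol BIGINT NOT NULL;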
  • Yes, altering a column definition from INT to BIGINT is a change-of-size operation, meaning that every page in the table and every page in every index that carries the column (clustered and non-clustered) needs to be updated. The biggest hit is all the transaction logging. The role of the transaction log is to allow SQL Server to roll back any single transaction to its previous state, so any DDL or DML operation that touches every row ends up writing at least as much log as the size of the table, and maybe 2x or 3x as much. You can mitigate the logging by splitting the process into separate operations: drop the non-clustered indexes before the ALTER COLUMN, then re-create them afterward (a sketch follows after this post).

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • leehbi - Monday, July 31, 2017 9:00 AM

    This CAN be done in a high-performance, minimally logged fashion IF you have the disk space for the duplicate table.  It can also be done without blowing out your MDF file.  I'll try to put an example together for such a thing tonight (the general shape of that duplicate-table swap is sketched after this post).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
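
Jeff's worked example never appears later in the thread, but a minimal sketch of the duplicate-table pattern he's describing might look like the following. It assumes the single-column dbo.Numbers heap from the first post, a SIMPLE or BULK_LOGGED recovery model (so SELECT ... INTO is minimally logged), and a window with no concurrent writers; all object names other than dbo.Numbers are hypothetical:

    -- 1. Copy the data into a new table with the wider type (minimally logged under SIMPLE/BULK_LOGGED).
    SELECT CAST(N AS BIGINT) AS N
    INTO   dbo.Numbers_New
    FROM   dbo.Numbers;

    -- 2. Build any required indexes on dbo.Numbers_New here, before the swap.

    -- 3. Swap the tables by renaming, then drop the old copy once everything checks out.
    BEGIN TRANSACTION;
        EXEC sp_rename 'dbo.Numbers',     'Numbers_Old';
        EXEC sp_rename 'dbo.Numbers_New', 'Numbers';
    COMMIT TRANSACTION;

    -- DROP TABLE dbo.Numbers_Old;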

  • Thanks folks. Interesting food for thought.  I solved the issue with an intermediate table. Thankfully, this operation isn't a frequent task.

  • leehbi - Tuesday, August 1, 2017 3:28 AM

    My apologies... I lost track of this thread.

    Shifting gears, what did you end up doing with the intermediate table?  Renaming it to take the place of the old one or ???  Also, where did you end up creating the intermediate table?  In the same file/filegroup as the original table?  If so, what ended up happening to that file size-wise?  And was your intermediate table created with or without a clustered index?

    Heh... in other words, can you be a bit more specific about what you actually ended up doing with the intermediate table?  Thanks.

    --Jeff Moden


