Changing type from INT to BIGINT why so slow?

  • I was hoping SQL Server 2016 could handle a change in types.
    With this table of numbers it takes an age to change the type of N to BIGINT.  Since BIGINT is wider than INT, why isn't it just a metadata change?  I guess more is happening under the hood.
    I had to change the type on a table with billions of rows and it took 3 hours!

    As an example:

    INSERT INTO [Numbers] ( N )
        SELECT TOP 10000000 ROW_NUMBER() OVER ( ORDER BY t1.number ) AS N
        FROM master..spt_values t1
           CROSS JOIN master..spt_values t2;

    ALTER TABLE [dbo].[numbers]
    ALTER COLUMN [n] BIGINT NULL;
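
The INSERT example above assumes a Numbers table already exists; its definition isn't shown anywhere in the thread. A minimal guess that keeps the later ALTER valid (a single INT column, no indexes) would be:

    CREATE TABLE [dbo].[Numbers]
    (
        N INT NULL   -- hypothetical definition; only the column name and starting INT type are implied by the thread
    );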

  • leehbi - Monday, July 31, 2017 8:17 AM

    It needs to allocate 4 more bytes per row. That can cause a lot of page splits and blocking.
    You're basically rewriting the whole table to make space for those additional 4 bytes per row (see the sketch after this post for one way to watch the row size grow).

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
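
A minimal sketch of one way to watch that happen, comparing the average row size and page count of dbo.Numbers before and after the ALTER with sys.dm_db_index_physical_stats (the 'SAMPLED' mode keeps the scan reasonably cheap):

    -- Snapshot of row size and page count (index_id 0 = heap, 1 = clustered index).
    SELECT ips.index_id,
           ips.avg_record_size_in_bytes,   -- should grow by roughly 4 bytes after INT -> BIGINT
           ips.page_count                  -- should grow as rows no longer fit on their pages
    FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Numbers'), NULL, NULL, 'SAMPLED') AS ips;

    -- Take the snapshot, widen the column, then take the snapshot again.
    ALTER TABLE [dbo].[Numbers] ALTER COLUMN [N] BIGINT NULL;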
  • Right.  So to avoid one row's value overwriting the next, the data has to move.  I wonder if there's any benefit in doing the moving ourselves, i.e. clear out the table, change the schema, re-load the data (a rough sketch of that idea follows below).
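
A minimal sketch of that clear-out/alter/re-load idea, assuming the single-column dbo.Numbers heap from the first post and a hypothetical dbo.Numbers_Stage holding table. Under the SIMPLE or BULK_LOGGED recovery model the re-load can be minimally logged thanks to the TABLOCK hint:

    -- 1. Park the existing rows in a holding table (table name is hypothetical).
    SELECT N
    INTO   dbo.Numbers_Stage
    FROM   dbo.Numbers;

    -- 2. Empty the original table so the ALTER touches zero rows.
    TRUNCATE TABLE dbo.Numbers;

    -- 3. Change the type while the table is empty.
    ALTER TABLE dbo.Numbers ALTER COLUMN N BIGINT NULL;

    -- 4. Re-load. TABLOCK allows minimal logging into a heap under SIMPLE/BULK_LOGGED recovery.
    INSERT INTO dbo.Numbers WITH (TABLOCK) (N)
    SELECT N FROM dbo.Numbers_Stage;

    DROP TABLE dbo.Numbers_Stage;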

  • I'm not sure if this would work (I've never tried it), but you might want to test it.
    Rebuild the clustered index with a fill factor that leaves enough free space per page for the additional bytes. Since rebuilding indexes should be part of normal maintenance anyway, this shouldn't be a problem. :Whistling:
    After the index rebuild, the table should be alterable without having to split pages.
    That's just what my logic says, but I don't have tests to back it up (a sketch of the idea follows after this post).

    Luis C.
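
A minimal sketch of that fill-factor idea, using hypothetical names (dbo.BigTable with a clustered index CIX_BigTable and a non-key INT column SomeCol being widened; the Numbers example table is actually a heap, and a column that is itself an index key can't be re-typed without dropping the index first). FILLFACTOR = 90 is an arbitrary illustrative value:

    -- Rebuild the clustered index leaving ~10% free space per leaf page,
    -- so each row has room to grow by 4 bytes without page splits (names are hypothetical).
    ALTER INDEX CIX_BigTable ON dbo.BigTable
    REBUILD WITH (FILLFACTOR = 90);

    -- Widen the column while the pages still have free space.
    ALTER TABLE dbo.BigTable ALTER COLUMN SomeCol BIGINT NOT NULL;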
  • Yes, altering a column definition from INT to BIGINT is a change-of-size operation, meaning that every page in the table and every page in every index that carries the column (clustered and non-clustered) needs to be updated. The biggest hit is all the transaction logging. The role of the transaction log is to allow SQL Server to roll back any single transaction to its previous state, so any DDL or DML operation that touches every row ends up writing at least as much log as the size of the table, and maybe 2x or 3x as much. You can mitigate the logging by splitting the process into separate operations: drop the non-clustered indexes before the ALTER COLUMN, then re-create them afterward (a sketch follows after this post).

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • leehbi - Monday, July 31, 2017 9:00 AM

    This CAN be done in a high-performance, minimally logged fashion IF you have the disk space for the duplicate table.  It can also be done without blowing out your MDF file.  I'll try to put an example together for such a thing tonight (the general shape of that duplicate-table swap is sketched after this post).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
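
Jeff's worked example never appears later in the thread, but a minimal sketch of the duplicate-table pattern he's describing might look like the following. It assumes the single-column dbo.Numbers heap from the first post, a SIMPLE or BULK_LOGGED recovery model (so SELECT ... INTO is minimally logged), and a window with no concurrent writers; all object names other than dbo.Numbers are hypothetical:

    -- 1. Copy the data into a new table with the wider type (minimally logged under SIMPLE/BULK_LOGGED).
    SELECT CAST(N AS BIGINT) AS N
    INTO   dbo.Numbers_New
    FROM   dbo.Numbers;

    -- 2. Build any required indexes on dbo.Numbers_New here, before the swap.

    -- 3. Swap the tables by renaming, then drop the old copy once everything checks out.
    BEGIN TRANSACTION;
        EXEC sp_rename 'dbo.Numbers',     'Numbers_Old';
        EXEC sp_rename 'dbo.Numbers_New', 'Numbers';
    COMMIT TRANSACTION;

    -- DROP TABLE dbo.Numbers_Old;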

  • Thanks folks. Interesting food for thought.  I solved the issue with an intermediate table. Thankfully, this operation isn't a frequent task.

  • leehbi - Tuesday, August 1, 2017 3:28 AM

    My apologies... I lost track of this thread.

    Shifting gears, what did you end up doing with the intermediate table?  Renaming it to take the place of the old one or ???  Also, where did you end up creating the intermediate table?  In the same file/filegroup as the original table?  If so, what ended up happening to that file size-wise?  And was your intermediate table created with or without a clustered index?

    Heh... in other words, can you be a bit more specific about what you actually ended up doing with the intermediate table?  Thanks.

    --Jeff Moden


