August 5, 2009 at 9:19 am
Hi all - This may be very simple but I wanted to get other opinions on my challenge. I have a table that contains several million rows. Within the table I have a char(1) NULL column that sits in the middle of the column list. I have a need to expand that column to varchar(2) NULL. A new business rule includes the possibility of a two character code. In the past, simply updating the length of the existing column took a very long time. Would it be faster to add a new column to the end of the table, insert the previous column data, drop the old column, and finally rename the new column?
August 5, 2009 at 9:34 am
The reason for the delay is the data type conversion from char to varchar.
If your table is not too large, meaning less than 50 million rows or so, and you can afford a small maintenance window, I would probably re-create the table CTAS-style (SELECT ... INTO a new table).
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
August 5, 2009 at 9:48 am
If you are using the management tools then this will be very slow, as Management Studio works on the principle of keeping the change backwards compatible (to SQL 7 I think - might be 6.5) and performs the following sequence (roughly sketched in T-SQL after the list):
create table x
bulk insert from y into x
drop y
rename x as y
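A rough, hand-written equivalent of that generated script might look like this (a sketch only; table y, column c, and Tmp_y are placeholder names, and the real SSMS script also carries across constraints, indexes, and permissions):
BEGIN TRANSACTION;
CREATE TABLE dbo.Tmp_y (c varchar(2) NULL);      -- new definition of the column
INSERT INTO dbo.Tmp_y (c)                        -- copy every row into the new table
SELECT c FROM dbo.y;
DROP TABLE dbo.y;                                -- drop the original
EXEC sp_rename 'dbo.Tmp_y', 'y';                 -- give the copy the original name
COMMIT;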
The following is your fastest way:
ALTER TABLE x ALTER COLUMN y varchar(2)
I tested this against 32 million rows and it performed in 1 second.
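(For what it's worth, since the column in the original question is nullable, it may be safer to spell the nullability out; if it is left off, ALTER COLUMN falls back to the session's ANSI null default setting rather than keeping the column's current setting. The table and column names below are placeholders.)
ALTER TABLE dbo.x ALTER COLUMN y varchar(2) NULL;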
MVDBA
August 5, 2009 at 10:05 am
Were the rows populated, or did it have nulls in the char(1) column?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 5, 2009 at 10:09 am
Thank you all for the replies. There is data in the column; however, some values are NULL.
August 5, 2009 at 10:32 am
michael vessey (8/5/2009):
ALTER TABLE x ALTER COLUMN y varchar(2)
I tested this against 32 million rows and it performed in 1 second.
Quick question... was column "Y" a populated char(1) datatype column sitting in the middle of the row?
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
August 5, 2009 at 10:36 am
Paul - Yes to your question. The field in question is populated but is a NULLABLE field and it does sit in the middle of the row.
FYI - I ran the ALTER TABLE ALTER COLUMN statement above in our DEV environment. The comment took 13 minutes for 13 million rows. I watched activity monitor and frequently had PAGEIOLATCH_EX waits.
August 5, 2009 at 10:38 am
Rggg...typing. Command not comment.
August 5, 2009 at 11:37 am
david.tyler (8/5/2009)
Paul - Yes to your question. The field in question is populated but is a NULLABLE field and it does sit in the middle of the row.
FYI - I ran the ALTER TABLE ALTER COLUMN statement above in our DEV environment. The comment took 13 minutes for 13 million rows. I watched activity monitor and frequently had PAGEIOLATCH_EX waits.
Sorry, my question was for Michael... hard to believe your scenario running in 1 second for 32 million rows.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
August 5, 2009 at 12:01 pm
It only takes a second when you are doing something like expanding the maximum size of a varchar field, because none of the actual data rows have to change. If it knew that all rows were null for the char(1), it could create a varchar(x) column and deactivate the old column.
But changing from a char(1) to a varchar, I'm sure pointers would have to be established on the page to support the varchar. It wouldn't necessarily all have to be done at once; maybe it's smart enough to do that on a row-by-row (or page-by-page) basis as data is updated.
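One way to see the difference in practice is to time the two kinds of change side by side on a scratch table; a small sketch (dbo.AlterDemo and the 100,000 row count are made up for illustration):
-- Throwaway table for comparing the two kinds of ALTER.
CREATE TABLE dbo.AlterDemo (id int IDENTITY(1,1) PRIMARY KEY, code char(1) NULL);
INSERT INTO dbo.AlterDemo (code)
SELECT TOP (100000) 'a'
FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b;
GO
SET STATISTICS TIME ON;
-- char(1) -> varchar(2): the slow case from this thread; the value has to move
-- into the variable-length portion of each row, so the cost grows with the table.
ALTER TABLE dbo.AlterDemo ALTER COLUMN code varchar(2) NULL;
-- varchar(2) -> varchar(10): a metadata-only change; existing rows are untouched
-- and the statement returns almost immediately.
ALTER TABLE dbo.AlterDemo ALTER COLUMN code varchar(10) NULL;
SET STATISTICS TIME OFF;
GO
DROP TABLE dbo.AlterDemo;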
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 5, 2009 at 2:36 pm
Follow Up: Thanks again everyone who replied. Here are my findings after testing both solutions:
Sol1: Alter the column with an Alter Table statement
Took 13 minutes to alter 13 million rows.
Sol2: Add a new col, copy data from old col to new, drop old col, rename new col
Took just under 3 minutes for the 13 million rows. Here is the script I used.
ALTER TABLE MyTable ADD newclass VARCHAR(2);
GO
UPDATE MyTable
SET newclass = Class;
GO
ALTER TABLE MyTable DROP COLUMN Class;
GO
EXEC sp_rename 'dbo.MyTable.newclass', 'Class', 'COLUMN';
GO
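(A possible refinement, if the single UPDATE is too heavy on the transaction log, is to copy the values across in batches; a sketch, with the 500,000 batch size as an arbitrary placeholder:)
-- Copy Class into newclass in chunks to keep each transaction small.
WHILE 1 = 1
BEGIN
    UPDATE TOP (500000) MyTable
    SET newclass = Class
    WHERE newclass IS NULL
      AND Class IS NOT NULL;

    IF @@ROWCOUNT = 0 BREAK;   -- nothing left to copy
END;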
August 6, 2009 at 2:24 am
Yes - my column was populated.
I converted it from 'a' to 'bb', 39 million rows.
MVDBA
August 6, 2009 at 2:26 am
Apologies - I went from varchar(1) to varchar(2).
I'd had too much coffee.
MVDBA
August 6, 2009 at 5:54 am
I thought this article by Alok Dwivedi gave a good explanation of how long schema changes are likely to take on big tables:
http://www.sqlservercentral.com/articles/Design+and+Theory/67553/
Tim
August 7, 2009 at 6:24 am
Hey David,
On a large-ish table like this, even for a relatively modest change such as you describe, I would be very tempted to bulk copy the data out (ordered by the cluster key, if any), create a new table (with the clustered index), bulk load the data with the ORDER hint, and then create the non-clustered indexes.
You have to plan it well to ensure you get minimal logging (and things like foreign key relationships can make life harder) but it is an opportunity to ensure the schema will cope with future growth (VARCHAR(10) uses no more storage for the same size data than VARCHAR(2) does!), to defragment the table and indexes, and reclaim any space from dropped columns.
It is often just as quick to do this as to wait for a fully-logged ALTER TABLE statement to complete (unless the operation can be completed by just updating the metadata), and you get a much cleaner table as the end result. In your case, SQL Server must change every row on every page with full logging anyway.
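For anyone who wants to try that route, a rough outline might look like the following; everything here is illustrative - server, database, file path, table, column, and index names are placeholders, and the database should be in the BULK_LOGGED or SIMPLE recovery model during the load for minimal logging.
-- 1) Export the data in clustered key order (run from a command prompt):
--    bcp "SELECT * FROM MyDb.dbo.MyTable ORDER BY Id" queryout C:\temp\MyTable.dat -n -T -S MyServer
-- 2) Create the replacement table with the wider column and its clustered index.
CREATE TABLE dbo.MyTable_new (Id int NOT NULL, Class varchar(10) NULL /* ...remaining columns... */);
CREATE UNIQUE CLUSTERED INDEX CIX_MyTable_new ON dbo.MyTable_new (Id);
GO
-- 3) Reload with the ORDER hint and TABLOCK so the load can be minimally logged.
BULK INSERT dbo.MyTable_new
FROM 'C:\temp\MyTable.dat'
WITH (DATAFILETYPE = 'native', ORDER (Id ASC), TABLOCK);
GO
-- 4) Recreate the nonclustered indexes and constraints, then swap the names.
-- DROP TABLE dbo.MyTable;
-- EXEC sp_rename 'dbo.MyTable_new', 'MyTable';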
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi