G’day,
I’ve noted a few instances lately of cases where VARCHAR(MAX) columns have been used unnecessarily, when a small value (for a VARCHAR) would have been fine.
Apart from the storage implications of a VARCHAR(MAX) column there is also the issue that in SQL SERVER 2008, having a column of this type (a LOB) in your table will prevent any ONLINE index rebuilds of the clustered index. And while, a non cluster index on this table will be able to be rebuilt ONLINE – as long as it does not INCLUDE the VARCHAR(MAX) column – or any other LOB column for that matter, the inability to preform ONLINE index operations may be an unintended side effect.
I’m sure that other people have seen this too. For example, a table containing names, where nobody wanted to make a call about a simple thing like the maximum length of the last name, so a VARCHAR(MAX) column has been added to the table – totally unnecessarily!
So let’s just create the table that was mentioned in the paragraph above and have a look at some of the unintended impacts that this will have no ONLINE index operations. Please note that ONLINE operations are only available in the higher versions of SQL SERVER 2008
USE TestDB GO --DROP THE TABLE IF IT EXISTS IF OBJECT_ID('dbo.TestTable', 'U') IS NOT NULL BEGIN PRINT 'DROPPING TABLE '; DROP TABLE dbo.TestTable; END GO --CREATE THE TABLE NOW IT'S DEFINITELY BEEN REMOVED CREATE TABLE dbo.TestTable ( PK_ID INT NOT NULL IDENTITY(1,1), FIRST_NAME VARCHAR(30) NOT NULL, LAST_NAME VARCHAR(MAX) NOT NULL, PHONE VARCHAR(20) ); --Lets add a few rows INSERT INTO dbo.TestTable(FIRST_NAME, LAST_NAME, PHONE) VALUES('ALBERT' , 'EINSTEIN' , '0800 MC2'); INSERT INTO dbo.TestTable(FIRST_NAME, LAST_NAME, PHONE) VALUES('STEVE' , 'JOBS' , '0800 APPLE'); GO
Before we look at the impact on ONLINE index operstions, let’s first make a note of the impact that the above table has on our initial choice of indexes.
First off the LOB column can’t be used as a primary key (or as any index key for that matter) – but let’s try, just to see what happens.
ALTER TABLE dbo.TestTable ADD CONSTRAINT PK_TESTABLE PRIMARY KEY CLUSTERED (LAST_NAME) GO
whoops….
ALTER TABLE dbo.TestTable ADD CONSTRAINT PK_TESTABLE PRIMARY KEY CLUSTERED (PK_ID) GO
OK, so now let’s look at the effect on ONLINE index operations, because this looks like a table that could grow real big, and in a busy database we’ll need access to this all the time – so we don’t want any downtime that will break our SLA’s caused by long running index rebuilds – so we’ll rebuild the index ONLINE.
ALTER INDEX PK_TESTABLE ON dbo.TestTable REBUILD WITH (ONLINE=ON);
Whoops..
Msg 2725, Level 16, State 2, Line 1
An online operation cannot be performed for index ‘PK_TESTABLE’
because the index contains column ‘LAST_NAME’ of data type text,
ntext, image, varchar(max), nvarchar(max), varbinary(max), xml,
or large CLR type. For a non-clustered index, the column could
be an include column of the index. For a clustered index, the
column could be any column of the table. If DROP_EXISTING is used,
the column could be part of a new or old index.
The operation must be performed offline.
OK, so we’ll try to drop the index ONLINE
ALTER TABLE dbo.TestTable DROP CONSTRAINT PK_TESTABLE WITH (ONLINE=ON);
Same error message again.
So, we’ll just cheat and drop the index offline (if we did this on a very large table then the index would be inaccessible while it was being dropped )
ALTER TABLE dbo.TestTable DROP CONSTRAINT PK_TESTABLE;
OK, so now we don’t have a primary key on the table at all, we assume we want to put the original one back, only this time we’ll use an ONLINE operation to do it
ALTER TABLE dbo.TestTable ADD CONSTRAINT PK_TESTABLE PRIMARY KEY CLUSTERED (PK_ID) WITH (ONLINE=ON); GO
and our old friend msg 2725 is back again.
OK, our conclusion here is that we pretty much can’t make any ONLINE alterations to the primary key when we have a LOB column in the table.
So lets see how non clustered indexes hold up.
It’d be kind of good to have a non clustered index on first name and last name, somebody might search on that, so we’ll attempt to create that as a non clustered.
CREATE NONCLUSTERED INDEX NC1_TESTTABLE ON dbo.TestTable(FIRST_NAME , LAST_NAME); GO
and we’re back to the error message
Msg 1919, Level 16, State 1, Line 1
Column ‘LAST_NAME’ in table ‘dbo.TestTable’
is of a type that is invalid for use as
a key column in an index.
But, what if we have a non clustered index that does not include the LOB column, like so
CREATE NONCLUSTERED INDEX NC2_TESTTABLE ON TestTable(FIRST_NAME); GO
Can we re-build that ONLINE
ALTER INDEX NC2_TESTTABLE ON TestTable REBUILD WITH (ONLINE=ON); GO
We sure can.
So, can we get our LOB column to appear in any index at all?
Well, yes, we can use an INCLUDEd INDEX, like so – but first let’s try creating the index ONLINE
CREATE NONCLUSTERED INDEX NC3_TESTTABLE ON TestTable(FIRST_NAME) INCLUDE(LAST_NAME) WITH (ONLINE=ON) GO
back to our old friend, error message 2725
So we’ll just do it as an offline operation
CREATE NONCLUSTERED INDEX NC3_TESTTABLE ON TestTable(FIRST_NAME) INCLUDE(LAST_NAME) WITH (ONLINE=OFF) GO
and it’s successful! Note that we specifically set ONLINE to OFF above, but that’s the default anyway so we don’t have to do it.
In summary, the main point I wanted to make here is that a poor choice of data type at design time (in this case a LOB rather than a small VARCHAR) can have enormous consequences down the track when the amount of data becomes large (or even when it doesn’t)
anyway, if you’re still reading, I hope you enjoyed this.
Have a nice day.
Cheers
Martin.