Problem
When I first developed the FAQ area on
SQLServerCentral.com, it performed wonderfully. After a number of large FAQs
began to be entered in though and as the library grew to over a few hundred
entries, I noticed that the web page would some times display much slower and
often timeout. I tried everything to remedy like more aggressive caching and
query hints. None worked and the frustrated readers continued to e-mail me about
the FAQ area and the Script areas slowness..
Solution
Since we're at a hosting provider, we're limited to what we can do
hardware-wise to remedy this problem. For example, we can't move the text and
image columns to a different file group on a different drive array. So, I
decided to try the new (not so new anymore) TEXT IN ROW feature that ships with
all editions of SQL Server 2000. To understand this feature, you must first
understand how text, ntext and image columns work. Take for example the below
abridged schema:
CREATE TABLE [FAQ2] (
[FAQID] [int] IDENTITY (1, 1) NOT NULL ,
[TITLE] [varchar] (200),
[AUTHORID] [int] NULL ,
[RATING] [numeric](18, 4) NULL,
[FAQBODY] [text] NOT NULL)
Once this table is created, notice how in the Table Properties screen below
how it shows its size as 16 bytes. Text and image columns can store up to 2 GB
of data though. So, why the disparity? First you must remember the total amount
of space that can be stored on a data page in SQL Server, which is 8060 bytes.
Hardly enough room for a 2GB text field! So, instead of storing the text or
image column on the same data page, it simply stores a 16 byte pointer to where
it can be found. This is sometimes why text and image columns perform slower
than a varchar(2000). In our varchar(2000) example, SQL Server never needs to
leave the data page to find its data and perform reads much quicker than with a
text column.
TEXT IN ROW to the rescue! The TEXT IN ROW feature allows you to store
smaller amounts of text and image data onto the data page. The great thing about
this feature is it lets you have your cake and eat it to. It will store data
less than a preset amount on the data page and anything above is still done the
traditional way. It's a great way of solving the issue of an analyst coming to
you and saying they "may" someday need all those text columns, even though
they're only using 200 characters at the most.
To enable the feature, use the SP_TABLEOPTION system stored procedure and
specify the table as shown below:
SP_TABLEOPTION 'FAQ2',
'TEXT IN ROW', 'ON'
Now the feature is turned on and has defaulted to anything less than or equal
to 256 bytes will be stored on the data page. This 256 byte setting is the
optimal setting for the feature. You may want to set this higher nonetheless.
For example, I noticed by running the below query that the average column was
about 600 bytes.
SELECT AVG(DATALENGTH(FAQBODY)) FROM FAQ2
I also determined from the following query that the maximum that this column
was storing was 9,243, a little too much for a varchar field.
SELECT MAX(DATALENGTH(FAQBODY)) FROM FAQ2
So I took the average column length (600) and decided to store all columns
equal to or less than 600 bytes in the data row. You can do this by using the
SP_TABLEOPTION syntax again as shown below:
SP_TABLEOPTION 'FAQ2',
'TEXT IN ROW', '600'
As soon as you do enable, you will see no effect until new rows are inserted
into the table. Most people can't wait this long however so the workaround would
be to run a quick UPDATE statement like the one shown below to move the eligible
items back to the data page:
UPDATE FAQ2
SET FAQBODY = FAQBODY
Any new items are automatically moved if their eligible. You will also know
when you go to Table Properties (shown below), that the table shows the FAQBODY
text column as 600 bytes. Data stored in the data page operates much like a
varchar field. For example, if you try to insert 280 characters of data into the
FAQBODY column, only 280 bytes of space will be used in the row. Anything
greater than 600 bytes stores the 16 byte pointer in the row and the remainder
is stored off the data page.
When you turn the option off by using the below syntax, it may take a long
time for this to finish as it has to move all the data back off the data row.
This is a logged operation so your transaction log may fill up quickly and your
table will be locked while this is occuring.
SP_TABLEOPTION 'FAQ2',
'TEXT IN ROW', 'OFF'
Note : If you have multiple text or image columns in a table that has this
option turned on, SQL Server will try to fit as much as it can on a single data
page. Anything that can't fit will be stored off the data page. Make sure that
the combined size of all the columns does not exceed 8060 bytes to avoid any
problems with this.
Negative Effects of TEXT IN ROW
Before implementing this option in production, make sure you fully test your
application. This extensive testing would be needed to confirm that your
application does not use one of the following text and image statements:
- READTEXT
- UPDATETEXT
- WRITETEXT
This syntax will not work on tables using the TEXT IN ROW option.
Conclusion
We at SQLServerCentral.com have seen a tremendous improvement in the time out
rate of our FAQ area as soon as we turned on this option. The data retrieval of
these columns are at least 50% faster from our initial tests. As a general rule
make sure that you have justified why you're using a text or image columns and
confirm that these actions can't be done through a larger varchar field. Let us
know what you think of text and image column performance by clicking on the Your
Opinion button below.