Text vs. Varchar and the 8060 Limit

  • Hi,

    I know this subject has been discussed before but I have not been able to find a definitive answer to my problem. So if there is a thread with an answer to my problems, please post a link to the thread.

    I'm currently in the process of converting data into a new SQL 2K database. I need to store "unlimited" text values in 3 separate columns in the same table. The current values of these 3 columns average 9000+ characters with the maximum values for each column being 11,000+ characters and these columns will continue to grow over time. These columns can not be broken up in any way. They are not like customer notes where I can break the notes up by using a separate table and use a date or identity column on the record. So that has really restricted the design ideas I came up with.

    I've read in many places about the maximum record size in SQL 2k being 8060. So why is there a datatype of Text which can store 2^31 characters and it's variable length? So, it's really a much larger varchar datatype. No? That is MORE than enough size for my 3 columns. Can I simply create 3 columns in the record with a datatype of Text? Can I put them all in the same "parent" table or should I spin them each off into their own separate "child" table? Or is that just too simplified? How are these Text columns stored physcially on disk? Pointers?

    What are the drawbacks to using the Text datatype? Are there any "gotch ya" to watch out for when using Text columns? I'd rather know now than 6 months after the app has been in production.

    Oh, and SQL 2005 is NOT an option.....so I've been told.

    And in case it matters, the front-end is VB .NET using .NET 1.1. And the users will only be able to view these large fields by clicking on a separate button. For performance reasons, these large text columns will not be retrieved with the rest of the record data.

    Sorry for such a long post. I'd appreciate any feedback/suggestions.

    Jason

  • Too bad I don't have the books online at hand at the moment. The text field like an image/binary field. It is stored separately from the record.

    Have a look at this article.

  • The text datatype works quite a bit differently from varchar in SLQ Server 2000. The method it uses to store text columns is to store the actual text elsewhere, and only store a pointer to this location in the actual row itself (you can make exceptions for small text data with the "text in row" option, but that doesn't seem to fit your requirements anyway. The main drawback is that working with text datatypes is not the same as working with varchars. You can't do things like "WHERE txtField LIKE '%Smith%'" with a text column, for instance.

    If the maximum length of any of them is really in the 11,000 range, I'd probably split them into two varchars, say the first 6000 chars in one column in one table, and the rest in a second column in a different table, with some sort of identifier tying them together (even a linked list would work if you don't have ids on this data). I know you say that won't work in your situation, but without a better understanding of the reason why, it's still the method I'd probably push in this situation.

  • TEXT datatype can store more than the allowable row length because the data is stored outside the table.  Only a 16 byte pointer to the page where the data can be found is stored in each row.  You can have more than one TEXT column in a table.

    The drawback to using TEXT is that every time the data is accessed SQL Server will have to perform additional page reads to retrieve the data.  There is an option called text in row that allows storage of up to 7000 bytes in the row itself, with any additional data stored in seperate pages.

    Look at 'Managing ntext, text, and image data' in BOL.

    Greg

    Greg

  • The only reason the fields can't be broken into multiple columns is because the users peform many ad-hoc queries/reports and they don't want to have to remember to query multiple columns. For me, it doesn't matter. I can make it appear as 1 record/column in the front end app. But you make a very good point about not being able to search text columns using LIKE '%smith%'. That will just not fly with the users and make the front app somewhat less capable. I've seen users on occassion search for one single word! ugh! So I'm leaning towards the use of multiple large varchar values, as David suggested.

    If I'm forced to use a Text column, would I have to write my own db function to parse through the text column to perform searches?

    So to understand the 16 byte pointer concept.....if I create 3 text columns in the table, all 3 columns would have a 16 byte pointer to potentially other data pages containing the data....we'll ignore the whole 'text in row' option for now. So with these 3 x 16 byte columns, this greatly reduces the record size and obviously does not exceed the magical 8060 max record size. So the data pages referenced by a retrieved record, via the pointers, are NOT considered in the total record size? But the data will be retrieved if selected in the query for that record? And I won't receive any "exceeded maximum record size" errors using this design? And obviously, I'll be taking a performance hit for the extra read data pages.

    Thanks David, Greg with the tips on the 16 byte pointers and the 'text in row' option. I'll definitely check them out.

    I've just "acquired" a copy of Inside SQL 2K, so I'll see what Kalen has to offer. And I'll read further up in BOL too.

    Many thanks!

  • I think I may have found my answer regarding searching text columns using SQL FTS (Full Text Search).

    Can anyone tell my what kind of performance hit the server would take when FTS is turned on? I realize it would depend on the different options used but in general terms, is there a lot of overhead to run FTS?

    Jason

  • You can't do things like "WHERE txtField LIKE '%Smith%'" with a text column, for instance.

    That's not true, you can use LIKE and IS NULL on a text column, you can't use any other operators. Full text gives a lot more powerful search options. Look in Bol for Contains and Freetext

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the post!

    The only reason I was not going to use a Text column was because of the limited ability to search the column without having to use FTS. I definitely have some testing in my future.

    Thanks for all your advice.

    Much appreciated!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply