are varchar(max) columns always stored out-of-row?

  • Are varchar(max) columns always stored out-of-row or does it depend on how much data is in there?

    Likes to play Chess

  • VARCHAR(MAX) column values will be stored IN the table row, space permitting. So if you have a single VARCHAR(MAX) field and it's 200, 300 byte, chances are it'll be stored inline with the rest of your data. No problem or additional overhead here.

    Only when the entire data of a single row cannot fit on a single SQL Server page (8K) anymore, only then will SQL Server move VARCHAR(MAX) data into overflow pages.

    So all in all, I think you get the best of both worlds - inline storage when possible, overflow storage when necessary.

    Likes to play Chess

  • Thank you!

    Likes to play Chess

  • Actually, best is often to force max values out of row, for every row.  That is generally my preference, although naturally there are a few specific exceptions.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • partially incorrect - reading the manual helps so do read the "Create table" statement on the manuals where you may learn something.

    and having (max) data inrow is in some cases rather bad as it slows down all queries that do not need those (max) columns.

  • frederico_fonseca wrote:

    partially incorrect - reading the manual helps so do read the "Create table" statement on the manuals where you may learn something.

    and having (max) data inrow is in some cases rather bad as it slows down all queries that do not need those (max) columns.

    Heh... there's no need for the snarky comment here Frederico.  Even if one reads Microsoft Documentation, they can get it wrong.  A superb example is the "Best Practice" index maintenance that most of the world has adopted and that most "experts" agree with and then also write about.  It's nearly totally incorrect but it even appears on interview question even though there are strong hints in the documentation that it actually is incorrect in most situations.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I mostly agree with Scott and Frederico on this subject... I almost always force LOBs to be out of row for a table.  I'll add that I also generally default them to default to a single space so that the pointer (lordy, 16 bytes for these) materializes except on the most sparse of columns (might use a different table, then) to also prevent "ExpAnsive" updates, which will cause massive "Bad" page splits, the resulting fragmentation, and some pretty nasty blocking.  If you let the default In-Row LOB thing happen, then the pointers  will be 24 bytes instead of just 16.  In the big picture, not much to worry about but worth mentioning for lot's of rows that would probably go out of row anyway.

    In row LOBs will also cause a problem that I call "Trapped Short Rows", which further exacerbates the problem with Clustered Index Scans and the excessive use of memory that in-row LOBs frequently cause.

    There are also a couple of other undocumented "tricks of the trade" that can be realized by forcing LOBs out of row like having indexes that are "Defragmented by Default" but it's much too long to go into on a forum post.

    Just as a quick "look-see", here's what the  Clustered Index on the "Posts" table of the Stack Overflow 2013 database looks like even after both a REORGANIZE and a REBUILD... permanent fragmentation in the form of Page Density problems caused by the "Trapped Short Row" problem for In-Row LOB storage.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply