varchar(max) or varchar(some large number)?

  • I'm creating a table where some columns accept free text from users (for things like summaries, descriptions, etc.). Now, I could declare all those columns as varchar(max) and be done with it, but in reality it's very unlikely a user will type more than 1000 characters in those fields, so it would probably work if I declare the columns varchar(1000).

    My question is: what is the difference between the two designs? In both cases the storage size is 2 bytes plus the actual size, so why bother trying to guess the correct upper bound (which in most cases is way more than needed)? I know varchar has a limit of 8000, but my question applies only to those cases where you know the input will never exceed that number. Is there a performance hit when the columns are declared as varchar(max)?

  • Varchar(max) is a replacement for old-style LOB type text.

    If you set 'large value types out of row' to OFF with sp_tableoption you will end up having 8000 bytes stored directly in the table, but if you set it to ON, the table will store only 16 bytes and treat the rest as a LOB.

    If your data size doesn't exceed 8000 bytes, on the performance side you shouldn't suffer compared to varchar(8000).

    If your data exceeds 8000 bytes or you set large value types out of row to ON, ther will be an extra read to gather the information from the LOB.

    Please note that I have no direct experience of this, I'm only reporting what I have found around (BOL + Google). I've never used varchar(max) to store big strings in a table, I still use text/ntext, so you'd better wait for somebody with direct experience to answer this.

    What I can certainly say is that text/ntext needs an extra read in the LOB.

    Reagards

    Gianluca

    -- Gianluca Sartori

  • I do know that there is a rather significant difference in performance between VARCHAR(somenumber) AND VARCHAR(MAX) even if you never get past 8k with the VARCHAR(MAX) for things like Tally table splits. I've not tried the "in row" thing.

    Let me setup a simple test and we'll find out if it affects simple SELECTs.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here folks... give it a try...

    DECLARE @SomeVarChar VARCHAR(1000)

    SELECT @SomeVarChar = REPLICATE('123456789 ',100)

     

    PRINT '===== Create VARCHAR table ==========================================='

    SET STATISTICS TIME ON

    SELECT TOP 100000

    SomeID = IDENTITY(INT,1,1),

    SomeVarChar = CAST(@SomeVarChar AS VARCHAR(1000))

    INTO #TestTable1

    FROM Master.dbo.SysColumns t1

    CROSS JOIN Master.dbo.SysColumns t2

    SET STATISTICS TIME OFF

     

    PRINT '===== Create VARCHARMAX table ========================================'

    SET STATISTICS TIME ON

    SELECT TOP 100000

    SomeID = IDENTITY(INT,1,1),

    SomeVarCharMAX = CAST(@SomeVarChar AS VARCHAR(MAX))

    INTO #TestTable2

    FROM Master.dbo.SysColumns t1

    CROSS JOIN Master.dbo.SysColumns t2

    SET STATISTICS TIME OFF

     

    PRINT '===== Add PK to VARCHAR table ========================================'

    SET STATISTICS TIME ON

    ALTER TABLE #TestTable1

    ADD PRIMARY KEY CLUSTERED (SomeID)

    SET STATISTICS TIME OFF

     

    PRINT '===== Add PK to VARCHARMAX table ====================================='

    SET STATISTICS TIME ON

    ALTER TABLE #TestTable2

    ADD PRIMARY KEY CLUSTERED (SomeID)

    SET STATISTICS TIME OFF

     

    PRINT '===== Simple SELECT on VARCHAR ======================================='

    SET STATISTICS TIME ON

    SELECT SomeID, SomeVarChar

    FROM #TestTable1

    WHERE SomeID >= 99000

    SET STATISTICS TIME OFF

     

    PRINT '===== Simple SELECT on VARCHARMAX ===================================='

    SET STATISTICS TIME ON

    SELECT SomeID, SomeVarCharMAX

    FROM #TestTable2

    WHERE SomeID >= 99000

    SET STATISTICS TIME OFF

     

    --===== HouseKeeping

    DROP TABLE #TestTable1,#TestTable2

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Odd. Just ran that code against a SQL Server 2008 Express Edition database, and the VARCHAR(MAX) runs generally had higher CPU time but lower elapsed time than the VARCHAR(x) ones...

  • paul.knibbs (8/18/2009)


    Odd. Just ran that code against a SQL Server 2008 Express Edition database, and the VARCHAR(MAX) runs generally had higher CPU time but lower elapsed time than the VARCHAR(x) ones...

    I would be tempted to protect the CPU... but... you may also want to SET STATISTICS IO ON and see what's going on there, as well. As you know, both CPU and IO are precious recourses.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Here's the results from my machine (Quad-Core Intel Q9550 @ 2.83GHz, running Vista Ultimate 64-bit (SP2) and SQL Server 2005 Developer Edition 64-bit (SP3)):

    ===== Create VARCHAR table ===========================================

    SQL Server Execution Times:

    CPU time = 156 ms, elapsed time = 184 ms.

    (100000 row(s) affected)

    ===== Create VARCHARMAX table ========================================

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 285 ms.

    (100000 row(s) affected)

    ===== Add PK to VARCHAR table ========================================

    SQL Server parse and compile time:

    CPU time = 25 ms, elapsed time = 25 ms.

    SQL Server Execution Times:

    CPU time = 497 ms, elapsed time = 162 ms.

    SQL Server Execution Times:

    CPU time = 528 ms, elapsed time = 186 ms.

    ===== Add PK to VARCHARMAX table =====================================

    SQL Server parse and compile time:

    CPU time = 25 ms, elapsed time = 25 ms.

    SQL Server Execution Times:

    CPU time = 890 ms, elapsed time = 248 ms.

    SQL Server Execution Times:

    CPU time = 922 ms, elapsed time = 272 ms.

    ===== Simple SELECT on VARCHAR =======================================

    (1001 row(s) affected)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 79 ms.

    ===== Simple SELECT on VARCHARMAX ====================================

    (1001 row(s) affected)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 123 ms.

    Steve

    (aka smunson)

    :-):-):-)

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply