Converting text to varchar data types

  • There is a table tblA with around 2.5 million rows in which one column ("comment") has the text data type.

    Since no record has a comment longer than 8K, I created a new table tblA_new (a copy of the original) in which text was replaced by the varchar(8000) data type. The new table has exactly the same indexes and FKs as the original one.

    First of all, I was confused when DBCC SHOWCONTIG showed that after converting text to varchar(8000) the number of Pages Scanned increased six-fold (30K for the original table, 184K after the conversion). Actually, I am not sure how reliable that figure is, since from BOL we know that "DBCC SHOWCONTIG does not display data with ntext, text, and image data types". But what is more surprising is that a query that does a Clustered Index Scan of tblA_new now takes 5-6 times longer (184K logical reads) than the same query against the old tblA (30K logical reads).

    Why is that? I mean, the number of scanned pages increased and the query runs slowly (probably as a result of that increase?). Any help will be appreciated. Thanks.
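
    For concreteness, the length check and the page-count comparison were presumably along these lines (a sketch; the exact statements are an assumption, table and column names as above):

    -- text columns cannot be passed to LEN in older versions of SQL Server;
    -- DATALENGTH works and, for single-byte text data, equals the character count
    SELECT MAX(DATALENGTH(comment)) AS max_comment_bytes FROM tblA;

    -- compare allocated pages before and after the conversion
    DBCC SHOWCONTIG ('tblA');
    DBCC SHOWCONTIG ('tblA_new');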

  • Table size went up because instead of just storing text pointers, you are now storing the data itself... that, of course, takes extra room. In fact, you now have 1 row per page instead of the possibility of many rows per page... that would also explain the number of scans going up because you simply have fewer rows per page than you did before.
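
    The storage difference is controllable, too: by default the data row holds only a 16-byte pointer and the text itself sits on separate LOB pages. The standard 'text in row' table option moves small values onto the data page; shown here just to illustrate the mechanism, not as a recommendation:

    -- with 'text in row' ON, text values up to 256 bytes (or an explicit
    -- limit between 24 and 7000 bytes) are stored on the data page itself
    EXEC sp_tableoption 'tblA', 'text in row', 'ON';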

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    What you explained makes perfect sense and I thought about it.

    But what confuses me is that the table size did not increase after the conversion (I just double-checked using sp_spaceused: it is around 1.6 GB for both tables). On the other hand, 1.6 GB should require around 200K pages, which is close to the 185K of the new table but certainly not to the 30K of the original one.

    I made another test: I created a new table (using SELECT TOP 1000000 ... INTO), i.e. this table contains only part (40%) of the original 2.5M rows, and I kept the text column. DBCC SHOWCONTIG reports 58.5K pages. How can a subset of the records (1M) allocate more pages (58.5K) than all of them (2.5M -> 30K)?
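
    The test, roughly (tblA_1M is a hypothetical name; SELECT ... INTO copies the text column along with the rest):

    SELECT TOP 1000000 * INTO tblA_1M FROM tblA;
    DBCC SHOWCONTIG ('tblA_1M');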

    Anyhow, thanks for your help.

  • sp_SpaceUsed may not tell the whole story unless you include a second parameter... see BOL for more info but here's the basic syntax...

    Syntax

    sp_spaceused [[@objname =] 'objname'] [, [@updateusage =] 'updateusage']
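
    For example (a minimal usage sketch; passing 'true' forces the space figures to be recomputed from the allocation information rather than taken from possibly stale counts):

    EXEC sp_spaceused @objname = 'tblA', @updateusage = 'true';
    EXEC sp_spaceused @objname = 'tblA_new', @updateusage = 'true';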

    --Jeff Moden



  • Actually, I checked with @updateusage = 'true'.

    Thanks

  • Then, I don't know...

    --Jeff Moden



  • Hi,

    DBCC SHOWCONTIG shows the fragmentation in the table; why are you checking it for space used?

    "More Green More Oxygen !! Plant a tree today"

  • For space used I actually used sp_spaceused. I used DBCC SHOWCONTIG to find the number of allocated pages. If a query requires a Clustered Index Scan, execution time depends on that number (if I am correct), i.e. more pages means a longer run. Thanks
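
    That correlation is easy to confirm with SET STATISTICS IO (a standard session option); a quick sketch, where the INDEX(1) hint is assumed only to force the clustered index scan:

    SET STATISTICS IO ON;
    -- the 'logical reads' figure reported for this scan should track the
    -- page counts DBCC SHOWCONTIG shows for each table
    SELECT COUNT(*) FROM tblA_new WITH (INDEX(1));
    SET STATISTICS IO OFF;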

  • Why should the table size change after you move from text to varchar? The total data is the same.

    What has happened, as Jeff says, is that you now typically have 1 row per page, with each row including your Comment varchar data. Previously you had many rows per page, with your Comment text data held separately. The difference in how your data is stored accounts for the performance differences you see.

    If most of your business queries need the Comment data, it may be best to hold it as varchar. If very few queries need the Comment data then it is likely to be best to hold it as text.
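
    One way to inform that choice is to look at how large the values actually are; a hedged sketch (the 256-byte threshold is just an illustrative cut-off, matching the default 'text in row' limit):

    SELECT COUNT(*) AS total_rows,
           SUM(CASE WHEN DATALENGTH(comment) > 256 THEN 1 ELSE 0 END) AS wide_rows
    FROM tblA;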

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara
