CHAR vs VARCHAR

  • Just a quick thought... I know you said you can't normalize the table, but is that because of restrictions placed on you by the application that interacts with it? If so, could you perhaps normalize the data into one or more tables with new names and create a view that presents all of the current data back to the application just as it is now?
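
    A minimal sketch of what I mean, with every name made up since the real schema hasn't been posted: rename the wide table out of the way, split the repeating SN** values into a child table, and put a view with the old name in front of the application.

    exec sp_rename 'dbo.Widget', 'WidgetBase'
    go
    -- Hypothetical child table holding the formerly repeating SN** values
    create table dbo.WidgetSerial (
        WidgetID int not null,
        SlotNo tinyint not null,            -- which SN** slot the value came from
        SerialValue varchar(100) not null,
        constraint PK_WidgetSerial primary key (WidgetID, SlotNo))
    go
    -- The view keeps the original name and shape, so the application sees no change
    create view dbo.Widget
    as
    select b.WidgetID,
           s01.SerialValue as SN01,
           s02.SerialValue as SN02          -- ...and so on through SN60
    from dbo.WidgetBase b
    left join dbo.WidgetSerial s01 on s01.WidgetID = b.WidgetID and s01.SlotNo = 1
    left join dbo.WidgetSerial s02 on s02.WidgetID = b.WidgetID and s02.SlotNo = 2
    go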

    -Luke.


  • Would vertical partitioning be of any benefit, since you cannot normalise the data? If columns SN10-SN60 are empty for most rows, then you might not need a row in this table and could reconstruct the data with a left outer join in a view.

    I'm sure it is not as simple as that but it's a thought.
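
    A rough sketch of the idea (all names are made up; the real table isn't shown here): move the mostly empty SN columns to a side table that only has rows where those values actually exist, and rebuild the original shape in a view.

    -- Hypothetical side table: one row only when a widget actually has SN10..SN60 data
    create table dbo.WidgetSparse (
        WidgetID int not null primary key,
        SN10 varchar(100) null,
        -- ...SN11 through SN59...
        SN60 varchar(100) null)
    go
    -- dbo.WidgetDense stands in for the remaining, always-populated part of the table
    create view dbo.WidgetFull
    as
    select d.WidgetID, d.SN01, /* ...dense columns... */
           s.SN10, /* ... */ s.SN60
    from dbo.WidgetDense d
    left outer join dbo.WidgetSparse s on s.WidgetID = d.WidgetID
    go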


    J

  • Played around with this a bit and must say that it's all very strange. It seems the space cannot be recovered with any kind of standard manipulation in Query Analyzer. Also, a char(N) column occupies N bytes even if it is null.

    I had the table

    create table b(
        s01 char(100) null,
        s02 char(100) null,
        s03 char(100) null,
        s04 char(100) null,
        s05 char(100) null)

    and stuffed 1 byte into each column generating 50000 records. This caused my database to go from 1024K to 31488K.

    I then did

    alter table b alter column s01 varchar(100) null
    alter table b alter column s02 varchar(100) null
    alter table b alter column s03 varchar(100) null
    alter table b alter column s04 varchar(100) null
    alter table b alter column s05 varchar(100) null

    which increased the database to 61696K (apart from taking a while). I did a shrink and got it down to 58856K.

    I then did

    select * into b2 from b
    drop table b

    followed by a shrink and got it down to 31488K - which is truly amazing.

    Starting again from the original char table, I tried creating the table

    create table b2(
        s01 varchar(100) null,
        s02 varchar(100) null,
        s03 varchar(100) null,
        s04 varchar(100) null,
        s05 varchar(100) null)

    followed by

    insert into b2 select * from b
    drop table b

    followed by a shrink, and it was still 31488K.

    The only way I found to recover the space was to export b and import it into b2, which I did via an external script using ADO. That got it down to 1500K, which is what I expected.
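
    A likely explanation, consistent with the trimming advice later in this thread: insert into b2 select * from b copies the trailing pad spaces out of the char columns, so each varchar value still stores all 100 characters. A quick check of that theory, as a sketch using the same throwaway tables:

    insert into b2 select * from b
    select top 1 datalength(s01) from b2   -- 100: the pad spaces came along

    -- Trimming on the way in stores only the real data
    truncate table b2
    insert into b2
    select rtrim(s01), rtrim(s02), rtrim(s03), rtrim(s04), rtrim(s05) from b
    select top 1 datalength(s01) from b2   -- 1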

  • Ok, I'm getting a little irritated with this forum. Why is it that if you take too long to write your post, it clears your entry when you try to post it, and when you try to go back the message is gone as well? I had a couple of paragraphs written about my problem and now it's gone! I know all of you are disappointed. (ha ha)

    Anyway, my problem is solved!! Here's what I did.

    1. ran LTRIM(RTRIM()) on all 120 SN** columns - no change yet

    2. ran UPDATE STATISTICS & DBCC UPDATEUSAGE - no change yet

    3. ran DBCC INDEXDEFRAG - removed about 80% of the pages

    Now the table is about 20% of its previous size. I'm guessing that the table was just too wide, and testing on one column wasn't enough to reduce the size because of how the rows pack into pages.
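
    Roughly those steps in T-SQL, as a sketch only; the real table, column, and index names never appear in this thread, so everything below is a placeholder:

    -- 1. Trim each SN** column (only two shown)
    update dbo.Widget
    set SN01 = ltrim(rtrim(SN01)),
        SN02 = ltrim(rtrim(SN02))       -- ...and so on for all 120 columns

    -- 2. Refresh statistics and the space-usage metadata
    update statistics dbo.Widget
    dbcc updateusage (0, 'dbo.Widget')

    -- 3. Compact the pages of the clustered index (index ID 1)
    dbcc indexdefrag (0, 'dbo.Widget', 1)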


    Live to Throw
    Throw to Live
    Will Summers

  • I just make sure I do a copy of everything in the reply window before I hit submit... I learned that "posting" lesson on this and other forums... kinda like saving a doc before you hit the print button...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yeah, I also try to copy the whole post before I hit Post Reply. Sometimes I forget. It seems to mess up every time I write a long post and haven't copied it to the clipboard. Just ranting.


    Live to Throw
    Throw to Live
    Will Summers

  • 1) varchars only save you space if the average DATA length is several characters shorter than the length of the char column.

    2) Create the new table with varchars.  Insert data into this table while RIGHT TRIMMING all fields - you need to get rid of the trailing spaces the chars contain.  Consider doing this in batches of NNK rows to avoid locking the table for long periods and also blowing up your tlog.  Back up the tlog between batches for the same reason (see the sketch after this list).

    3) Index new table as appropriate.
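
    A minimal sketch of the batched, trimmed copy; the key column and all names are assumptions, since the real schema isn't posted:

    declare @id int, @batch int
    select @id = 0, @batch = 10000

    while exists (select * from dbo.WidgetChar where ID > @id)
    begin
        insert into dbo.WidgetVar (ID, SN01, SN02)   -- ...all columns
        select ID, rtrim(SN01), rtrim(SN02)          -- right-trim every char field
        from dbo.WidgetChar
        where ID > @id and ID <= @id + @batch

        set @id = @id + @batch

        -- Keep the tlog from blowing up between batches (path is an assumption)
        backup log MyDb to disk = 'C:\backup\MyDb_log.bak'
    end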


    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • As I said -

    Most of the time the SN** columns are blank, so the data will be shorter than the length of the column.

    I have already created the table with varchars, and I did an LTRIM(RTRIM()) on the fields after the insert (I did this in batches).

    I put a clustered index on and ran INDEXDEFRAG, which solved the problem.

    I'm sorry, maybe I didn't understand your post, but I'm not sure what you were trying to suggest except to do smaller batches. You just reiterated what was previously posted.


    Live to Throw
    Throw to Live
    Will Summers

  • Will,

    So you have a 7.5 GB database, 99% of which contains blanks. I'm asking myself: what was the original purpose of char columns? There must once have been a hard disk vendor that got together with a db vendor, where the former said to the latter: if you put a char column into your database product, I'll give you 10% of every hard disk sale.

    In any case, rtrim() does the trick.

  • Sorry Will, I posted after reading only the first page of replies (which I assumed, without checking, was the only one).

    Had I posted earlier, I could have saved you the update and reclustering steps by trimming on insert.  C'est la vie.

    Glad you are all better now.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • ha ha. Good one.

    Out of the 60 SN** columns, only the first 10 are being used right now. The data in the table only goes back about a year. I suspect that the columns were put out there for the possibility of being used in the future.

    The columns that aren't SN** are populated with data.


    Live to Throw
    Throw to Live
    Will Summers

  • As always, I appreciate any help. I was just trying to summarize previous posts.


    Live to Throw
    Throw to Live
    Will Summers

  • You keep using the words "empty" and "blank" to refer to the contents of a column. These words have no meaning in the context of a relational database. If the column contains anything other than NULL, then it contains enough characters to completely fill the field length. The system will pad it out with spaces if it has to.

    One obvious question is how do you know a field is "empty" or "blank"? If you SELECT SPxx and column SPxx contains 60 spaces, you will not see anything. If you SELECT LEN(SPxx) you will get back a result of 0 (LEN does an RTRIM before counting the characters). However, if you SELECT '<' + SPxx + '>' you will see "<" followed by 60 spaces and ">", and SELECT LEN('<' + SPxx + '>') will return 62. So all these fields can look "empty" but are, in fact, not at all empty and will take up disk space. Only if SELECT SPxx or SELECT LEN(SPxx) returns NULL is the field truly what you would consider empty and taking up no disk space.

    I had a feeling when you used the word "empty" to describe these fields that they were not NULL. It was, I believe, in my first response that I advised you had to trim each and every SPxx and SPxx_Name column when you transferred the data to the varchar table to see any size difference. I'm glad to see you finally did that and it worked.
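
    A small demonstration of the point, with a throwaway table:

    create table #pad (sp char(60) null)
    insert into #pad values ('')     -- stored padded out to 60 spaces
    insert into #pad values (null)

    select len(sp),                  -- 0 for the spaces row (LEN trims), NULL for the NULL row
           datalength(sp),           -- 60 for the spaces row, NULL for the NULL row
           '<' + sp + '>'            -- makes the pad spaces visible: <, 60 spaces, >
    from #pad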

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • Yes, I assumed that "empty" meant no NULLs. I assumed this because I had marked all the columns as NOT NULL.

    It wasn't just the trimming that worked. That was one of the first things that I tried.

    I had to change ALL the SN** columns to VARCHAR, have a clustered index on the table, trim, then run INDEXDEFRAG on the clustered index to recover the lost space.

    At first I tried this with just 2 columns and saw no size change, then with 6 columns, still no size change. The trimming was something I tried before I even saw a response to my problem.


    Live to Throw
    Throw to Live
    Will Summers

  • This is a fortunate side effect of adding a clustered index and then running an INDEXDEFRAG: the defrag has to move the contents of the data pages around to get everything into the physical order implied by the index, compacting them along the way. But you have no guarantee that this is optimal. A better strategy is to copy the old 'char' table into the new 'varchar' table using rtrim, then drop the old table and finally do a shrinkfile. It's probably a whole lot faster, too.
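
    A compact sketch of that sequence, again with assumed names; note that rtrim() returns varchar, so SELECT...INTO creates the new columns as varchar automatically:

    select ID, rtrim(SN01) as SN01, rtrim(SN02) as SN02   -- ...every column, trimmed
    into dbo.Widget_new
    from dbo.Widget

    drop table dbo.Widget
    exec sp_rename 'dbo.Widget_new', 'Widget'

    -- Hand the freed pages back to the OS (logical data file name is an assumption)
    dbcc shrinkfile ('MyDb_Data')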
