Datatype question

  • I have a table (below) that exists on several servers on our WAN throughout the country. I recently started changing the ImageName field from a CHAR to a VARCHAR as the average length of the field is only about 50 and it was wasting a lot of space. The tables contain between 500K and 3 million rows depending on the server they are on.

    CREATE TABLE [dbo].[TableName]

    ([ID] [int] IDENTITY(1,1) NOT NULL,

    [Form_ID] [int] NOT NULL,

    [ImageName] [char](512) NOT NULL,

    CONSTRAINT [PK_Imaging] PRIMARY KEY CLUSTERED

    ( [ID] ASC ))

    -- Nonclustered index on Form_ID also exists

    At our corporate office which is between 100 and 1000 miles away from all of the servers this table lives on, I have a daily job that does the following type of thing via a linked servers:

    DECLARE @VariableName CHAR(512)

    DECLARE @FORM_ID INT

    DECLARE @sql VARCHAR(4000)

    SET @VariableName = '\\UNCPATH\FOLDER\Filename.txt'

    SET @FORM_ID = 32532

    SET @sql =

    '

    UPDATE ServerName.DatabaseName.dbo.TableName

    SET ImageName = ''' + @VariableName + '''''

    FROM ServerName.DatabaseName.dbo.TableName

    WHERE Form_ID = ' + CONVERT(VARCHAR(12), @FORM_ID)

    EXECUTE (@SQL)

    Immediately after changing the data type from CHAR to VARCHAR, this process started taking as much as an hour or more to update a single record (was nearly instant before). Additionally, we observed large amounts of network traffic going from the updated server to the corporate server (I can only guess that it was for some reason bringing the whole table up to the corporate office for some reason). Changing the datatype of the @VariableName variable to VARCHAR immediately fixed the problem and it went back to running nearly instantly. Was wondering if someone could explain exactly why this behavior happens? Since this is building a dynamic SQL string, I'm not sure why the datatype of that @VariableName matters.

    The Redneck DBA

  • You're running into issues with padding and implicit conversion.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (6/7/2011)


    You're running into issues with padding and implicit conversion.

    I figured something along those lines, and agree it wasn't ideal to have it be a CHAR type to start with, but am not sure exactly why it would cause massive amounts of network traffic (I assume it's brining the whole table up to our corporate office for some reason) to do the update. I would understand it if that CHAR/VARCHAR field was in the WHERE clause, but it's not...it's just what it's being updated too.

    The Redneck DBA

  • You may be encountering page splits on update now that you have a variable length column in the table.

    What is the fill factor setting on your clustered index?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (6/7/2011)


    You may be encountering page splits on update now that you have a variable length column in the table.

    What is the fill factor setting on your clustered index?

    Hadn't thought of that...it's possible that is happening. The fill factor is 80%. In fact, since it was padding that value with spaces, I bet there is a pretty good chance of page splits.

    But would that cause SQL to suddenly feel the need to ship the whole table to the calling server when that field isn't involved in the WHERE clause?

    Mabye I'm overthinking it and should just let it go now that it's working again...just strikes me as being strange that it caused that much trouble for us.

    The Redneck DBA

  • Jason Shadonix (6/7/2011)


    But would that cause SQL to suddenly feel the need to ship the whole table to the calling server when that field isn't involved in the WHERE clause?

    Not from what I know of how Linked Servers work...however using 4-part naming leaves a lot of decisions up to the engine. For an operation like this I prefer to use EXEC...AT to guarantee remote execution. Something like this modifying your original snippet:

    SET @sql =

    '

    UPDATE DatabaseName.dbo.TableName

    SET ImageName = ''' + @VariableName + '''''

    FROM ServerName.DatabaseName.dbo.TableName

    WHERE Form_ID = ' + CONVERT(VARCHAR(12), @FORM_ID)

    EXECUTE (@SQL) AT [ServerName]

    As an aside, one other thing I noticed when I run your SQL that I don't think is related to your perf issue but that you may want to clean up is that you're using a CHAR for @VariableName so your resulting value has lots of trailing spaces. The engine will "trim" those away on storing a VARCHAR so they won't take up space on the page and it will just store the position of the last byte, but the trailing spaces will be there when you select the data out and so may affect display.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Cool! I never realized you could put AT in your EXEC statements. Neat!

    Thanks!

    Jason

    The Redneck DBA

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply