USER DEFINED DATA TYPES: Used in table creation - What happens if you change just the user defined type to varchar(max) from text?

  • Sean Lange (9/23/2015)


    sgmunson (9/23/2015)


    Gazareth (9/16/2015)


    Don't envy you this task, sounds like changing a database collation but less fun!

    I'd be fighting hard to change columns to built in datatypes rather than new/modified user defined types.

    For replication, check Subscription Options on your Publication properties - if Replicate Schema changes is 'True' you should be able to issue the ALTER TABLE... ALTER COLUMN commands on the Publisher just fine. In theory.

    This does need your subscribers to be SQL too though.

As details have become clearer, I no longer need to worry about replication, but I do need to see how the change from text or ntext to varchar(max) or nvarchar(max) will affect both space usage and whether the data is stored "in-row" or not. Our method is to run ALTER TABLE with ALTER COLUMN, then run an UPDATE that sets the affected column to a CAST of itself to the proper data type, and take a before-and-after picture of whether we are moving this data in-row, and in what quantities. I've been looking into a number of options:

1.) Using a dynamic management function: sys.dm_db_index_physical_stats

    2.) Using DBCC IND and/or DBCC PAGE to get more detailed information

    If anyone has any general guidance on a good method for tracking where this data ends up, I'd appreciate it.
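One way to get that before-and-after picture is sys.dm_db_index_physical_stats in DETAILED mode, which breaks page and record counts out by allocation unit type (IN_ROW_DATA vs LOB_DATA vs ROW_OVERFLOW_DATA). A minimal sketch, assuming a hypothetical table name dbo.MyTable:

```sql
-- Sketch: page/record counts per allocation unit type for one table.
-- Run once before the conversion and once after, then compare.
SELECT
    index_id,
    alloc_unit_type_desc,       -- IN_ROW_DATA, LOB_DATA, ROW_OVERFLOW_DATA
    page_count,
    record_count,
    avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID(N'dbo.MyTable'), NULL, NULL, 'DETAILED')
ORDER BY index_id, alloc_unit_type_desc;
```

Note that DETAILED (or SAMPLED) mode is needed here; the default LIMITED scan leaves record_count and the record-size columns NULL.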

It is not going to consume less space, and the data will still be stored off-row when relevant; that doesn't change. What does change is the ability to search for this data with normal WHERE predicates. You no longer have to do things like "where cast(MyTextColumn as varchar(max)) = 'MyValueToSearchFor'".

    If you are changing the datatype of the column there is no point in then running an update on the column to "change" the datatype. That happens as an implicit conversion when you change the datatype of the column.

    The objective is a performance improvement, and we're looking to see if the data is held "in-row" or not, and at what percentage of the rows this is true. If I can determine that the data is already small enough to fit in row, then I want to force any such data to occupy that "in-row" location. Am I on the wrong path?

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Still looking for guidance, if anyone has any, with regard to exactly how to go about determining what pages belong to a specific table, and then being able to see how many rows have their data "in-row".

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson (9/24/2015)


    Still looking for guidance, if anyone has any, with regard to exactly how to go about determining what pages belong to a specific table, and then being able to see how many rows have their data "in-row".

    dm_db_partition_stats.

    Although I'm not sure how you'd move them between in-row/lob storage. Presumably they're already in the appropriate one. Rebuild the clustered index maybe?
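A sketch of that suggestion, assuming a hypothetical dbo.MyTable: sys.dm_db_partition_stats exposes the in-row/LOB split directly from metadata, without the full scan that sys.dm_db_index_physical_stats in DETAILED mode performs.

```sql
-- Sketch: in-row vs LOB page counts per index, straight from partition metadata.
SELECT
    p.index_id,
    p.row_count,
    p.in_row_data_page_count,
    p.lob_used_page_count,
    p.row_overflow_used_page_count,
    p.used_page_count
FROM sys.dm_db_partition_stats AS p
WHERE p.object_id = OBJECT_ID(N'dbo.MyTable')
ORDER BY p.index_id;
```

This gives page counts rather than per-row placement, but comparing lob_used_page_count before and after the conversion shows in aggregate how much data moved in-row.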

  • Gazareth (9/24/2015)


    sgmunson (9/24/2015)


    Still looking for guidance, if anyone has any, with regard to exactly how to go about determining what pages belong to a specific table, and then being able to see how many rows have their data "in-row".

    dm_db_partition_stats.

    Although I'm not sure how you'd move them between in-row/lob storage. Presumably they're already in the appropriate one. Rebuild the clustered index maybe?

The existing data types are text and ntext, and I was under the impression that those data types were always stored off-row. I'm not sure I can accurately interpret the language that discusses in-row versus off-row storage well enough to know what happens with these data types in SQL 2008 R2, as opposed to what I knew to be true back in SQL 2000.

As to rebuilding the clustered index, we have zero shot at that: while we could get away with it in development, production is online 24x7x365, so that's a non-starter. What we're looking at doing is updating the column to itself, using CAST to make sure the optimizer doesn't short-circuit and end up doing nothing. We also know how much data there is in these fields, and since table space doesn't get reclaimed without DBCC CLEANTABLE, we're probably going to increase the amount of space used by going "in-row"; we'll be watching whether the total space usage for the table and indexes increases, and by how much.
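The approach described above might look something like this (table and column names are hypothetical; the before/after snapshots would come from one of the queries discussed earlier):

```sql
-- 1) Change the column's data type. This is largely a metadata change;
--    existing LOB values stay where they are until rewritten.
ALTER TABLE dbo.MyTable ALTER COLUMN MyTextColumn varchar(max) NULL;

-- 2) Rewrite each value so that values small enough to fit can move in-row.
--    The CAST keeps the optimizer from treating the update as a no-op.
UPDATE dbo.MyTable
SET MyTextColumn = CAST(MyTextColumn AS varchar(max));

-- 3) Compare total table/index space against the pre-conversion numbers.
EXEC sp_spaceused N'dbo.MyTable';

-- 4) Optionally reclaim space left behind by the dropped text data.
-- DBCC CLEANTABLE (0, N'dbo.MyTable');   -- 0 = current database
```

On a 24x7 system the UPDATE would likely need to be batched on the key to keep log growth and blocking under control. Note also that for varchar(max) the 'large value types out of row' table option (set via sp_tableoption) governs placement: with the default of OFF, values that fit in the record are stored in-row.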

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)
