One time database shrink catch-22 solution?

• I am developing a plan for SQL Server 2012 EE to properly downsize and retype (nchar to char types) some needlessly Unicode nvarchar(max) fields, and I would like to optimize the database size as part of the downtime by doing a one-time shrink. Experiments have shown a 50% allocated-space savings, which is 11 GB of data.

    After reading and experimenting, it's evident that shrinking a database causes index fragmentation and rebuilding indexes causes the database to expand. A real Catch-22 situation. I don't want to leave 50% free space in the database, which is 11 GB of disk storage in this case.

    Would the following be a decent approach for a one-time shrink that would allow ending up with unfragmented indexes and up-to-date index statistics?

    o Backup w/ verification and duplicate backups.

    o Drop all indexes.

    o Rebuild any too-fat tables via copy into new tables, then drop and rename tables. This is currently working well.

    o Shrink the database, leaving a reasonable amount of free space.

    o Recreate all indexes that were dropped.

    o Validate the database and check fragmentation.
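For the shrink step, here is a minimal sketch; the database name, file name, and target size are placeholder assumptions, not taken from the actual environment:

```sql
-- Hypothetical names: MyDb and MyDb_Data are placeholders.
USE [MyDb];
GO
-- Shrink the data file to a target size in MB; pick a value that
-- leaves a reasonable amount of free space rather than shrinking fully.
DBCC SHRINKFILE (N'MyDb_Data', 12000);
GO
-- Check remaining fragmentation afterwards.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name                     AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id;
```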

    Any caveats, suggestions, gotchas, or alternatives to consider would be much appreciated.

    Thx, Dave

  • dave 67553 (10/15/2015)


    I am developing a plan for SQL Server 2012 EE to properly downsize and retype (nchar to char types) some needlessly Unicode nvarchar(max) fields, and I would like to optimize the database size as part of the downtime by doing a one-time shrink. Experiments have shown a 50% allocated-space savings, which is 11 GB of data.

    After reading and experimenting, it's evident that shrinking a database causes fragmentation and rebuilding indexes causes the database to expand. A real Catch-22 situation. I don't want to leave 50% free space in the database, which is 11 GB of disk storage in this case.

    Would the following be a decent approach for a one-time shrink that would allow ending up with unfragmented indexes and up-to-date index statistics?

    o Backup w/ verification and duplicate backups, since Full recovery will be turned off.

    o Convert to the Simple recovery model.

    o Drop all indexes.

    o Rebuild any too-fat tables via copy into new tables, then drop and rename tables. This is currently working well.

    o Shrink the database, leaving a reasonable amount of free space.

    o Recreate all indexes that were dropped.

    o Convert back to the Full recovery model.

    o Validate the database and check fragmentation.

    Not sure yet if it is worthwhile going into the Simple recovery model, as some initial tests indicated minimal performance gains, so alternatively I may leave the DB in Full recovery and just shrink the log to something closer to what is normal in our case.

    Any caveats, suggestions, gotchas, or alternatives to consider would be much appreciated.

    Thx, Dave

    You have some fundamental misunderstandings.

    Shrinking a database will likely cause fragmentation, but rebuilding the indexes does not automatically cause the database to grow.

    If an index is taking up 10 GB and has little fragmentation, then rebuilding it will not change the size very much. What will cause the index to grow is setting the fill factor to something other than the default, or the index having been heavily fragmented.

    Simple mode does not turn off logging. You cannot turn off logging. Simple mode truncates any portion of the log that is no longer in use at every checkpoint. If you never need to recover to a point in time, then use the Simple recovery model. But if you would ever need to recover to, for example, 2:00 PM today, then you will need to keep your database in Full recovery and set up regular log backups.
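As a rough sketch of the recovery-model mechanics described above (the database name and backup paths are placeholders):

```sql
-- Point-in-time recovery requires FULL recovery plus regular log backups.
ALTER DATABASE [MyDb] SET RECOVERY FULL;
BACKUP DATABASE [MyDb] TO DISK = N'X:\Backups\MyDb.bak' WITH CHECKSUM;
BACKUP LOG [MyDb] TO DISK = N'X:\Backups\MyDb_log.trn';  -- repeat on a schedule

-- SIMPLE does not turn off logging; it only lets the log truncate
-- at each checkpoint, so point-in-time recovery is lost.
ALTER DATABASE [MyDb] SET RECOVERY SIMPLE;
```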

    I'm not sure you need to go through all of these steps to accomplish these goals. You should be able to shrink the data file and rebuild the indexes. That should get your free space back.

    You may be better served to create a second database, create the tables in the new database, BCP out the data from the first database, BCP it back into the new database, and then re-create your keys, indexes, constraints, views, procedures, functions, and so forth.

    The pain point here is recreating the permissions and users. Depending upon the complexity of your security, number of users, etc., this may be simple. Or it may not!

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Here is my proposal.

    First, Backup w/ verification and duplicate backups.

    o Drop all foreign keys and indexes, including clustered, from the OLD tables that will be altered. It will free up workspace, and you won't need them, because the tables will be dropped in a later step.

    o Rename these OLD tables to <TABLENAME>_BAK

    o Create NEW tables using original names and new data types (ie: Char instead of NChar). However, don't create indexes or foreign keys at this point.

    o INSERT into the NEW tables from the OLD tables.

    o Verify row counts, etc.

    o Drop the OLD tables.

    o Create indexes on the NEW tables. Remember to create clustered indexes before non-clustered indexes.

    o Create foreign keys on the NEW tables.

    o Shrink the database, if you think it's important to free up space for use by other databases. But if the storage space is only used by this database, there is no point in shrinking.

    o Validate the database and check fragmentation.

    11 GB is not that big today, so I wouldn't expect this to take more than a couple of hours on a medium-powered server.
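The rename-and-copy steps above might be sketched like this; the table and column names are hypothetical placeholders:

```sql
-- Rename the OLD table out of the way.
EXEC sp_rename N'dbo.MyTable', N'MyTable_BAK';

-- Create the NEW table with the narrower data types (no indexes or FKs yet).
CREATE TABLE dbo.MyTable
(
    MyId   INT          NOT NULL,
    MyText VARCHAR(200) NULL   -- was NVARCHAR(200)
);

-- Copy the data across; TABLOCK enables minimal logging in Simple recovery.
INSERT INTO dbo.MyTable WITH (TABLOCK) (MyId, MyText)
SELECT MyId, CONVERT(VARCHAR(200), MyText)
FROM dbo.MyTable_BAK;

-- Verify row counts before dropping the old table.
SELECT (SELECT COUNT(*) FROM dbo.MyTable)     AS new_rows,
       (SELECT COUNT(*) FROM dbo.MyTable_BAK) AS old_rows;

DROP TABLE dbo.MyTable_BAK;

-- Clustered index first, then non-clustered, then foreign keys.
CREATE CLUSTERED INDEX CIX_MyTable ON dbo.MyTable (MyId);
```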

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Michael L John (10/15/2015)


    Shrinking a database likely will cause fragmentation, but re-building the indexes does not automatically cause the database to grow.

    If an index is taking up 10 GB, and has little fragmentation, then rebuilding it will not change the size very much.

    Ooooo, careful now. Any index greater than 128 extents (which is only 8 MB) will be created in its entirety before the old index is dropped during an index rebuild, no matter the level of fragmentation. Using the ONLINE option creates an "interim transaction table" (for lack of a better term) that can also add to growth. You can get around that a bit by disabling (no need to drop) an index before rebuilding it, but you have to be careful about disabling indexes that may have FKs pointing to them.

    If the database has enough free space already, then the above may not cause any growth (as you say) but, after a shrink, I can almost guarantee that the MDF will grow by at least the size of the largest index rebuilt (didn't say reorganized), with a bit of working space to boot depending on what the growth settings are. SORT_IN_TEMPDB can help a little there.
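The disable-then-rebuild trick mentioned above might look like this; the index and table names are placeholders, and note that a disabled index is unusable until it is rebuilt:

```sql
-- Disabling first deallocates the index's pages, so the rebuild does not
-- need a full second copy alongside the old one.
-- Caution: do not disable the clustered index (that makes the table
-- inaccessible) or unique indexes that foreign keys reference.
ALTER INDEX IX_MyTable_MyCol ON dbo.MyTable DISABLE;

-- Rebuild, doing the sort work in tempdb to limit growth of the user MDF.
ALTER INDEX IX_MyTable_MyCol ON dbo.MyTable
REBUILD WITH (SORT_IN_TEMPDB = ON);
```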

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Eric M Russell (10/15/2015)


    Create indexes on the NEW tables. Remember to create clustered indexes before non-clustered indexes.

    As with a clustered index rebuild, the HEAP will be preserved until the CI is created. If the table is large, the "wasted" free space will also be large.

    --Jeff Moden



  • Michael L John (10/15/2015)

    <snip>

    You have some fundamental misunderstandings.

    Shrinking a database likely will cause fragmentation, but re-building the indexes does not automatically cause the database to grow.

    If an index is taking up 10 GB, and has little fragmentation, then rebuilding it will not change the size very much. What will cause the index to grow is setting the fillfactor to something other than the default, or, if the index was heavily fragmented.

    Simple mode does not turn off logging. You cannot turn off logging. Simple mode truncates any portion of the log that is no longer in use at every checkpoint. If you never need to recover to a point in time, then use the Simple recovery model. But if you would ever need to recover to, for example, 2:00 PM today, then you will need to keep your database in Full recovery and set up regular log backups.

    <snip>

    In my case, rebuilding the indexes did add most of the space back. And I misspoke on the logging: no, it did not turn it off, but the Full recovery model used a LOT more log space than Simple did, I think because I was not backing up the log between modifications in Full, and was getting automatic checkpoints at 70% full in Simple. I removed that comment from my question. Thx.

  • Eric M Russell (10/15/2015)


    Here is my proposal.

    First, Backup w/ verification and duplicate backups.

    o Drop all foreign keys, indexes, including clustered, from OLD tables that will be altered. It will free up workspace, and you won't need them, because we will drop them in later step.

    o Rename these OLD tables to <TABLENAME>_BAK

    o Create NEW tables using original names and new data types (ie: Char instead of NChar). However, don't create indexes or foreign keys at this point.

    o INSERT into the NEW tables from the OLD tables.

    o Verify row counts, etc.

    o Drop the OLD tables.

    o Create indexes on the NEW tables. Remember to create clustered indexes before non-clustered indexes.

    o Create foreign keys on the NEW tables.

    o Shrink the database, if you think it's important to free up space for use by other databases. But if the storage space is only used by this database, there is no point in shrinking.

    o Validate the database and check fragmentation.

    11 GB is not that big today, so I wouldn't expect this to take more than couple of hours on a medium powered server.

    Thx. Perhaps it was not clear, but that is essentially what I'm doing. I was only dropping the indices involved in the tables that had the field type changes and were being copied, and was creating new tables, dropping the old, and then renaming the new, rather than renaming current to old first. I verify the counts when I'm all done.

    Turns out there was in fact a benefit to switching to the Simple recovery model, since the log did not get anywhere near as large as it did under Full: only slightly larger than the 11 GB it began at, rather than running out of space on the disk at 27 GB.

    I've skipped the shrinking step, since that added more downtime and, after rebuilding the indices, added most of the space back anyway. So I decided I will just enjoy the extra free space, since I'll at least get the benefit of smaller backups, and the database likely won't grow for quite a while.

    I also got some benefit out of rebuilding some of the larger fragmented indices on tables that were not involved in the table redefinitions, in order to free up more space before the table conversions, so the database did not have to expand as much.

  • Jeff Moden (10/15/2015)


    Eric M Russell (10/15/2015)


    Create indexes on the NEW tables. Remember to create clustered indexes before non-clustered indexes.

    As with a clustered index rebuild, the HEAP will be preserved until the CI is created. If the table is large, the "wasted" free space will also be large.

    Thx for the comments Jeff. I checked my script, and the PK CI is created as part of the create table and I had no separate such create index for any PK.

• My suggestion is somewhat different from the previous ones

    😎

    1) Safety measures first, Backup the database with verify etc.

    2) Create a new file group, add a new data file to it and make it the default file group.

    3) Profile the columns (see sample script below) to check for extended characters in the NVARCHAR columns.

    4) Clean up if anything is found in 3); otherwise, all extended characters will be converted to a question mark by the next step.

    5) Run ALTER TABLE ALTER COLUMN on all the columns that need changing.

    6) Move all the user tables and indices to the new file group.

    7) Shrink the original data file with TRUNCATEONLY;
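Steps 2), 6), and 7) might be sketched roughly as follows; the database, filegroup, file, and index names are placeholders, and moving a table is done by rebuilding its clustered index on the new filegroup:

```sql
-- 2) New filegroup and data file; make it the default for new objects.
ALTER DATABASE [MyDb] ADD FILEGROUP [FG_NEW];
ALTER DATABASE [MyDb] ADD FILE
    (NAME = N'MyDb_Data2', FILENAME = N'X:\Data\MyDb_Data2.ndf')
    TO FILEGROUP [FG_NEW];
ALTER DATABASE [MyDb] MODIFY FILEGROUP [FG_NEW] DEFAULT;

-- 6) Move a table by recreating its clustered index on the new filegroup.
CREATE UNIQUE CLUSTERED INDEX PK_MyTable ON dbo.MyTable (MyId)
WITH (DROP_EXISTING = ON) ON [FG_NEW];

-- 7) Release the now-empty space at the end of the original data file
--    without moving any pages (so no fragmentation is introduced).
DBCC SHRINKFILE (N'MyDb_Data', TRUNCATEONLY);
```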

    USE TEEST;
    GO
    SET NOCOUNT ON;

    -- Simple test set with extended character text from sys.messages.
    IF OBJECT_ID(N'dbo.TBL_TEST_NCHAR') IS NOT NULL DROP TABLE dbo.TBL_TEST_NCHAR;
    CREATE TABLE dbo.TBL_TEST_NCHAR
    (
        TNC_ID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_DBO_TBL_TEST_NCHAR_TNC_ID PRIMARY KEY CLUSTERED
       ,TNC_NTXT NVARCHAR(1000) NULL
    );

    ;WITH BASE_DATA AS
    (
        SELECT
            ROW_NUMBER() OVER
            (
                PARTITION BY SMSG.language_id
                ORDER BY SMSG.severity DESC
            ) AS SMSG_RID
           ,SMSG.text
        FROM sys.messages SMSG
    )
    INSERT INTO dbo.TBL_TEST_NCHAR (TNC_NTXT)
    SELECT
        BD.text
    FROM BASE_DATA BD
    WHERE BD.SMSG_RID < 11;

    /*
    -- This conversion from nvarchar to varchar will work but the results
    -- will not be too useful as all extended characters will be converted
    -- to a question mark.
    BEGIN TRY
        ALTER TABLE dbo.TBL_TEST_NCHAR ALTER COLUMN TNC_NTXT VARCHAR(1000) NOT NULL;
        RAISERROR(N'ALTER COLUMN SUCCESSFUL',0,0) WITH NOWAIT;
        SELECT * FROM dbo.TBL_TEST_NCHAR;
    END TRY
    BEGIN CATCH
        RAISERROR(N'ALTER COLUMN FAILED',0,0) WITH NOWAIT;
    END CATCH

    -- Sample clean-up: simply delete the rows where there are any extended characters.
    ;WITH T(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
    DELETE FROM dbo.TBL_TEST_NCHAR
    WHERE TNC_ID IN
    (
        SELECT DISTINCT
            TNC.TNC_ID
            --,TNC.TNC_NTXT
        FROM dbo.TBL_TEST_NCHAR TNC
        CROSS APPLY
        (
            SELECT TOP(LEN(TNC.TNC_NTXT)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
            FROM T T1, T T2, T T3, T T4 --10^4
        ) AS NM
        WHERE UNICODE(SUBSTRING(TNC.TNC_NTXT, NM.N, 1)) > 255
    );
    -- */

    -- This query will return any row with extended characters.
    ;WITH T(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
    SELECT DISTINCT
        TNC.TNC_ID
       ,TNC.TNC_NTXT
    FROM dbo.TBL_TEST_NCHAR TNC
    CROSS APPLY
    (
        SELECT TOP(LEN(TNC.TNC_NTXT)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
        FROM T T1, T T2, T T3, T T4 --10^4
    ) AS NM
    WHERE UNICODE(SUBSTRING(TNC.TNC_NTXT, NM.N, 1)) > 255;

  • Eirikur Eiriksson (10/16/2015)


    My suggestion is somewhat different to the previous ones

    😎

    1) Safety measures first, Backup the database with verify etc.

    2) Create a new file group, add a new data file to it and make it the default file group.

    3) Profile the columns (see sample script below) to check for extended characters in the NVARCHAR columns.

    4) Clean up if anything found in 3), all extended characters will be converted to a question mark after the next step.

    5) Run ALTER TABLE ALTER COLUMN on all the columns that need changing.

    6) Move all the user tables and indices to the new file group.

    7) Shrink the original data file with TRUNCATEONLY;

    <snip>

    Thx SSCarpal. Have to head into the office for meetings with the other folks so will digest this later.

  • Jeff Moden (10/15/2015)


    Eric M Russell (10/15/2015)


    Create indexes on the NEW tables. Remember to create clustered indexes before non-clustered indexes.

    As with a clustered index rebuild, the HEAP will be preserved until the CI is created. If the table is large, the "wasted" free space will also be large.

    After creating the clustered index, you must be talking about unallocated space within the database file, not wasted space within the table, which should be 100% un-fragmented at that point.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (10/16/2015)


    Jeff Moden (10/15/2015)


    Eric M Russell (10/15/2015)


    Create indexes on the NEW tables. Remember to create clustered indexes before non-clustered indexes.

    As with a clustered index rebuild, the HEAP will be preserved until the CI is created. If the table is large, the "wasted" free space will also be large.

    After creating the clustered index, you must be talking about unallocated space within the database file, not wasted space within the table, which should be 100% un-fragmented at that point.

    Correct. Unallocated space within the database file. A particularly nasty problem if you're trying to make a partition of a table Read Only. I've solved that problem for partitioned tables but that's easy compared to trying to solve the problem for all the tables in a database.

    --Jeff Moden


