NVARCHAR to VARCHAR conversion

  • In the database that I work on, character fields were all defined as NVARCHAR or NCHAR. We recently decided that we don't need Unicode storage after all, so I wrote a script that does all the work necessary to change them to VARCHAR and CHAR. The major reason we wanted to do this is to cut our database size down. I have executed the script successfully (no more n*char fields) against a number of databases, but none of them got any smaller. One in particular was 1.2 GB, and I expected it to drop at least a couple hundred MB, but it didn't, even after doing a DBCC SHRINKDATABASE.

    Can someone explain why I didn't see the results I expected? Is it just a matter of ignorance on my part, or is there a technical reason?

    Thanks,

    Phillip
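    For context, a minimal sketch of the kind of conversion script involved (not the poster's actual script), assuming plain columns with no defaults, check constraints, indexes, or computed columns that would need to be dropped and re-created first:

    -- Generate one ALTER statement per NVARCHAR/NCHAR column in the current database.
    -- Sketch only: review the output before running it, and handle any column that
    -- participates in an index, constraint, or default separately.
    SELECT 'ALTER TABLE [' + TABLE_NAME + '] ALTER COLUMN [' + COLUMN_NAME + '] ' +
           CASE DATA_TYPE WHEN 'nchar' THEN 'CHAR' ELSE 'VARCHAR' END +
           '(' + CAST(CHARACTER_MAXIMUM_LENGTH AS VARCHAR(10)) + ')' +
           CASE IS_NULLABLE WHEN 'NO' THEN ' NOT NULL' ELSE ' NULL' END
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE DATA_TYPE IN ('nchar', 'nvarchar')
    ORDER BY TABLE_NAME, COLUMN_NAME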

  • Not sure, but once SQL Server has grown the files it will not shrink them unless you tell it to.

    Look up DBCC SHRINKFILE in BOL, or use Enterprise Manager to shrink the DB.
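    A hedged example of that approach, assuming the data file's logical name is DbName_Data (run sp_helpfile to get the real name) and a 500 MB target size:

    USE DbName
    GO
    -- Find the logical file names for this database
    EXEC sp_helpfile
    GO
    -- Shrink the data file down toward a 500 MB target (placeholder value)
    DBCC SHRINKFILE ('DbName_Data', 500)
    GO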

  • USE DbName
    GO
    -- Correct any inaccurate page/row counts, then report space used and free
    DBCC UPDATEUSAGE(DbName)
    EXEC dbo.sp_spaceused

    You can also use the Taskpad view of the database in EM. You should get an idea of how much free space you now have.
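    If it helps, sp_spaceused can also be pointed at a single table, which makes it easy to compare a table's footprint before and after the conversion (the table name here is just an example):

    -- Space used by one table
    EXEC dbo.sp_spaceused 'Contact'
    -- Or have sp_spaceused correct the usage counts itself for the whole database
    EXEC dbo.sp_spaceused @updateusage = 'TRUE'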

  • Thanks for the replies. I've done SHRINKFILE as well as SHRINKDATABASE and the UPDATEUSAGE. The size did go down from 1.2 GB to 1.1 GB, which is nice. I just did a little research and found that we have 1196 char/varchar columns with an average size of 59. With that many columns, I guess I expected a more dramatic size difference. Now that I think about it, I suppose it does make sense. Even though we may have, say, 250 VARCHAR(1000) columns, very few, if any, are actually storing 1000 characters. Plus, since many of them are not required, they aren't using any space to begin with. So even though an NVARCHAR requires twice the space to store the same value as a VARCHAR, the disk savings realized are not as obvious (a quick sketch of the arithmetic is below).

    Thanks again for the input.

    Phillip
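    To put rough numbers on that, here is a small sketch that can be run against a column that is still NVARCHAR (table and column names are placeholders): LEN counts characters and DATALENGTH counts bytes, which for the N types is two per character, so the difference approximates what converting that one column frees.

    -- Per-column estimate of the bytes a conversion would free
    -- ('Contact' and 'LastName' are placeholder names)
    SELECT
        SUM(DATALENGTH(LastName)) AS BytesStoredAsNvarchar,  -- 2 bytes per character
        SUM(LEN(LastName))        AS ApproxBytesAsVarchar,   -- 1 byte per character
        SUM(DATALENGTH(LastName)) - SUM(LEN(LastName)) AS ApproxBytesSaved
    FROM Contact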

  • For anyone who might be interested, I did some querying and math, and I'm thinking the real data savings come in at about 21 MB. My guess is that when SHRINKFILE moved data around, more data pages were freed, which resulted in the 100 MB reduction in disk usage. Since the disk savings weren't nearly as big as I expected (which really does make sense now), the more compelling reason (for me) to convert NVARCHARs to VARCHARs would be to avoid going over the 8,060-byte row size limit (illustrated below).

    Learn something new every day.

    Peace,

    Phillip
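    To illustrate the row-size point, a throwaway sketch (the table and column names are made up): a pair of fixed-length NCHAR columns can push the minimum row size past the 8,060-byte limit, while the CHAR equivalents fit with room to spare.

    -- Fails on SQL Server 2000: two NCHAR(3000) columns alone need 12,000 bytes per row
    CREATE TABLE WideUnicode
    (
        Id     INT NOT NULL,
        Notes1 NCHAR(3000) NULL,
        Notes2 NCHAR(3000) NULL
    )

    -- The single-byte version needs only 6,000 bytes and creates without complaint
    CREATE TABLE WideAnsi
    (
        Id     INT NOT NULL,
        Notes1 CHAR(3000) NULL,
        Notes2 CHAR(3000) NULL
    )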

  • So what formula did you come up with to figure out the saved space?

  • It wasn't so much a formula as it was calculating the average actual size of the varchar/char columns. All of our character fields had been either nchar or nvarchar, so I calculated the space used by the fields now that they are varchar/char and assumed that to be the amount of data actually saved (since an nvarchar takes twice the bytes of the same value stored as varchar). I wrote a script that loops through all the tables, gets the columns for each table, and issues a SELECT SUM(LEN(ColumnName)) for each column. Then I just did some averaging per row and per table. The query is below. It can be run for all tables in the current database, or for just one by setting the @TableName variable. If you discover incorrect logic, please let me know.

    Thanks,

    Phillip

    P.S. I don't know if there is a way on this forum to maintain script formatting. Sorry.

    ************************************************************************************

    SET NOCOUNT ON

    DECLARE
        @TableName  SYSNAME,
        @ColumnName SYSNAME,
        @sql        VARCHAR(8000),
        @TestMode   BIT

    SELECT
        @TestMode  = 0,
        @TableName = NULL --'Contact'

    -- Re-create the work table that holds the per-column statistics
    IF EXISTS(
        SELECT NULL
        FROM tempdb..sysobjects
        WHERE name LIKE '%#ColStats%'
        )
        DROP TABLE #ColStats

    CREATE TABLE #ColStats
    (
        TableName    SYSNAME,
        ColumnName   SYSNAME,
        ColumnSize   INT,
        RecCnt       INT,
        CurrentUsage INT
    )

    -- Outer loop: every base table (or just @TableName if one was supplied)
    DECLARE curTables CURSOR LOCAL FAST_FORWARD FOR
        SELECT TABLE_NAME
        FROM INFORMATION_SCHEMA.TABLES
        WHERE TABLE_TYPE = 'BASE TABLE'
            AND TABLE_NAME = ISNULL(@TableName, TABLE_NAME)
        ORDER BY TABLE_NAME

    OPEN curTables
    FETCH NEXT FROM curTables INTO @TableName

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Record the declared size of each char/varchar column in this table
        INSERT INTO #ColStats (TableName, ColumnName, ColumnSize)
        SELECT
            C.TABLE_NAME,
            C.COLUMN_NAME,
            C.CHARACTER_MAXIMUM_LENGTH
        FROM INFORMATION_SCHEMA.COLUMNS C
        WHERE C.TABLE_NAME = @TableName
            AND C.DATA_TYPE IN ('CHAR', 'VARCHAR')
        ORDER BY C.COLUMN_NAME

        -- Inner loop: measure the space each column actually uses
        DECLARE curColumns CURSOR LOCAL FAST_FORWARD FOR
            SELECT ColumnName
            FROM #ColStats
            WHERE TableName = @TableName

        OPEN curColumns
        FETCH NEXT FROM curColumns INTO @ColumnName

        WHILE @@FETCH_STATUS = 0
        BEGIN
            SET @sql = 'UPDATE #ColStats SET CurrentUsage = (SELECT SUM(LEN([' +
                @ColumnName + '])) FROM [' + @TableName + ']) WHERE TableName = ''' +
                @TableName + ''' AND ColumnName = ''' + @ColumnName + ''''

            IF @TestMode = 1
                PRINT @sql
            ELSE
                EXEC(@sql)

            FETCH NEXT FROM curColumns INTO @ColumnName
        END -- WHILE @@FETCH_STATUS = 0 curColumns

        -- Row count for the table, used for the per-row averages below
        SET @sql = 'UPDATE #ColStats SET RecCnt = (SELECT COUNT(*) FROM [' + @TableName +
            ']) WHERE TableName = ''' + @TableName + ''''

        IF @TestMode = 1
            PRINT @sql
        ELSE
            EXEC(@sql)

        CLOSE curColumns
        DEALLOCATE curColumns

        FETCH NEXT FROM curTables INTO @TableName
    END -- WHILE @@FETCH_STATUS = 0 curTables

    CLOSE curTables
    DEALLOCATE curTables

    IF @TestMode = 1
        SELECT DISTINCT TableName
        FROM #ColStats
        ORDER BY TableName
    ELSE
    BEGIN
        DECLARE @Results TABLE
        (
            TableName               SYSNAME,
            ColCnt                  INT,
            TotalBytesAllowed       INT,
            AvgBytesAllowed         DECIMAL(8,4),
            TotalBytesUsed          INT,
            AvgBytesUsed            DECIMAL(8,4),
            RowsInTable             INT,
            TotalBytesSavedPerTable DECIMAL(18,4),
            AvgBytesSavedPerRow     DECIMAL(18,4)
        )

        INSERT INTO @Results
        SELECT
            TableName,
            COUNT(ColumnName) ColCnt,
            SUM(ISNULL(ColumnSize, 0)) TotalBytesAllowed,
            SUM(ISNULL(ColumnSize, 0)) * 1.0 / COUNT(ColumnName) AvgBytesAllowed,
            SUM(ISNULL(CurrentUsage, 0)) / CASE WHEN RecCnt > 0 THEN RecCnt ELSE 1 END TotalBytesUsed,
            SUM(ISNULL(CurrentUsage, 0)) * 1.0 / COUNT(ColumnName) / CASE WHEN RecCnt > 0 THEN RecCnt ELSE 1 END AvgBytesUsed,
            RecCnt RowsInTable,
            SUM(ISNULL(CurrentUsage, 0)) * 1.0 / COUNT(ColumnName) TotalBytesSavedPerTable,
            SUM(ISNULL(CurrentUsage, 0)) * 1.0 / COUNT(ColumnName) / CASE WHEN RecCnt > 0 THEN RecCnt ELSE 1 END AvgBytesSavedPerRow
        FROM #ColStats
        GROUP BY TableName, RecCnt
        ORDER BY TableName

        -- Per-table detail
        SELECT *
        FROM @Results

        -- Database-wide totals
        SELECT
            SUM(TotalBytesSavedPerTable) TotalBytesSaved,
            SUM(TotalBytesSavedPerTable) * 1.0 / COUNT(*) AvgBytesSavedPerTable,
            SUM(TotalBytesSavedPerTable) * 1.0 / 1024 TotalKBSaved,
            (SUM(TotalBytesSavedPerTable) * 1.0 / COUNT(*)) / 1024 AvgKBSavedPerTable,
            SUM(TotalBytesSavedPerTable) * 1.0 / (1024 * 1024) TotalMBSaved,
            (SUM(TotalBytesSavedPerTable) * 1.0 / COUNT(*)) / (1024 * 1024) AvgMBSavedPerTable
        FROM @Results
    END

    DROP TABLE #ColStats

  • It has been a while, but I remember doing something similar, and the space didn't free up until I re-indexed the tables that had data type changes. It ended up being a significant size reduction after the reindexing.

    It can't hurt to try.

    Tom
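    For anyone following along, a minimal sketch of that reindex-then-shrink step on SQL Server 2000 (the table and database names are placeholders; an empty string for the index name tells DBCC DBREINDEX to rebuild every index on the table):

    -- Rebuild all indexes on one converted table; repeat for each table that changed
    DBCC DBREINDEX ('Contact', '', 0)
    GO
    -- Then try reclaiming the freed pages again
    DBCC SHRINKDATABASE (DbName)
    GO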

  • Thanks for the post, Tom. I reindexed all tables, shrank the database again, did the update usage thing, and did another shrink file. The size is down to just under 1.1 GB, which brings total savings to right about 130 MB.

    As I noted in an earlier post, this does seem reasonable to me, especially now that I've done the math. Had more of these fields been indexed, there might have been more space "reclaimed" after the reindex. But as it is, I doubt that more than 20% of the fields were indexed.

    Peace,

    Phillip

  • NVARCHAR/VARCHAR and NCHAR/CHAR are allocated differently in data pages. If you change the data type from NVARCHAR to VARCHAR, you should not be able to release free space, because of the way SQL Server stores variable-length data. If you change the data type from NCHAR to CHAR, however, you will.
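    A quick way to see the allocation difference the previous post describes (a throwaway temp table, not from the thread): the fixed-length N type always takes two bytes per declared character, while the variable-length types take two bytes (N) or one byte per character actually stored.

    -- Same three-character value stored four ways
    CREATE TABLE #TypeDemo
    (
        FixedUni  NCHAR(10),
        FixedAnsi CHAR(10),
        VarUni    NVARCHAR(10),
        VarAnsi   VARCHAR(10)
    )
    INSERT INTO #TypeDemo VALUES ('abc', 'abc', 'abc', 'abc')

    -- Expected DATALENGTH results: 20, 10, 6, and 3 bytes
    SELECT
        DATALENGTH(FixedUni)  AS NCharBytes,
        DATALENGTH(FixedAnsi) AS CharBytes,
        DATALENGTH(VarUni)    AS NVarcharBytes,
        DATALENGTH(VarAnsi)   AS VarcharBytes
    FROM #TypeDemo

    DROP TABLE #TypeDemo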
