December 1, 2006 at 7:50 am
In the database that I work on, character fields were all defined as NVARCHAR or NCHAR. We recently decided that we don't need Unicode storage as once supposed, so I wrote a script that does all the work necessary to change them to VARCHAR and CHAR. The major reason we wanted to do this was to cut our database size down. I have executed the script successfully (no more n*char fields) against a number of databases, but none of them got any smaller. One in particular was 1.2 GB, and I expected it to drop at least a couple hundred MB, but it didn't, even after doing a DBCC SHRINKDATABASE.
Can someone explain to me why I didn't see the results expected? Is it just a matter of ignorance on my part, or is there a technical reason?
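For context, the conversion script essentially issues one statement like the following per affected column (the table and column names here are made up for illustration):

-- Hypothetical example: convert an NVARCHAR(100) column to VARCHAR(100),
-- preserving the original length and nullability.
ALTER TABLE Contact ALTER COLUMN LastName VARCHAR(100) NULL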
Thanks,
Phillip
December 1, 2006 at 8:18 am
Not sure, but once SQL Server has grown the files, it will not shrink them unless you tell it to.
Look up DBCC SHRINKFILE in BOL, or use Enterprise Manager to shrink the database.
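For example, something like this (the logical file name and target size are made up; check sysfiles for your actual file names):

USE DbName
GO
-- Shrink the data file to a 1100 MB target; 'DbName_Data' is the
-- logical file name as listed in sysfiles.
DBCC SHRINKFILE ('DbName_Data', 1100)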
December 1, 2006 at 9:06 am
USE DbName
GO
-- Correct any inaccurate page/row counts in the system tables first,
-- so that sp_spaceused reports accurate numbers.
DBCC UPDATEUSAGE(DbName)
-- Report database size, unallocated space, and reserved/data/index/unused.
EXEC dbo.sp_spaceused
You can also use the taskpad view of the database in EM. You should get an idea of how much free space you now have.
December 1, 2006 at 9:19 am
Thanks for the replies. I've done SHRINKFILE as well as SHRINKDATABASE and the UPDATEUSAGE. The size did go down from 1.2 GB to 1.1 GB, which is nice. I just did a little research and found that we have 1,196 char/varchar columns with an average declared size of 59 characters. With that many columns, I expected a more dramatic size difference. Now that I think about it, though, it does make sense. Even though we may have, say, 250 VARCHAR(1000) columns, very few, if any, actually store 1000 characters. Plus, since many of them are not required, they aren't using any space to begin with. So even though an NVARCHAR requires twice the space to store the same value as a VARCHAR, the disk savings realized are not as large as the declared sizes suggest.
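A quick way to see the two-bytes-per-character difference (just a sketch; the literal is arbitrary):

-- NVARCHAR stores 2 bytes per character; VARCHAR stores 1.
SELECT DATALENGTH(N'Phillip') AS NVarcharBytes, -- 14
       DATALENGTH('Phillip') AS VarcharBytes    -- 7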
Thanks again for the input.
Phillip
December 1, 2006 at 11:44 am
For any who might be interested, I did some querying and math, and I figure the real data savings come in at about 21 MB. My guess is that when SHRINKFILE moved data around, more data pages were freed, which resulted in the 100 MB reduction in disk usage. Since the disk savings weren't nearly as big as I expected (which really does make sense now), the more compelling reason (for me) to convert NVARCHARs to VARCHARs is to avoid going over the 8060-byte limit of a row.
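To illustrate the row-limit point (the table definition below is made up): a row must fit in 8060 bytes, and Unicode columns consume that budget twice as fast.

-- Hypothetical: two NVARCHAR(3000) columns can need up to 12000 bytes,
-- so CREATE TABLE succeeds but warns that rows may exceed 8060 bytes.
CREATE TABLE WideRow (
    Notes1 NVARCHAR(3000),
    Notes2 NVARCHAR(3000)
)
-- Declared as VARCHAR(3000) each, the maximum is 6000 bytes and no warning.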
Learn something new every day.
Peace,
Phillip
December 1, 2006 at 11:52 am
So what formula did you come up with to figure out the saved space?
December 1, 2006 at 12:16 pm
It wasn't so much a formula as it was calculating the average actual size of the varchar/char columns. All of our character fields were either nchar or nvarchar, so I calculated the space used by the fields now that they are varchar/char and assumed that to be the amount of space actually saved (since nvarchar takes twice the bytes of varchar for the same value). I wrote a script that loops through all the tables, gets the columns for each table, and issues a SELECT SUM(LEN(ColumnName)) for each column. Then I just did some averaging per row and table. Below is the script. It can be run for all tables in the current database, or for just one by setting the @TableName variable. If you discover incorrect logic, please let me know.
Thanks,
Phillip
************************************************************************************
SET NOCOUNT ON

DECLARE
    @TableName  SYSNAME,
    @ColumnName SYSNAME,
    @sql        VARCHAR(8000),
    @TestMode   BIT

SELECT
    @TestMode  = 0,     -- 1 = print the dynamic SQL instead of running it
    @TableName = NULL   -- e.g. 'Contact' to limit the run to one table

IF OBJECT_ID('tempdb..#ColStats') IS NOT NULL
    DROP TABLE #ColStats

CREATE TABLE #ColStats
(
    TableName    SYSNAME,
    ColumnName   SYSNAME,
    ColumnSize   INT,    -- declared maximum length
    RecCnt       INT,    -- rows in the table
    CurrentUsage INT     -- SUM(LEN(column)) across all rows
)

DECLARE curTables CURSOR LOCAL FAST_FORWARD FOR
    SELECT TABLE_NAME
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_TYPE = 'BASE TABLE'
        AND TABLE_NAME = ISNULL(@TableName, TABLE_NAME)
    ORDER BY TABLE_NAME

OPEN curTables
FETCH NEXT FROM curTables INTO @TableName

WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO #ColStats (TableName, ColumnName, ColumnSize)
    SELECT
        C.TABLE_NAME,
        C.COLUMN_NAME,
        C.CHARACTER_MAXIMUM_LENGTH
    FROM INFORMATION_SCHEMA.COLUMNS C
    WHERE C.TABLE_NAME = @TableName
        AND C.DATA_TYPE IN ('CHAR', 'VARCHAR')
    ORDER BY C.COLUMN_NAME

    DECLARE curColumns CURSOR LOCAL FAST_FORWARD FOR
        SELECT ColumnName
        FROM #ColStats
        WHERE TableName = @TableName

    OPEN curColumns
    FETCH NEXT FROM curColumns INTO @ColumnName

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Total characters actually stored in the column. Note that LEN()
        -- ignores trailing blanks, so CHAR columns are measured at their
        -- trimmed length.
        SET @sql = 'UPDATE #ColStats SET CurrentUsage = (SELECT SUM(LEN([' +
            @ColumnName + '])) FROM [' + @TableName + ']) WHERE TableName = ''' +
            @TableName + ''' AND ColumnName = ''' + @ColumnName + ''''

        IF @TestMode = 1
            PRINT @sql
        ELSE
            EXEC(@sql)

        FETCH NEXT FROM curColumns INTO @ColumnName
    END -- WHILE @@FETCH_STATUS = 0 curColumns

    SET @sql = 'UPDATE #ColStats SET RecCnt = (SELECT COUNT(*) FROM [' + @TableName +
        ']) WHERE TableName = ''' + @TableName + ''''

    IF @TestMode = 1
        PRINT @sql
    ELSE
        EXEC(@sql)

    CLOSE curColumns
    DEALLOCATE curColumns

    FETCH NEXT FROM curTables INTO @TableName
END -- WHILE @@FETCH_STATUS = 0 curTables

CLOSE curTables
DEALLOCATE curTables

IF @TestMode = 1
    SELECT DISTINCT TableName
    FROM #ColStats
    ORDER BY TableName
ELSE
BEGIN
    DECLARE @Results TABLE
    (
        TableName               SYSNAME,
        ColCnt                  INT,
        TotalBytesAllowed       INT,
        AvgBytesAllowed         DECIMAL(8,4),
        TotalBytesUsed          INT,
        AvgBytesUsed            DECIMAL(8,4),
        RowsInTable             INT,
        TotalBytesSavedPerTable DECIMAL(18,4),
        AvgBytesSavedPerRow     DECIMAL(18,4)
    )

    INSERT INTO @Results
    SELECT
        TableName,
        COUNT(ColumnName) ColCnt,
        SUM(ISNULL(ColumnSize, 0)) TotalBytesAllowed,
        SUM(ISNULL(ColumnSize, 0)) * 1.0 / COUNT(ColumnName) AvgBytesAllowed,
        SUM(ISNULL(CurrentUsage, 0)) / CASE WHEN RecCnt > 0 THEN RecCnt ELSE 1 END TotalBytesUsed,
        SUM(ISNULL(CurrentUsage, 0)) * 1.0 / COUNT(ColumnName) / CASE WHEN RecCnt > 0 THEN RecCnt ELSE 1 END AvgBytesUsed,
        RecCnt RowsInTable,
        SUM(ISNULL(CurrentUsage, 0)) * 1.0 / COUNT(ColumnName) TotalBytesSavedPerTable,
        SUM(ISNULL(CurrentUsage, 0)) * 1.0 / COUNT(ColumnName) / CASE WHEN RecCnt > 0 THEN RecCnt ELSE 1 END AvgBytesSavedPerRow
    FROM #ColStats
    GROUP BY TableName, RecCnt
    ORDER BY TableName

    SELECT *
    FROM @Results

    -- Grand totals across all tables.
    SELECT
        SUM(TotalBytesSavedPerTable) TotalBytesSaved,
        SUM(TotalBytesSavedPerTable) * 1.0 / COUNT(*) AvgBytesSavedPerTable,
        SUM(TotalBytesSavedPerTable) * 1.0 / 1024 TotalKBSaved,
        (SUM(TotalBytesSavedPerTable) * 1.0 / COUNT(*)) / 1024 AvgKBSavedPerTable,
        SUM(TotalBytesSavedPerTable) * 1.0 / (1024 * 1024) TotalMBSaved,
        (SUM(TotalBytesSavedPerTable) * 1.0 / COUNT(*)) / (1024 * 1024) AvgMBSavedPerTable
    FROM @Results
END

DROP TABLE #ColStats
December 5, 2006 at 2:41 pm
It has been a while, but I remember doing something similar, and the space didn't free up until I re-indexed the tables that had data type changes. It ended up being a significant size reduction after the reindexing.
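On SQL Server 2000 that would be something like this for each altered table (the table name here is made up):

-- Rebuild all indexes on the table; rebuilding the clustered index
-- rewrites the data rows at their new, narrower size.
DBCC DBREINDEX ('dbo.Contact')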
It can't hurt to try.
Tom
December 5, 2006 at 3:35 pm
Thanks for the post, Tom. I reindexed all tables, shrank the database again, ran the update usage, and did another shrink file. The size is down to just under 1.1 GB, which brings the total savings to right about 130 MB.
As I noted in an earlier post, this does seem reasonable to me, especially now that I've done the math. Had more of these fields been indexed, there might have been more space "reclaimed" after the reindex. But as it is, I doubt that more than 20% of the fields were indexed.
Peace,
Phillip
December 6, 2006 at 10:42 am
NVARCHAR/VARCHAR and NCHAR/CHAR are allocated differently in data pages. If you change the data type from NVARCHAR to VARCHAR, you should not be able to release the free space, because of the way SQL Server stores variable-length data. If you change the data type from NCHAR to CHAR, however, you will.