March 27, 2010 at 6:25 pm
We have a VLDB which hosts a SAP database. The largest table has 600 million rows with 1,4 TB data space and 1,2 TB index space (one clustered and 6 non-clustered indexes). After performing some deletions I tried the following:
- ALTER INDEX x REINDEX ON Table z WITH (ONLINE=ON) only for the clustered index
- Run sp_spaceused for the table before and after the reindexing procedure
The reeindexing took approximately 4 days.
Sp_spaceused for the specific table showed that index space reduced to 12 GB !!! but the database is smaller only by 300 GB.
I have the following questions:
1. Should I run dbcc updateusage for the table or the whole database in order to find the occupied space?
2. Should I first reindex the clustered index and then the non-clustered indexes or the opposite way?
Thanks in advance
PSA
March 28, 2010 at 2:52 am
pandrikopoulos (3/27/2010)
The reeindexing took approximately 4 days.Sp_spaceused for the specific table showed that index space reduced to 12 GB !!! but the database is smaller only by 300 GB.
There's not really enough detail in the original post to say whether this is reasonable or not.
I assume there is a good reason for rebuilding the table data completely? You don't say how many rows were deleted, or what motivated you to undertake this exercise.
Should I run dbcc updateusage for the table or the whole database in order to find the occupied space?
Neither, if you are running SQL Server 2005 or above, unless the database was upgraded from 2000, and you have never run it before. See DBCC UPDATEUSAGE in Books Online for details.
Should I first reindex the clustered index and then the non-clustered indexes or the opposite way?
It depends on why you are rebuilding. In SQL Server 2005 and above, rebuilding the clustered index does not automatically rebuild the non-clustered indexes, so it does not usually make much difference. See http://www.sqlskills.com/BLOGS/PAUL/post/Indexes-From-Every-Angle-What-happens-to-non-clustered-indexes-when-the-table-structure-is-changed.aspx
Four days seems awfully long.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 28, 2010 at 4:05 am
There's not really enough detail in the original post to say whether this is reasonable or not.
I assume there is a good reason for rebuilding the table data completely? You don't say how many rows were deleted, or what motivated you to undertake this exercise.
--I rebuild in order to find out how much space will be left and if it
--worths reindexing in production environment. The
--database is about 7,5 TB and SQL Server 2005 gives the ability to
--perform online reindexing. The table was never rebuilt before since
-- in SQL Server 2000 downtime is needed to perform such operations.
Neither, if you are running SQL Server 2005 or above, unless the database was upgraded from 2000, and you have never run it before. See DBCC UPDATEUSAGE in Books Online for details.
--How can you explain then the difference in sp_spaceused for the
--table and the disk usage of the database?
It depends on why you are rebuilding. In SQL Server 2005 and above, rebuilding the clustered index does not automatically rebuild the non-clustered indexes, so it does not usually make much difference. See http://www.sqlskills.com/BLOGS/PAUL/post/Indexes-From-Every-Angle-What-happens-to-non-clustered-indexes-when-the-table-structure-is-changed.aspx
--I 'm familiar with SQL Server 2005 changes regarding reindexing. I
--reindexed the clustered index because I think that more space will be
--left available. Apart from that rebuilding all the indexes will be time
--and space consuming. In order to reindex the clustered index 1 TB
--free space was needed.
PSA
March 28, 2010 at 4:51 am
The table was never rebuilt before since in SQL Server 2000 downtime is needed to perform such operations.
The question is, have you run DBCC UPDATEUSAGE on the whole database since the upgrade?
How can you explain then the difference in sp_spaceused for the table and the disk usage of the database?
See above.
I'm familiar with SQL Server 2005 changes regarding reindexing. I reindexed the clustered index because I think that more space will be left available.
What makes you think that? What analysis did you do? What benefit would you expect to see from having extra space available inside the database? There are good reasons for rebuilding a clustered index, but it is not normally about free space.
Apart from that rebuilding all the indexes will be time and space consuming. In order to reindex the clustered index 1 TB free space was needed.
So you have 1TB of free space? Now I am confused about what you are trying to achieve.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 28, 2010 at 5:11 am
The question is, have you run DBCC UPDATEUSAGE on the whole database since the upgrade?
--The upgrade took place 7 months ago and DBCC UPDATEUSAGE run
after it.
What makes you think that? What analysis did you do? What benefit would you expect to see from having extra space available inside the database? There are good reasons for rebuilding a clustered index, but it is not normally about free space.
--Since the application is space consuming (150 GB extra database space per month) I expect that reindexing will save a few GB. As you probably know deleted records are not physically removed in SQL Server for performance reasons. Reindexing procedure will free the unneeded space and will also defragment the fragmented indexes. The answer to your question is that I tried the reindexing in disaster environment in order to find out how much space will be freed and if the application will be performing better due to less fragmentation.
So you have 1TB of free space? Now I am confused about what you are trying to achieve.
See above.
March 28, 2010 at 5:49 am
when upgrading from SQL Server 2000 to 2005 you are well advised to run DBCC UPDATEUSAGE and rebuild all indexes.
Having said that when carrying out normal maintenance on a SQL Server 2005 database dont just blindly rebuild indexes. Query the DMV "sys.dm_db_index_physical_stats", if fragmentation is not too heavy then re organise is a better option
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" π
March 28, 2010 at 5:54 am
pandrikopoulos (3/28/2010)
As you probably know deleted records are not physically removed in SQL Server for performance reasons.
That is only strictly true for a heap, not a table with a clustered index. What I was really looking for here was confirmation that you have examined the avg_page_space_used_in_percent column on sys.dm_db_index_physical_stats to determine that there is significant free space to be released.
If so, and given your aims, I would expect you to run an ALTER INDEX REORGANIZE rather than REBUILD. An ONLINE rebuild is going to require significant version store space in tempdb if there are concurrent data modifications. It is quite likely that you will run out of tempdb, and have to wait while the entire operation rolls back.
A reorganize uses very few system resources, is highly concurrent, requires only one additional data page for temporary workspace, and can be stopped at any time, retaining the work done so far. Bear in mind that it is fully logged though - a point made even more important if the table contains LOB data. By default, LOB data is compacted during a reorganize of the clustered index.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 28, 2010 at 5:57 am
You might also want to consider partitioning this table at some stage, to make maintenance tasks easier. Local partitioned views can be made to work well if you do not run Enterprise Edition.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 28, 2010 at 6:03 am
Perry Whittle (3/28/2010)
when upgrading from SQL Server 2000 to 2005 you are well advised to run DBCC UPDATEUSAGE and rebuild all indexes.Having said that when carrying out normal maintenance on a SQL Server 2005 database dont just blindly rebuild indexes. Query the DMV "sys.dm_db_index_physical_stats", if fragmentation is not too heavy then re organise is a better option
Perry, our posts crossed π
I would go further though, when moving from 2000 to 2005 - I would re-create the database wherever possible. There were just so many things that changed, and don't work as one might expect if the database is upgraded in-place. Allocation unit ids, full-text, usage statistics, data purity...the list just goes on and on.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 28, 2010 at 9:49 am
Perry,
I agree in general with your suggestions but you should take into account that the database hosts a SAP application. SAP has some restrictions in the way a database is being administered such as:
- index reorganization is not advised.
- even if partitioning is supported in SQL 2005 SAP supports it only for SAP BW (SAP Data Warehouse platform) and not for other SAP applications
-at the time of upgrade from SQL 2000 to SQL 2005 the size of the database was about 6,5 TB. There was a window for updating statistics but I don't know how many days are needed for reorganizing all the indexes!!!
About reindexing vs reorganizing:
- Reorganize is less resource consuming as you mentioned
- Reorganize work is not lost if it stops unexpectedly
- Reindex has better results in fragmentation
- Reindex takes less time than reorganizing
- A new event exists in Profiler which audits the number of records that are inserted during index rebuild, and you can monitor reindex progress
The VLDB which is about 7,5 TB has two big tables: one with 700 million records and 8 indexes (2,5 TB total size) and one with 4.5 billion rows (!!!) and one clustered index (2 TB total size). I decided to test reindex procedure because it offers the online option and secondly, because the application is batch-only and there is a time window of 4-5 days to reindex the large tables even in offline mode.
The main reason for reindexing is to reduce the database size and reduce fragmentation which is above 60% in the 2 referenced tables.
You should see the following article of Paul Randal about ghost records that are deleted asynchronously under some circumstances:
http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/23/644607.aspx
Regards,
PSA
March 28, 2010 at 10:29 am
Do let us know how you get on rebuilding that clustered index online in production π :w00t: :laugh:
SAP don't advise index reorganization eh? Sigh.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 28, 2010 at 2:02 pm
pandrikopoulos (3/28/2010)
Perry,I agree in general with your suggestions but you should take into account that the database hosts a SAP application. SAP has some restrictions in the way a database is being administered such as:
- index reorganization is not advised.
- even if partitioning is supported in SQL 2005 SAP supports it only for SAP BW (SAP Data Warehouse platform) and not for other SAP applications
-at the time of upgrade from SQL 2000 to SQL 2005 the size of the database was about 6,5 TB. There was a window for updating statistics but I don't know how many days are needed for reorganizing all the indexes!!!
About reindexing vs reorganizing:
- Reorganize is less resource consuming as you mentioned
- Reorganize work is not lost if it stops unexpectedly
- Reindex has better results in fragmentation
- Reindex takes less time than reorganizing
- A new event exists in Profiler which audits the number of records that are inserted during index rebuild, and you can monitor reindex progress
The VLDB which is about 7,5 TB has two big tables: one with 700 million records and 8 indexes (2,5 TB total size) and one with 4.5 billion rows (!!!) and one clustered index (2 TB total size). I decided to test reindex procedure because it offers the online option and secondly, because the application is batch-only and there is a time window of 4-5 days to reindex the large tables even in offline mode.
The main reason for reindexing is to reduce the database size and reduce fragmentation which is above 60% in the 2 referenced tables.
You should see the following article of Paul Randal about ghost records that are deleted asynchronously under some circumstances:
http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/23/644607.aspx
Regards,
PSA
Hi
In my experience it doesnβt matter what it hosts. You would normally investigate a platform upgrade and plan for any oddities whatever the size. Microsoft recommends strongly that after an upgrade from SQL 2000 to 2005 you run DBCC UPDATEUSAGE, rebuild your indexes and run DBCC CHECKDB with DATA_PURITY. If the db is 7TB then you have a lot to do donβt you and need to factor this in to your planned\scheduled downtime. The upgrade from 2000 to 2005 is a large leap in terms of platform migrations.
Reindexing the database, its just going to grow in size due to the requirements during a reindex operation (free space required).
Paul White NZ (3/28/2010)
It is quite likely that you will run out of tempdb, and have to wait while the entire operation rolls back.
assuming a sort operation is necessary and you have specified to use tempdb (the default is off)
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" π
March 28, 2010 at 8:38 pm
Perry Whittle (3/28/2010)
Paul White NZ (3/28/2010)
It is quite likely that you will run out of tempdb, and have to wait while the entire operation rolls back....assuming a sort operation is necessary and you have specified to use tempdb (the default is off)
Not at all. I was explicitly referring to the version store space needed for concurrent data modification operations while the ONLINE index is being rebuilt. Assuming that there is significant activity on the production system, and the ONLINE operation is going to run for a while, the version store in tempdb may well run out of space.
See the following Microsoft Technical Article for details:
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 28, 2010 at 11:55 pm
SAP is very restrictive regarding SQL Server configuration settings, administrative options etc.
Since a SAP database is usually large enough, they only recommend to update statistics after an upgrade takes place.
March 29, 2010 at 3:27 am
Hi Paul
got that
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" π
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply