January 4, 2009 at 8:31 am
We have a 100+GB database that eventually will grow to 500GB+. Reindexing takes a long time and the database is on a SAN. What things can be done to speed up reindexing and update statistics on very large databases?
Data and Log files are on their own RAID 10 SAN drives. We are seeing disk bottlenecks on the data drive, so we will be separating the large database from all other databases for this particular application. We are also considering moving indexes to a dedicated RAID 10 drive.
What else can be done to speed up reindexing and update stats?
Thanks, Dave
January 4, 2009 at 8:52 am
DBADave (1/4/2009)What else can be done to speed things up?
I would work on fine tuning the application, focusing on the most-used queries.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
January 4, 2009 at 9:06 am
DBADave (1/4/2009)
We have a 100+GB database that eventually will grow to 500GB+.
A 100GB database is not large. Even 500GB doesn't qualify as very large.
Reindexing takes a long time and the database is on a SAN. What things can be done to speed up reindexing and update statistics on very large databases?
Only rebuild the indexes that really need it (large indexes with fragmentation > 30%).
Only update statistics that you know are problematic. Be aware that an index rebuild updates the stats on that index.
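As a rough sketch of what that selective approach can look like (the table, index, and statistic names below are placeholders, not anything from this thread):
-- Check fragmentation for one table's indexes (LIMITED keeps the scan cheap)
SELECT index_id, avg_fragmentation_in_percent, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.OrderDetail'), NULL, NULL, 'LIMITED')
WHERE index_id > 0; -- ignore heaps
-- Rebuild only when the index is large and badly fragmented (the usual rule of thumb is > 30%)
ALTER INDEX IX_OrderDetail_ProductID ON dbo.OrderDetail REBUILD;
-- The rebuild already refreshes that index's statistics; other problematic statistics
-- can be updated individually (Stat_ProductID is a placeholder statistic name)
UPDATE STATISTICS dbo.OrderDetail (Stat_ProductID) WITH FULLSCAN;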
Data and Log files are on their own RAID 10 SAN drives. We are seeing disk bottlenecks on the data drive so we will be separating the large database from all other database for this particular application. We are also considering moving indexes to a dedicated RAID 10 drive.
That'll only help if the IO system is the root cause of the perf problems, which is not usually the case.
Find the slowest queries or most IO intensive queries that you have (profiler works well here). Take the top 5 or so and optimise them. That may involve changing code, it may involve changing indexes, it usually involves both.
Implement the changes, take another profiler trace, find the next 5 worse-performing queries and fix those.
Repeat until system performance is acceptable.
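If a long Profiler trace is too heavy to leave running, the SQL 2005 DMVs give a similar list of IO-heavy statements; a minimal sketch (ranking by logical reads is just one reasonable choice):
-- Top 5 cached statements by total logical reads
SELECT TOP (5)
    qs.total_logical_reads,
    qs.execution_count,
    qs.total_elapsed_time / 1000 AS total_elapsed_ms,
    SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
        (CASE qs.statement_end_offset
             WHEN -1 THEN DATALENGTH(st.text)
             ELSE qs.statement_end_offset
         END - qs.statement_start_offset) / 2 + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;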
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 4, 2009 at 10:30 pm
So it sounds like I can't speed up the reindexing. I consider this application on the fast track to becoming large. It is projected to be over one terabyte within two years and will grow by about 250GB per year across 18 databases. The largest database is 110GB and may grow by 15-20GB per week, but it is still too soon to tell. We already have conditional rebuild logic built into the process, only performing a Reorg or Rebuild where appropriate.
I understand the part about tuning the code, but we definitely have disk bottlenecks. I'm constantly running Perfmon traces checking both disk queuing and Avg Disk Sec/Read, Avg Disk Sec/Write and Avg Disk Sec/Transfer. Queuing isn't the best indicator, but Avg Disk Sec/Read is 68 milliseconds over a 24-hour period, and if I include reindexing over a 48-hour period the average is 119 milliseconds. That's far from where SAN performance should be. Avg Disk Sec/Write isn't as bad, but is still 38 milliseconds over a 24-hour period. Not good. That's just the data drive. The log drive is also showing poor performance, although not as bad, and most of it occurs during the transaction log backups that run every 15 minutes. The transaction log backups should not tax a SAN to the point of a disk bottleneck every 15 minutes. I have suspected SAN configuration issues, but that has yet to be determined.
Our top two wait types are CXPACKET and PAGEIOLATCH_SH. CXPACKET is a parallelism problem we've warned the developers about; most of the CXPACKET waits can be prevented by reworking their JOINs. But PAGEIOLATCH_SH indicates a disk bottleneck. The wait times for these two types are more than double the third-largest wait type. For now I am trying to speed up reindexing, if possible, while also working with our server team to review the SAN configuration and check for physical disk fragmentation on the SAN as a possible issue.
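For reference, the wait-type ranking described above comes straight out of a DMV; a minimal sketch (the exclusion list is only a starting point, not exhaustive):
-- Top waits accumulated since the last restart (or since the counters were cleared)
SELECT TOP (10)
    wait_type,
    wait_time_ms,
    signal_wait_time_ms,
    waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'SQLTRACE_BUFFER_FLUSH', 'BROKER_TASK_STOP')
ORDER BY wait_time_ms DESC;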
Thanks, Dave
January 4, 2009 at 11:54 pm
Hello,
If your environment is Production and 24x7, you can think about INDEXDEFRAG instead of reindexing. Also, if you are using SQL Server 2005 Enterprise Edition you can do the reindexing online. Keep in mind that INDEXDEFRAG will not update statistics.
But this definitely won't reduce the maintenance time.
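Roughly, the two options look like this (the database, table, and index names are placeholders; ONLINE = ON requires Enterprise Edition):
-- DBCC INDEXDEFRAG works in small transactions, stays online, and can be stopped
-- without losing the work done so far, but it does not update statistics
DBCC INDEXDEFRAG (MyDatabase, 'dbo.OrderDetail', IX_OrderDetail_ProductID);
-- SQL Server 2005 equivalents
ALTER INDEX IX_OrderDetail_ProductID ON dbo.OrderDetail REORGANIZE; -- same idea as INDEXDEFRAG
ALTER INDEX IX_OrderDetail_ProductID ON dbo.OrderDetail REBUILD WITH (ONLINE = ON); -- Enterprise only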
You need to find out which objects in the database need to be rebuilt or defragmented, as mentioned by Gail Shaw.
Hope this will help.
---------------------------------------------------
"Thare are only 10 types of people in the world:
Those who understand binary, and those who don't."
January 5, 2009 at 12:07 am
DBADave (1/4/2009)
Queuing isn't the best indicator, but Avg Disk Sec/Read is 68 milliseconds over a 24-hour period, and if I include reindexing over a 48-hour period the average is 119 milliseconds. That's far from where SAN performance should be. Avg Disk Sec/Write isn't as bad, but is still 38 milliseconds over a 24-hour period. Not good.
That's pretty decent. Usual recommendations for those counters are 10-50ms. I've seen systems where those are as high as 1.2 seconds.
Are the drives in the SAN LUNs dedicated? Is there contention on the switch?
Do you have SAN-specific performance monitoring tools? If not, ask the SAN vendor for tools specific to your SAN.
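Alongside the vendor tools, the per-file latencies SQL Server itself tracks are worth comparing against the Perfmon numbers; a quick sketch:
-- Average read/write latency per database file since the instance started
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS avg_read_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
ORDER BY avg_read_ms DESC;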
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 5, 2009 at 12:18 am
Oh, one other thought.
If you have a DB growing that fast with performance problems, you may want to consider getting a specialist in to analyse and make recommendations. Especially if you have no one who really knows that stuff in-house.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 5, 2009 at 1:56 am
DBADave (1/4/2009)
We have a 100+GB database that eventually will grow to 500GB+. Reindexing takes a long time and the database is on a SAN. What things can be done to speed up reindexing and update statistics on very large databases? Data and Log files are on their own RAID 10 SAN drives. We are seeing disk bottlenecks on the data drive, so we will be separating the large database from all other databases for this particular application. We are also considering moving indexes to a dedicated RAID 10 drive.
What else can be done to speed up reindexing and update stats?
Thanks, Dave
Try this script, which rebuilds/reorganizes only fragmented indexes.
DECLARE @sql nvarchar(max)
DECLARE @dbname VARCHAR(max)
-- Note: sys.sysdatabases includes the system databases (master, model, msdb, tempdb)
DECLARE c_db cursor FAST_FORWARD FOR
select name from sys.sysdatabases
OPEN c_db
FETCH NEXT FROM c_db INTO @dbname
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql =
'use ['+ @dbname +'];
SET NOCOUNT ON;
DECLARE @objectid int;
DECLARE @indexid int;
DECLARE @partitioncount bigint;
DECLARE @schemaname nvarchar(max);
DECLARE @objectname nvarchar(max);
DECLARE @indexname nvarchar(max);
DECLARE @partitionnum bigint;
DECLARE @partitions bigint;
DECLARE @frag float;
DECLARE @command nvarchar(max);
-- Conditionally select tables and indexes from the sys.dm_db_index_physical_stats function
-- and convert object and index IDs to names.
declare @dbid bigint;
set @dbid = DB_ID();
print ''Database: '' + convert(varchar, @dbid) + '' - '' + db_name(@dbid);
SELECT
object_id AS objectid,
index_id AS indexid,
partition_number AS partitionnum,
avg_fragmentation_in_percent AS frag
INTO #work_to_do
FROM sys.dm_db_index_physical_stats (@dbid, NULL, NULL , NULL, ''LIMITED'')
WHERE avg_fragmentation_in_percent > 10.0 AND index_id > 0;
-- Declare the cursor for the list of partitions to be processed.
DECLARE partitions CURSOR FOR SELECT * FROM #work_to_do;
-- Open the cursor.
OPEN partitions;
-- Loop through the partitions.
WHILE (1=1)
BEGIN;
FETCH NEXT
FROM partitions
INTO @objectid, @indexid, @partitionnum, @frag;
IF @@FETCH_STATUS < 0 BREAK;
SELECT @objectname = QUOTENAME(o.name), @schemaname = QUOTENAME(s.name)
FROM sys.objects AS o
JOIN sys.schemas as s ON s.schema_id = o.schema_id
WHERE o.object_id = @objectid;
SELECT @indexname = QUOTENAME(name)
FROM sys.indexes
WHERE object_id = @objectid AND index_id = @indexid;
SELECT @partitioncount = count (*)
FROM sys.partitions
WHERE object_id = @objectid AND index_id = @indexid;
-- 30 is an arbitrary decision point at which to switch between reorganizing and rebuilding.
IF @frag < 30.0
SET @command = N''ALTER INDEX '' + @indexname + N'' ON '' + @schemaname + N''.'' + @objectname + N'' REORGANIZE'';
IF @frag >= 30.0
SET @command = N''ALTER INDEX '' + @indexname + N'' ON '' + @schemaname + N''.'' + @objectname + N'' REBUILD'';
IF @partitioncount > 1
SET @command = @command + N'' PARTITION='' + CAST(@partitionnum AS nvarchar(10));
EXEC (@command);
PRINT N''Executed: '' + @command;
END;
-- Close and deallocate the cursor.
CLOSE partitions;
DEALLOCATE partitions;
-- Drop the temporary table.
DROP TABLE #work_to_do;
'
--PRINT (@SQL)
EXEC(@SQL)
FETCH NEXT FROM c_db INTO @dbname
END
CLOSE c_db
DEALLOCATE c_db
January 5, 2009 at 7:53 am
I appreciate the script, but we are already performing defragmentation based upon different fragmentation levels.
I'm working on an online index rebuild script, but unfortunately you can't rebuild indexes online on tables containing large object data type columns. That being said, we should still see some benefit from online rebuilds.
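One rough way to find the tables that will refuse an online rebuild, so they can be scheduled separately (a sketch only):
-- Tables containing text/ntext/image/xml or (max) columns, which block
-- ONLINE = ON index rebuilds in SQL Server 2005
SELECT DISTINCT s.name AS schema_name, t.name AS table_name
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
JOIN sys.columns AS c ON c.object_id = t.object_id
WHERE c.system_type_id IN (34, 35, 99, 241) -- image, text, ntext, xml
   OR c.max_length = -1 -- varchar(max), nvarchar(max), varbinary(max)
ORDER BY schema_name, table_name;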
Hi Gail,
In regards to the counters I am seeing, I still believe they indicate a bottleneck, based upon what I've read on the Internet. However, to play it safe, I contacted EMC this morning to ask what counters I should be looking at and what values are considered acceptable. I will update this post when they reply.
Thanks everyone
Dave
January 5, 2009 at 8:18 am
Several of my customers have had the same issues and I now use the defrag. I personally have yet to run into a situation where a full reindex was absolutely required (that doesn't mean I never will, just that I never have). We have a good handful of half-billion-plus row tables in our data warehouses and this has worked very effectively with little impact on the system. You can stop the defrag during peak hours and resume again when the server is quiet. Perhaps there are others who can tell you why this won't work; I just have never been bitten by this wonderful little DBCC process. Getting caught in a reindex-gone-wild, to the contrary, has been a somewhat nasty and embarrassing experience, which I now studiously avoid on large tables. One caveat... I'm a wiener who does consider 100 million rows to be large (no rowcount envy here). If you want to call attention to yourself, use the reindex. On the other hand (IMHO), a defrag should be applied often (to moderate-to-large, fragmented tables) and quietly so.
January 5, 2009 at 9:12 am
DBADave,
The next time the index maintenance is running, check whether it specifically is accruing the CXPACKET waits that you describe. If it is, then add the MAXDOP = 1 option to your rebuild scripts and remove parallelism from the process. In my environment, on a 300GB database, this meant the difference between 12-hour rebuilds and 3-hour rebuilds, and these were online rebuilds in Enterprise Edition. It also reduced the PAGEIOLATCH waits because the disks stopped having to serve multiple parallel processes.
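For reference, the option goes on the rebuild itself, roughly like this (the index and table names are placeholders):
-- Keep the rebuild single-threaded and online
-- (ONLINE = ON is an Enterprise Edition feature)
ALTER INDEX IX_OrderDetail_ProductID ON dbo.OrderDetail
REBUILD WITH (ONLINE = ON, MAXDOP = 1);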
How many data files are there in this database currently, and how are they allocated to different filegroups? One thing I did a few months back, after talking with Paul Randal and reading some of his posts, was to consolidate back down to one massive file per filegroup, and this improved performance dramatically in my larger databases. Unless you are splitting files for filegroup manageability, space restrictions, or because you are having bitmap contention similar to what tempdb can have from object turnover, you would probably be best served with a single file for the database.
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
My Blog | Twitter | MVP Profile
Training | Consulting | Become a SQLskills Insider
Troubleshooting SQL Server: A Guide for Accidental DBAs
January 5, 2009 at 9:24 am
I'll join in on the "FREQUENT REORGANIZE" idea... On my 24x7 servers reindexing is banned!
* Noel
January 5, 2009 at 11:47 am
I'll give the MAXDOP 1 option a try. We log all activity every 10 minutes so I checked the logs for the time period when the reindex job last ran and I can see one CXPACKET and eight PAGEIOLATCH_SH waits for the same process every 10 minutes while it was running. Good suggestion. Thanks.
Currently we only have one data file per database. I looked into using multiple data files, but recently found some MSDN blogs where Microsoft engineers said it is a myth that using multiple data files per database yields better disk performance. They had several points as to why this is the case, but one was that the number of threads and disk spindles remains unchanged unless you spread multiple files across multiple dedicated drives. They said people read about how you can use multiple data files to improve tempdb performance and assume that also pertains to other databases. That's not the case. I wonder if this was the same reasoning Paul gave you. For that reason we are considering moving an entire database to its own RAID 10 and possibly that database's indexes to their own RAID 10.
In regards to running daily reindexing that is something we tried initially, but the development team asked us to not do this because it was interfering with their batch processing. The application is not yet production LIVE, but is being aggressively developed and deployed with a February deadline. Prior to going LIVE we plan on implementing an online reindexing process.
Just this morning I noticed several hundred of the vendor's indexes contained fill factors of 25 or 30%. That would help explain the slow reindex performance and possibly some of the slow disk performance. I asked the vendor why they are using values so low and they said their indexes are supposed to have a 90% fill factor. Time to fix the fill factors.
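Finding and fixing those could look something like this (the 90 simply matches what the vendor says they intended; index and table names are placeholders):
-- Indexes whose fill factor leaves most of every page empty (0 means the server default)
SELECT OBJECT_NAME(i.object_id) AS table_name, i.name AS index_name, i.fill_factor
FROM sys.indexes AS i
WHERE i.fill_factor BETWEEN 1 AND 50;
-- Rebuild with the intended fill factor
ALTER INDEX IX_OrderDetail_ProductID ON dbo.OrderDetail REBUILD WITH (FILLFACTOR = 90);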
Thanks, Dave
January 5, 2009 at 11:47 am
noeld (1/5/2009)
I'll join in on the "FREQUENT REORGANIZE" idea... On my 24x7 servers reindexing is banned!
At some point I would expect that you have to do a rebuild to get the index back in order. Even Microsoft recommends a rebuild if the fragmentation climbs over 30%, as it is better for performance. This is why Enterprise Edition has online indexing as an option, so that the impact to 24x7 servers is minimized. If you never rebuild indexes, do you ever run DBCC CHECKDB against the servers? That is going to be more invasive than spreading index rebuilds out over time.
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
My Blog | Twitter | MVP Profile
Training | Consulting | Become a SQLskills Insider
Troubleshooting SQL Server: A Guide for Accidental DBAs
January 5, 2009 at 11:56 am
DBADave (1/5/2009)
They said people read about how you can use multiple data files to improve tempdb performance and assume that also pertains to other databases. That's not the case. I wonder if this was the same reasoning Paul gave you. For that reason we are considering moving an entire database to its own RAID 10 and possibly that database's indexes to their own RAID 10.
Yep, that is exactly what Paul said, in fact he blogged it as well.
In regards to running daily reindexing that is something we tried initially, but the development team asked us to not do this because it was interfering with their batch processing. The application is not yet production LIVE, but is being aggressively developed and deployed with a February deadline. Prior to going LIVE we plan on implementing an online reindexing process.
We have a large Oracle environment here as well. One thing we do in the multi-TB data warehouses is track index rebuilds/reorganizations in a table in the DBA schema. It contains a row for every index in the database, and a nightly process updates their fragmentation levels if they haven't been rebuilt/reorganized in the last X days. Then it works them in reverse order of size/fragmentation/impact. The impact column is one that a DBA can set to a value of 1-10 to mark how important the index(es) are to actual functionality; for instance, every index on the base fact table is extremely important. The other thing tracked is the start/finish time of the last rebuild, so the job manager can determine whether it can complete a rebuild within the established rebuild window and, if not, pick another index that will fit in the remaining time. Granted, this is a large implementation, but it works really well to keep things straight.
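A sketch of what such a tracking table might look like in T-SQL (the names and columns here are guesses at the scheme described above, not the actual implementation):
-- Hypothetical per-index maintenance tracking table
CREATE TABLE dbo.IndexMaintenanceLog
(
    object_id         int         NOT NULL,
    index_id          int         NOT NULL,
    index_name        sysname     NOT NULL,
    page_count        bigint      NULL,                 -- size, refreshed nightly
    fragmentation_pct float       NULL,                 -- refreshed nightly
    impact            tinyint     NOT NULL DEFAULT (5), -- 1-10, set by the DBA
    last_action       varchar(10) NULL,                 -- 'REBUILD' or 'REORGANIZE'
    last_start_time   datetime    NULL,
    last_finish_time  datetime    NULL,
    CONSTRAINT PK_IndexMaintenanceLog PRIMARY KEY (object_id, index_id)
);
The last start/finish times give the job manager an estimated duration, so it can skip an index that will not finish inside the remaining maintenance window.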
Just this morning I noticed several hundred of the vendor's indexes contained fill factors of 25 or 30%. That would help explain the slow reindex performance and possibly some of the slow disk performance. I asked the vendor why they are using values so low and they said their indexes are supposed to have a 90% fill factor. Time to fix the fill factors.
Thanks, Dave
Yeah, that will make more of an impact than you would think. Expect to see your growth rates decrease as well, because you are no longer leaving three-quarters of every page empty because of the fill factors.
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
My Blog | Twitter | MVP Profile
Training | Consulting | Become a SQLskills Insider
Troubleshooting SQL Server: A Guide for Accidental DBAs