Andrew Kelley from Solid Quality Learning did this one, which was good.
I was worried it would be about maintenance plans, but it wasn't.
Instead it was on good real life maintenance for your server and how to
handle it. He did start out talking at the Maintenance Plans abstract
you away from what's happening and can be limiting, so be sure you do
understand what's going on, don't just blindly assume the plan wizard
knows best.
Hardware
Since the maintenance can stress SQL, it's a good idea to look at
hardware. First, memory. It's better to have too much than too little.
Most SQL Servers are supposed to be memory bound, and memory is an easy
and cheap way to boost performance.
For CPUs, more is better and multi-cores are definitely recommended. As
your data size grows and load grows, even in maintenance, having more
CPUs or cores helps. One caveat, it's recommended to turn off
hyperthreading unless you really test. Since it's one core/CPU, you
could end up with threads performing different activities, and slow
things down. I still haven't see great metrics on this, but the
consensus seems to be leave this off. So if you have an older CPU, be
sure that's the case.
MAXDOP - You can set this at the server or statement level. Which can
be handy if you are working on a system that is being used. You can
reduce the use of CPUs by your maintenance statement by setting this to
something other than 0 (use app procs).
Disks
Know your config. RAID 5 has been the standard, but it's gone out of
favor for databases. It's write intensive and people seem to be more
worried about failures. Andrew recommended R0+1 for most things, data,
logs, tempdb. R1 was for the OS, logs, and tempdb. Not sure why not for
data, perhaps because large files. R0 is not the ideal.
Beware of SAN or other storage people. Be sure your separate drives are separate physically.
DBCC CheckDB
Checks all rows, text offsets, etc. to be sure that they are correctly linked. 2005 added many more options and checks.
- data purity - values within range for data type.
- Row overflow pointers
- Service Broker checks
- XML and indexed view checks
- torn page and checksum - set this for each database for the page level.
Be sure that when you upgrade, you run checks. It is possible for data to exceed type limits and be invalid.
In 2000, you could run CHECKDB and got CHECKALLOC and CHECKTABLE. 2005
adds CHECKCATALOG. SS2K also used a table level schema lock and could
block log truncations. In 2005, it uses an internal snapshot to get a
transactionally consistent view and doesn't block the log.
Run this on master, model, and msdb. They can get corrupt as well. Not as likely, but it can happen. Run on ALL databases.
When to run? Every night it best, but you have to weigh the risks v the
time and resources you have. Run as often as you can.
How can you speed it up?
- Run physical_only to skip some checks. Does a good job of
checking integrity, so use this if you cannot run the entire check.
- Run it with TABLOCK. It runs faster, but it blocks other connections.
- Run it against a recently restored backup. This allows you to run this on another backup.
- Use no_infomsgs
File-level Fragmentation
- Normal fragmentation you might see on your workstation.
- Create your files as large as needed for needs at once. This prevents growth, which often fragments files.
- Don't let auto-grow kick in. Proactively add space when needed.
Auto growths can be expensive and pause your server for the connections
doing work. Instant file initialization helps, but requires Enterprise
Edition, permissions for the SQL Server service account, etc. NOT TRUE
FOR LOG FILES
- Dedicate the drives to SQL Server, not other apps.
- Use an OS level defragmenting tool.
Shrinking a Database
- This should almost never happen. This removes free space, which fragments files, causes a need for growth, etc.
- costly operation and FULLY LOGGED
- Keep the free space you need for maintenance activities.
- If you need to do this, use SHRINKFILE, not SHRINKDB. More
control and lets you stop after each file. It is online and it's
stoppable.
Optimizations
Why? to deal with logical fragmentation and page fill.
Logical Fragmentation - The physical order does not match the
logical order. So index on names, alphabetically, might be stored as
Allen, George, Henry, Rhonda, Ken, Tom, Billy. So the server jumps
around through the pages back and forth to follow the index. Page
splits cause this.
Page fullness - Also known as fillfactor. How much free space on
a page. Caused by data changes. Reindexing regularly it to get this to
your ideal value, not deal with corruption.
In 2000 we used
- DBCC REINDEX
- DBCC INDEXDEFRAG
- DBCC SHOWCONTIG
Now SS2K5 uses
- ALTER INDEX REBUILD
- ALTER INDEX REORGANIZE
- sys.dm_db_index_physical_Stats
The DBCCs are being deprecated and will not be in Katmai, so change your code now!
INDEX REBUILD
- does all or one index and deals with existing RI and can work online
- Adheres to fill factors and padding, updates stats, and can uses parallelism.
- Best chance of getting contiguous index.
However, there are cons.
- Locks the entire table for the entire operation if it's offline.
- One large transaction, so log space can be an issue
- Can take awhile and uses lots of resouces. Requires 1.2x size of the index.
REORGANIZE
- minimal locks, fills pages up to the fill factor
- can be stopped and restarted
- log can be backed up while this occurs
- doesn't require extra free space.
HOWEVER, the cons
- Can log several times the size of the index over time because pages can move more than once.
- Doesn't update statistics and is single threaded. Rebuilds the leaf level only and works only on one file at a time.
Which is better? It depends. Use the one that works best in each
situation. You may have servers that require REORGANIZE, but others
that can use REBUILD. Need to consider the size of tables, maintenance
window, size of tables, hardware, online requirements, etc. You don't
necessarily need to do every index on every table every night. Might
have different schedules for different databases. Check under
SHOWCONTIG (SS2K) or under the DMV (SS2K5) is an example for reindexing
based on thresholds.
Statistics
- Do not need to do this if doing a REBUILD. You don't necessarily need
to do this unless your data changes dramatically. If you have
auto-update turned on, that probably works.
Backups
What type of backups should you use? It depends.
What are your requirements? The recovery time is a big factor here. How
fast you need to recover helps determine what types of backups to take.
Full backup every night is a standard practice. Data loss is the other
factor. Less tolerance for data loss means more frequent backups. And
of course, the database size is a factor (data size, not allocated
size). The larger it is, the less choice you may have.
Do we need to verify backups? Verify does not detect corruption (unless
you use the checksum option), but verify makes sure that you can read
back what you wrote.
Need to test your backups! Ensure that you can restore them or they could be useless.
Disk or Tape? Disk first is preferred, not the same disk as the data
resides on, but it's quicker, cleaner, and available if you need a
quick restore. Move to tape for long term storage. If you use remote
drives, use UNC pathing, not mapped drives. Third party tools are a
good idea to compress the backups and save space and time.
Don't use INIT or the same file every day. It loses flexibilty and
increases the chances of problems. The maintenance plan does this for
you, but lots of code samples here at SQLServerCentral.com or elsewhere
on the web.
Watch our for backup history. Every backup gets recorded in msdb. The
maintenance plans can clean up the history, but if you don't use them,
be sure that you clean up with sp_deletebackuphistory periodically.
Scheduling - Know when things are happening. I agree with this and in
most of my environments my servers have been small enough I could
arrange things well. But I do see where something like sqlSentry is
needed if you have lots of jobs. Since maintenance is resource
intensive, be sure that you don't schedule things a the same time, like
everything at midnight. You'll make everything take longer.
Things not built in, but need to be done
- check for free disk space
- check for free space in files
- scripting jobs
- scripting dbs
- delete old backup files (or log files
- chcek for log errors
- check for job failures.