Introduction
The Index Wizard is a handy tool shipped as part of the SQL 7 Profiler that
can analyze a set of SQL queries and suggest index changes that could improve
their performance. Here I will take a look at how to get the best results out of
it.
How The Wizard Works
You need to point the Wizard at a copy of your database, "feed" the
Index Wizard with a "Workload" file (which is a set of representative
SQL commands that are used against the database you wish to tune) and just let
it go. The Wizard examines the queries in the workload file and tries to
determine if indexing changes would improve their performance.
Building A Workload File
If you are developing a new database, then what these
"representative" commands are may well be guesswork, and if SQL
queries are generated on the fly by web clients for instance, there may be
infinite variations of SQL queries that can be run – in this case you have to
make an educated guess. With other applications all your queries may be
controlled through a small set of stored procedures, you just need to fabricate
calls to these SPs with some "representative" parameters.
With live systems the problem of getting a workload file pretty much goes
away, SQL profilers tracing function will create one for you – just point
Profiler at the live database and let it record some actual user queries. Don’t
forget to add the database ID in the filters section (simply run "SELECT
DB_ID()" in the relevant database to find out it's ID) to exclude queries
for other databases – you can also cut out all the "connection" info
in order to keep you trace file small.
Running The Wizard
As a matter of preference I always run profiler on a remote machine and store
the output in a .trc file on my remote machine, minimizing load on the SQL
Server. This is particularly important in live environments as you want to add
as little overhead on a live server as possible.
Once I have my workload file I need to run the Index Tuning wizard against
the database. Again, to minimize impact on live servers, I always do this on a
development machine with an up to date copy of any live database I have run my
workload against. Index Wizard works on a "Logical IO" basis, which is
not really machine dependant, so it does not matter if the development machine’s
hardware spec is different to the live one.
The Wizard will recommend a number of indices to you, many of which will be
in place already, and can generate a script to create new indexes for you, which
should be saved to disk for later us. If you require, the Wizard can generate
any new indexes for you on the spot, or schedule the job for a quieter time.
Dealing With The Output
Amongst other things, the Wizard will show you the "Top 100"
queries that if it thinks will benefit most from implementing it’s
recommendations, and let’s you save them to a SQL file. I suggest you do save
the queries, as they can come in handy later on.
It will also ask you if you want to put the new indices in place straight
away – I usually say "Yes" to this, because I am usually running the
Wizard on a development database. Once the new indexes are created, I can now
compare "Head to Head" the copy database (with the new indexes)
against the live database (or another copy of it)
This is where the "most improved" queries that we saved earlier
come in handy – I compare the query plan and for the queries on each copy of
the database, or run the queries in each database to get a feel for actual time
improvements (Needless to say, don’t run any updating queries against your
live database)
Once I have convinced myself that the Wizard’s recommendations are
worthwhile I can then plan out their implementation in the Live environment. I
don't always assume that they are worthwhile - I will tell you why later.
Hints And Tips
It’s tempting to get as large as possible a workload file to make your
testing more "representative", but I suggest you do not do this for
the following reasons:
- The wizard will only process up to 32000 odd example queries anyway
- You will get multiple queries that are very similar, and these can fill up
the "top 100" improved queries that the Wizard shows you. Use a
smaller workload file and you will get a better distribution of improved
queries in this list, allowing you to double-check more of the Wizard’s
recommendations
- If you do have a large workload file, you can set the maximum number of
queries the Wizard will consider to a more manageable number using the
options under the "Advanced" button.
If you really want to process against larger data sets, I suggest you split
them up into smaller sections and aggregate the results.
Remember that an additional index can speed up many operations, but may also
slow down any insert, update and delete queries (because the new indes also
needs to be kept up to date) You have to consider this before adding an index.
If the wizard says that it’s suggested index will speed up retrieval by a huge
amount, but you mostly insert to this table, it might not be such a good idea to
add the index after all.
Remember also that the usage profile of your database may change throughout
the day – maybe you add data to the tables throughout the working day but
produce batch reports during the evenings. More indexes will favour batch
reporting at the expense of data collection, but I would say that it’s often
more important to reduce transaction times during the data collection stage.
During the day your users are sitting waiting for things to happen, but you have
the whole night to produce those reports.
The wizard defaults to considering indices that can have up to 16 columns. It’s
generally recommended to keep indexes as narrow as possible though, and I
usually reduce this setting under the "Advanced" tab.
If the wizard suggests an index that will shave a tiny fraction of a query
that already seems to go blindingly fast, don’t discount it simply because you
cannot actually see the improvement – when you have multiple concurrent users
these tiny savings really can add up.
One Final Thought
If adding an index is the cure, is the absence of an index the real problem?
It could be the query itself that needs tuning, or it could be that your
database’s structure is not as efficient as it could be. I have often found
that the Wizard’s results have revealed a more fundamental flaw than a missing
index, and you should always consider this possibility.
About the author
Neil Boyle is an independent SQL Server consultant working out of London,
England. Neil's free SQL Server guide is available on-line at http://www.impetus-sql.co.uk