The goal of storage performance analysis is to identify the performance characteristics of a target storage device. To establish a baseline for each system's configuration, you can use SQLIO, a free disk benchmarking utility from Microsoft, to map the overall performance throughput curve of your storage arrays.
SQLIO is my favorite storage benchmarking tool, and Microsoft provides it as a free download at Microsoft.com. It delivers a full analysis of how a storage device responds to varying degrees of load, and I'm here to show you how I use it.
Download and install the utility, and then copy the sqlio.exe executable from the installation directory into a new directory, usually named c:\sqlio. This ensures that the utility is in your current working path. Alternatively, you can add the installation location to the system path, but the dedicated folder approach makes the toolkit easy to move between servers.
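From a command prompt, that looks something like the following; the installation path shown is the usual default, so adjust it to wherever the installer actually placed sqlio.exe on your system.

md c:\sqlio
copy "C:\Program Files (x86)\SQLIO\sqlio.exe" c:\sqlio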
Now, what do we do with this? Admittedly, it’s not the easiest tool to use, as the command line options are quite numerous.
The most common command line parameters (summarized from the TechNet documentation) are:
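- -k<R|W>: kind of I/O to perform (R for read, W for write)
- -t<n>: number of worker threads
- -s<n>: duration of the test in seconds
- -o<n>: number of outstanding I/O requests per thread
- -f<sequential|random>: I/O access pattern
- -b<n>: I/O block size in KB
- -B<N|Y|H|S>: buffering mode (N disables both hardware and software buffering)
- -L<S|P>: capture latency statistics using the system (S) or processor (P) timer
- -F<file>: parameter file listing the test file path, threads, affinity mask, and size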
A single SQLIO test runs as follows.
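sqlio.exe -kR -t4 -s1 -o8 -fsequential -BN -LS -Fparam.txt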
This test performs a single one-second sequential read with four worker threads, eight outstanding operations per thread, and a param.txt specifications file that sets where the workload test file(s) are located and how large the test file is. In this case, the test file is located at c:\sqlio.test.dat and is 1GB. I generally recommend much longer runtimes and larger files; a good rule of thumb is to size the test file to approximate the average size of the databases you are testing for.
But what if you want to stress test a storage device under varying degrees of load and varying test types, to see how the storage performs across the full spectrum of stress?
In the past, I've used a batch file to perform a series of tests back-to-back, but editing the runtimes on every line for each round of testing is inefficient. Niels Grove-Rasmussen developed a much more efficient pair of PowerShell scripts that automate the test runs and extract the results from the command-line utility. Let's learn how to use them!
To get started, copy and paste the two scripts from the blog post at http://sqladm.blogspot.com.au/2011/04/sqlio-batch.html into two PowerShell script files and save them in your target directory.
All SQLIO tests in his script are performed as a parameter sweep that cycles through combinations of the following performance patterns (a simplified sketch of the sweep appears after this list):
- Block size
- Read and write
- Sequential and random
- Varying numbers of threads (1-128, but I usually fix this to the number of CPUs on the server)
- Varying numbers of operations per thread (1-128)
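To give you an idea of how such a sweep works, here is a minimal sketch of the nested loops involved. This is not Niels' actual script, and the sweep values shown are illustrative only:

# Minimal parameter sweep sketch (illustrative values, not Niels' actual script)
$kinds = 'R','W'                        # read and write
$factors = 'sequential','random'        # access patterns
$threads = 4                            # fix to the number of CPUs on the server
$outstanding = 1,2,4,8,16,32,64,128     # operations per thread
$blocks = 4,8,64,128,256,1024           # block sizes in KB
foreach ($k in $kinds) {
    foreach ($f in $factors) {
        foreach ($o in $outstanding) {
            foreach ($b in $blocks) {
                & .\sqlio.exe "-k$k" "-t$threads" -s60 "-o$o" "-f$f" "-b$b" -BN -LS -Fparam.txt
                Start-Sleep -Seconds 60 # let the controller cache clear between tests
            }
        }
    }
}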
First, we must create the target workload file(s). The SQLIO Batch scripts do not allow you to specify the size of the workload file, but we can work around this quite easily.
In the working directory, create a file called ‘param.txt’. Edit it to resemble the following.
e:\sqlio.test.dat 4 0x0 50000
Change the drive letter (and add a folder path if required) in front of the workload file sqlio.test.dat to point at the drive(s) you intend to test, but do not change the file name. Next, set the second parameter to the number of CPUs on your server. Finally, set the last parameter to the size of the workload file in MB, sized according to your normal production database file. Verify that you have enough free space on that volume so you do not fill it up.
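For example, on a server with eight CPUs, testing drive f: against a production database of roughly 100GB, the line would read:

f:\sqlio.test.dat 8 0x0 102400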
Next, from a command line, execute the following command.
sqlio.exe -kR -t4 -s1 -o16 -fsequential -b8 -BN -LS -Fparam.txt
The purpose of this command is to create a workload file of the appropriate size. Change the -t parameter to the number of CPUs on your server (the same value specified in the param.txt file). Change the -b parameter to the block size you are testing: 64KB for SQL Server data files, 1MB for testing SQL Server backup stream performance, 8KB for log file simulations, and so on. If possible, ensure not only that your NTFS file system is formatted with this same allocation unit size, but also that the underlying SAN is configured for this block size. Some SANs are sensitive to block size mismatches, and aligning them properly can yield substantial performance improvements.
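If you are unsure what allocation unit size an NTFS volume uses, the built-in fsutil utility will tell you; look for the 'Bytes Per Cluster' value in its output:

fsutil fsinfo ntfsinfo e: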
Disregard the output from this run; its purpose is just to create the file, not to stress test it. Repeat this command for each drive you wish to test.
Now, edit the first of the two PowerShell scripts that you created (usually named SQLIO-batch.ps1).
This script has a number of variables in its header. Configure the target drive(s) accordingly, and set the number of seconds per test based on the window you have available to perform the tests.
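The header contains settings along these lines; the variable names below are purely illustrative, so check your copy of the script for the exact names:

$Drives = 'E'           # target drive letter(s) to test
$TestSeconds = 60       # duration of each individual test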
Calculate the number of tests to be executed by multiplying the following variable counts:
- I/O Kind
- Number of threads
- Test factor (random or sequential)
- I/O outstanding
- Block sizes
For the test values above, the count comes out to 192 tests. Between each test is a 60-second pause (around line 36 of the script) that allows the I/O to idle and the controller cache to clear. You can change this value if needed. Based on these counts and timings, plan your test window accordingly; longer tests are better, if at all possible.
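For example, a sweep over two I/O kinds, one thread count, two access patterns, eight outstanding-I/O values, and six block sizes yields 2 × 1 × 2 × 8 × 6 = 192 tests. At 60 seconds per test plus the 60-second pause between them, that run takes 192 × 120 seconds, or roughly 6.4 hours.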
Next, execute the test in a PowerShell window.
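Assuming you saved the batch script under the name used on Niels' blog, the command is simply the following (if script execution is blocked on your server, you may first need to loosen the execution policy for the session with Set-ExecutionPolicy):

./SQLIO-batch.ps1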
Note: Do not execute this test during production hours without getting express permission from the storage administrator. These tests are intensive and can cause performance degradation of the storage unit and all other dependent systems. Even running it after hours can cause disruption, so please work with your storage administrator to get an appropriate window to run these tests.
By default, the script names the output file SQLIO_Result.(datestamp).txt. If you open the file, you can see the raw output from each of these tests.
Next, execute the results parsing script with the following command.
./SQLIO-parse.ps1 -ResultFilename SQLIO_Result.(datestamp).txt
It saves the output to a CSV file named SQLIO_Parse.csv. This file can now be used to review the storage performance metrics and analyze them for trends. A few minutes with a spreadsheet can produce some tremendously insightful charts showing how your storage subsystem performs and responds under load.
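If you prefer PowerShell to a spreadsheet for a quick first look, you can slice the CSV directly. The column names in this sketch are hypothetical, so match them to the headers that SQLIO-parse.ps1 actually emits:

# Show the ten tests with the worst average latency (column names are assumptions)
Import-Csv .\SQLIO_Parse.csv |
    Sort-Object { [double]$_.LatencyAvg } -Descending |
    Select-Object -First 10 |
    Format-Table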
Now, with any storage test, the values to pay attention to are latency, IOPs, and then throughput, in that order. As the load on the storage goes up, latencies usually go up as well, and the aggregate load from everything else sharing the array adds stress of its own. Overlay these test results with your Perfmon traces to see where you stand on a steady-state basis. Are you maxing out your SAN? How much headroom do you currently possess when it comes to storage performance, not just capacity? Awareness of your current state and future projections helps you plan for growth!
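If you would rather collect those comparison counters from PowerShell than from the Perfmon GUI, the standard LogicalDisk counters cover latency, IOPs, and throughput; substitute your own drive letter:

# Latency, IOPs, and throughput for drive E:, sampled every 15 seconds
Get-Counter -Counter '\LogicalDisk(E:)\Avg. Disk sec/Transfer',
                     '\LogicalDisk(E:)\Disk Transfers/sec',
                     '\LogicalDisk(E:)\Disk Bytes/sec' -SampleInterval 15 -Continuous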