Load Testing Your Storage Subsystem with Diskspd – Part III

In this final post of our “Load Testing Your Storage Subsystem with Diskspd” series, we’re going to walk through Diskspd’s output, run some tests, and interpret the results. In the first post we showed how performance can vary based on access pattern and I/O size. In the second post we showed how to design tests that highlight those performance characteristics. In this post we’ll execute those tests and review the results.

First, let’s walk through the output from Diskspd; for now, don’t focus on the actual results. There are four major sections:

  • Test Parameters – the test’s parameters, including the exact command line executed. This is great for reproducing tests. (The flags are decoded after the listing.)
    Command Line: diskspd.exe -d15 -o1 -F1 -b60K -h -s -L -w100 C:\TEST\iotest.dat
    Input parameters:
    timespan:   1
    -------------
    duration: 15s
    warm up time: 5s
    cool down time: 0s
    measuring latency
    random seed: 0
    path: 'C:\TEST\iotest.dat'
    think time: 0ms
    burst size: 0
    software and hardware write cache disabled
    performing write test
    block size: 61440
    number of outstanding I/O operations: 1
    thread stride size: 0
    IO priority: normal
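    For reference, each flag in the command line maps directly to a line in this output: -d15 sets the 15s duration, -o1 allows one outstanding I/O, -F1 uses a single thread, -b60K yields the 61440-byte block size, -h disables software and hardware write caching, -s selects sequential access, -L turns on latency measurement, and -w100 makes this a 100% write test.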
  • CPU Usage – CPU usage for the test. Recall that if you are not using all of your bandwidth, you may want to add threads; if your CPU burn is already high, you may want to back off on the number of threads (see the example after the table).
    Results for timespan 1:
    *******************************************************************************
    actual test time:15.00s
    thread count:1
    proc count:2
    CPU |  Usage |  User  |  Kernel |  Idle
    -------------------------------------------
       0|  30.10%|   1.04%|   29.06%|  69.89%
       1|   0.10%|   0.10%|    0.00%|  99.78%
    -------------------------------------------
    avg.|  15.10%|   0.57%|   14.53%|  84.84%
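    If the averages showed idle headroom and the drive still had bandwidth to give, a reasonable next step would be to re-run the same test with more threads and watch how these per-core numbers change. A sketch (thread count is something to tune to your own core count):

        diskspd.exe -d15 -o1 -t2 -b60K -h -s -L -w100 C:\TEST\iotest.dat

    Here -t2 runs two worker threads instead of the single thread (-F1) above; if MB/sec rises, the extra thread paid off, and if only the Usage column rises, back it off.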
  • Performance – this is the meat of the test. Here we see bandwidth measured in MB/sec and latency measured down to the microsecond (reported in milliseconds to three decimal places). With SSDs and today’s super fast storage I/O subsystems, you’ll likely need this level of accuracy. This alone beats SQLIO in my opinion. I’m not much of a fan of IOPs, since those numbers require that you know the size of the I/O for them to have any meaning; the quick arithmetic after the tables shows why. Check out Jeremiah Peschka’s article on this here. Remember: focus on minimizing latency and maximizing I/O, and refer back to the Part I and Part II posts in this series for details.
    Total IO
    thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
    -----------------------------------------------------------------------------------------------------
         0 |      3162378240 |        51471 |     201.04 |    3431.10 |    0.289 |     2.816 | C:\TEST\iotest.dat (20GB)
    -----------------------------------------------------------------------------------------------------
    total:        3162378240 |        51471 |     201.04 |    3431.10 |    0.289 |     2.816
    Read IO
    thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
    -----------------------------------------------------------------------------------------------------
         0 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | C:\TEST\iotest.dat (20GB)
    -----------------------------------------------------------------------------------------------------
    total:                 0 |            0 |       0.00 |       0.00 |    0.000 |       N/A
    Write IO
    thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
    -----------------------------------------------------------------------------------------------------
         0 |      3162378240 |        51471 |     201.04 |    3431.10 |    0.289 |     2.816 | C:\TEST\iotest.dat (20GB)
    -----------------------------------------------------------------------------------------------------
    total:        3162378240 |        51471 |     201.04 |    3431.10 |    0.289 |     2.816
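    To see why an IOPs figure is meaningless without its I/O size, notice that every number in this table can be recovered from the others. Using this run’s 60K (61440-byte) writes:

        51471 I/Os / 15.00 s             ≈ 3431.10 I/Os per second
        3431.10 I/Os per s x 61440 bytes = 210,806,784 bytes per second
        210,806,784 / 1024 / 1024        = 201.04 MB/s

    The same 3431 IOPs at an 8K block size would be only about 26.8 MB/s, which is why latency and bandwidth are the numbers to watch.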
  • Histogram – this gives a great representation of how your test did over the whole run. The n-nines rows are extended percentiles (3-nines is the 99.9th percentile, 4-nines the 99.99th, and so on). In this example, 99% of the time our latency was less than 0.654ms…that’s pretty super.
    %-ile |  Read (ms) | Write (ms) | Total (ms)
    ----------------------------------------------
        min |        N/A |      0.059 |      0.059
       25th |        N/A |      0.163 |      0.163
       50th |        N/A |      0.193 |      0.193
       75th |        N/A |      0.218 |      0.218
       90th |        N/A |      0.258 |      0.258
       95th |        N/A |      0.312 |      0.312
       99th |        N/A |      0.654 |      0.654
    3-nines |        N/A |     17.926 |     17.926
    4-nines |        N/A |     18.906 |     18.906
    5-nines |        N/A |    583.568 |    583.568
    6-nines |        N/A |    583.568 |    583.568
    7-nines |        N/A |    583.568 |    583.568
    8-nines |        N/A |    583.568 |    583.568 
        max |        N/A |    583.568 |    583.568

Impact of I/O Access Patterns

  • Random

    diskspd.exe -d15 -o32 -t2 -b64K -h -r -L -w0 C:\TEST\iotest.dat

    Read IO
    thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
    -----------------------------------------------------------------------------------------------------
         0 |     16066543616 |       245156 |    1021.49 |   16343.84 |    1.896 |     0.286 | C:\TEST\iotest.dat (20GB)
         1 |     16231759872 |       247677 |    1031.99 |   16511.91 |    1.877 |     0.207 | C:\TEST\iotest.dat (20GB)
    -----------------------------------------------------------------------------------------------------
    total:       32298303488 |       492833 |    2053.48 |   32855.75 |    1.886 |     0.250

    In this test you can see that there is high throughput and very low latency. This disk is a PCIe-attached SSD, so it performs well with a random I/O access pattern.

  • Sequential

    diskspd.exe -d15 -o32 -t2 -b64K -h -s -L -w0 C:\TEST\iotest.dat

    Read IO
    thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
    -----------------------------------------------------------------------------------------------------
         0 |     16094724096 |       245586 |    1022.21 |   16355.35 |    1.895 |     0.260 | C:\TEST\iotest.dat (20GB)
         1 |     16263544832 |       248162 |    1032.93 |   16526.91 |    1.875 |     0.185 | C:\TEST\iotest.dat (20GB)
    -----------------------------------------------------------------------------------------------------
    total:       32358268928 |       493748 |    2055.14 |   32882.26 |    1.885 |     0.225

    In this test you can see that the sequential I/O pattern yields a performance profile similar to the random I/O test on the SSD. Recall that an SSD does not have to move a disk head or rotate a platter, so access to any location on the drive carries the same latency cost.

Impact of I/O Sizes

  • Transaction log simulation

    diskspd.exe -d15 -o1 -t1 -b60K -h -s -L -w100 C:\TEST\iotest.dat

    Write IO
    thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
    -----------------------------------------------------------------------------------------------------
         0 |      3162378240 |        51471 |     201.04 |    3431.10 |    0.289 |     2.816 | C:\TEST\iotest.dat (20GB)
    -----------------------------------------------------------------------------------------------------
    total:        3162378240 |        51471 |     201.04 |    3431.10 |    0.289 |     2.816

    This test measures the access latency of a single thread performing very small sequential writes. As you can see, latency is very low at 0.289ms, which is expected on a low-latency device such as a locally attached SSD.
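    With a queue depth of one (-o1), throughput is bounded directly by latency: each I/O must complete before the next is issued, so the ceiling is roughly 1000 / AvgLat(ms) I/Os per second. Checking this run:

        1000 / 0.289 ms ≈ 3460 I/Os per second

    which lines up with the measured 3431.10, and is exactly why this simulation uses -o1 -t1: it models the transaction log’s single, latency-sensitive write stream.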
     

  • Backup operation simulation

    diskspd.exe -d15 -o32 -t4 -b512K -h -s -L -w0 C:\TEST\iotest.dat

    Read IO
    thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
    -----------------------------------------------------------------------------------------------------
         0 |      8552185856 |        16312 |     543.17 |    1086.33 |   29.434 |    26.063 | C:\TEST\iotest.dat (20GB)
         1 |      8846311424 |        16873 |     561.85 |    1123.69 |   28.501 |    25.373 | C:\TEST\iotest.dat (20GB)
         2 |      8771338240 |        16730 |     557.09 |    1114.17 |   28.777 |    25.582 | C:\TEST\iotest.dat (20GB)
         3 |      8876720128 |        16931 |     563.78 |    1127.56 |   28.440 |    25.353 | C:\TEST\iotest.dat (20GB)
    -----------------------------------------------------------------------------------------------------
    total:       35046555648 |        66846 |    2225.88 |    4451.76 |   28.783 |    25.593

    And finally, our test simulating reading data for a backup: the larger I/Os have higher latency but also yield a higher transfer rate, at 2,225MB/sec.
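    Here, too, the numbers tie together. With 4 threads each keeping 32 I/Os outstanding there are 128 I/Os in flight, and Little’s Law predicts the throughput we should see from the measured latency:

        128 in flight / 0.028783 s avg latency ≈ 4447 I/Os per second
        4451.76 I/Os per s x 524288 bytes      ≈ 2,334,000,000 bytes per second ≈ 2225.88 MB/s

    Both agree with the measured totals, so at this queue depth the device is delivering everything asked of it.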

In this series of posts we introduced some theory on how drives access data, presented tests to explore the performance profile of your disk subsystem, and reviewed Diskspd output for those tests. This should give you the tools and ideas you need to load test your disk subsystem and ensure your SQL Servers will perform well when you put them into production!
