storage issue (I think)

  • For the last couple of days I’ve been testing our storage with SQLIO to find the best way to configure our disks.

    The setup consists of an IBM DS5100, an 8Gb HBA and the SQL Server (which is a VM).

    I’ve got a RAID 10 of 8 Fibre Channel disks.

    Nothing out of the ordinary.

    Then I started doing some SQL tests. I had the SAN man give me a 30GB chunk in the form of one LUN attached to my box and started hacking away. The 30GB comes out of the same 8-disk RAID 10.

    I wanted to compare aligned vs. non-aligned performance.

    I also wanted to play with the segment size on the SAN vs. the format size in Windows.

    I thus have 2 drives to compare. Both are from the same RAID, one being used for production and one new.

    The new one I can toggle between 64k and 32k (the default offset and size in Windows).

    sqlio -kR -s300 -frandom -o32 -b8 -LS

    sqlio -kR -s300 -frandom -o32 -b16 -LS

    sqlio -kR -s300 -frandom -o32 -b32 -LS

    sqlio -kR -s300 -frandom -o32 -b64 -LS

    sqlio -kR -s300 -frandom -o32 -b128 -LS

    sqlio -kR -s300 -frandom -o32 -b256 -LS
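
The six runs above differ only in the block size passed to -b (in KB). As a minimal sketch, the sweep could be generated rather than typed out. The helper name `build_sqlio_sweep` is hypothetical, and the flags are copied verbatim from the commands above; no target file or parameter file is added, since the post does not show one.

```python
# Block sizes (KB) used in the test runs listed in the post.
BLOCK_SIZES_KB = [8, 16, 32, 64, 128, 256]


def build_sqlio_sweep(block_sizes_kb=BLOCK_SIZES_KB, seconds=300, outstanding=32):
    """Return one sqlio command line per block size (hypothetical helper).

    Only the flags that appear in the post are emitted:
    -kR (read), -s (duration), -frandom, -o (outstanding I/Os),
    -b (block size in KB), -LS (latency timing).
    """
    return [
        f"sqlio -kR -s{seconds} -frandom -o{outstanding} -b{size} -LS"
        for size in block_sizes_kb
    ]


if __name__ == "__main__":
    for cmd in build_sqlio_sweep():
        print(cmd)
```

Keeping every flag except -b fixed makes it easier to attribute differences between runs to block size alone.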

    1- My tests on the new drive were great. On average I got 500-700MB/s reads at sub-5ms latency (except for the 128k and 256k block sizes), with IOPS ranging from 40k to 60k.

    Then I ran the tests on my production drive, and the results were shocking.

    On average I got 100MB/s reads at latencies going up to 59ms, with IOPS never exceeding 2000.
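
Throughput, IOPS and block size are tied together (MB/s = IOPS × block size in KB / 1024), which helps when sanity-checking figures like these. A rough sketch follows; the function names are made up for illustration, and pairing a particular IOPS figure with a particular block size is an assumption, since the post quotes ranges rather than per-run numbers.

```python
def throughput_mb_s(iops, block_kb):
    """Throughput in MB/s implied by an IOPS figure at a given block size (KB)."""
    return iops * block_kb / 1024.0


def iops_for(throughput_mb, block_kb):
    """IOPS needed to sustain a throughput (MB/s) at a given block size (KB)."""
    return throughput_mb * 1024.0 / block_kb


if __name__ == "__main__":
    # Assumed pairings, for illustration only.
    print(throughput_mb_s(60000, 8))   # 468.75 MB/s at 8 KB blocks
    print(throughput_mb_s(2000, 64))   # 125.0 MB/s at 64 KB blocks
    print(iops_for(100, 8))            # 12800.0 IOPS for 100 MB/s at 8 KB
```

Comparing drives at the same block size avoids mixing a high-IOPS small-block run with a high-throughput large-block run.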

    First, I thought it might be because the new drive is on a different controller, so I moved the drives to both be on the same controller.

    The first test got slightly lower values, meaning that the controller does have an influence, but less than 5%.

    Next I thought maybe it’s because it’s a production drive. (The drive is used for the mart build, which runs in the evening, so during the day it’s pretty quiet.) So I decided to stop SQL.

    The only things on the drives are LDFs and tempdb.

    Then I ran the tests again.

    The slow drive was slightly faster, reaching up to 200MB/s on average and 4k IOPS, with latency pretty much the same.

    I’ve been thinking it’s fragmentation, but SQLIO creates a new file, so fragmentation shouldn’t have an impact (or should it?).

    I kept on testing. I decided to copy the production data to the new drive to fill it a bit. The tests ran the same and I still got fast results as per point 1. They were slightly slower, but again by maybe 5%.

    Just a recap.

    Now my new drive has production data on it and the old drive is blank. I tested both and the results were still the same. Then I noticed the slow drive is a dynamic disk.

    So I converted it to basic and reformatted it with the correct offset and 64k size in Windows. Voila, the drive was fast. The IOPS were in the 50k range and the reads around 700MB/s.
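
For reference, a "correct" offset here means the partition's starting offset is a whole multiple of the SAN segment size. Below is a minimal sketch of that check, assuming a 64 KB segment size and the 63-sector (32,256-byte) starting offset that older Windows versions used by default. Neither value is confirmed for this array, and `is_aligned` is a hypothetical helper.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def is_aligned(starting_offset_bytes, segment_bytes):
    """True if the partition start falls exactly on a segment boundary."""
    return starting_offset_bytes % segment_bytes == 0


# Assumed values for illustration; substitute the real ones for the LUN.
legacy_offset = 63 * SECTOR_BYTES   # 32,256 bytes, legacy Windows default
segment = 64 * 1024                 # 64 KB segment size

if __name__ == "__main__":
    print(is_aligned(legacy_offset, segment))   # False
    print(is_aligned(1024 * 1024, segment))     # True (1 MB offset)
```

On a basic disk the starting offset can be read with `wmic partition get Name, StartingOffset`; dynamic disks report it differently.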

    To prove my theory I reformatted the same drive with the misaligned offset and default size in Windows. I also converted the drive back to a dynamic disk.

    In essence I am now back to my slow drive as it was before the format that made it a basic disk. So I ran my tests again and lo and beho… nope. The drive is still fast.

    The reads average 500MB/s with 20k IOPS. I assume it’s not the 700 because of the offsets, but the original slow 100MB/s average is gone.

    Again, a clean drive makes me think of fragmentation.

    I’ve consistently tested this over and over and I get the same results.

    As far as I know SQLIO won’t be affected by a fragmented system. If it is, I’m not aware of a defrag at the SAN level, but fragmentation is the only thing I can come up with.

    I have considered the HBA queues as a bottleneck, but I ran the tests separately and at different times. I can’t see how this could be cache either, as it’s the one drive that is consistently slower.

    Cheers

    Jannie

  • I think the fragmentation is a very good candidate.

    Even when writing a new file, if the space is not contiguous it will be written in a fragmented state which will degrade performance.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Another factor could be sectors that were getting tagged as bad until you formatted the drive. Sometimes a format will correct that issue.

    Jason...AKA CirqueDeSQLeil

  • What size is the file you are using?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉
