Read and Write speeds not increased with additional disks.

  • Does anybody know why I see absolutely no performance improvement when I spread my primary file group over 8 separate files on 8 separate disks, as opposed to having the primary file group all in one file on one disk.

    I have set up 2 identical databases, one spread over 8 disks and one on one disk. Each database has a table called DATA and a column called VALUE. Value is NVARCHAR(200). I have filled each table up in both databases with 20,000 rows.

    I then perform a select on each table in each database using CHECKPOINT and DBCC DROPCLEANBUFFERS to ensure I am reading from disk before each query and the execution times are identical in both databases.

    I then ran the same queries against each database using a load testing tool and the batch requests per second on each DB is identical under load.

    Surely the database with data spread over 8 disks should be FAR faster than the single file database as you have the combined reading power of 8 disks as opposed to 2??

    Also, the same is happening for write speeds. When I create the data on both databases, the time it takes is identical on both.

    If you look at this post you will see that BOL says it should be faster with multiple disks.

    Just FYI this is on an Azure virtual machine and each disk is a locally redundant data disk that I have attached to the virtual machine.

    Could somebody shed some light on this and in particular shed light on whether write speeds should increase with multiple disks or just read speeds?

    Any help would be much appreciated. Thanks.

  • william.rees.howells (3/13/2014)


    Does anybody know why I see absolutely no performance improvement when I spread my primary file group over 8 separate files on 8 separate disks, as opposed to having the primary file group all in one file on one disk.

    I have set up 2 identical databases, one spread over 8 disks and one on one disk. Each database has a table called DATA and a column called VALUE. Value is NVARCHAR(200). I have filled each table up in both databases with 20,000 rows.

    I then perform a select on each table in each database using CHECKPOINT and DBCC DROPCLEANBUFFERS to ensure I am reading from disk before each query and the execution times are identical in both databases.

    I then ran the same queries against each database using a load testing tool and the batch requests per second on each DB is identical under load.

    Surely the database with data spread over 8 disks should be FAR faster than the single file database as you have the combined reading power of 8 disks as opposed to 2??

    Also, the same is happening for write speeds. When I create the data on both databases, the time it takes is identical on both.

    If you look at this post you will see that BOL says it should be faster with multiple disks.

    Just FYI this is on an Azure virtual machine and each disk is a locally redundant data disk that I have attached to the virtual machine.

    Could somebody shed some light on this and in particular shed light on whether write speeds should increase with multiple disks or just read speeds?

    Any help would be much appreciated. Thanks.

    1) You should post questions like this to the SQL Azure forum.

    2) This could be an artifact of the way the redundancy works in Azure.

    3) Not sure that 20K rows is sufficient to test IO throughput. Try 20M rows maybe and see how that goes.

    4) The "different disks" presented in Azure could be like most of my clients' "different disks", i.e. really just sitting on top of the same underlying physical structure, in which case things can actually go SLOWER the more files you place on them due to the disk/slew/head/movement issues you force.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Hi thanks for the response.

    In this article for Azure: http://instadba.com/azure-vm-sql-server-disk-configuration/ it says the following: "For OLTP type loads (many small reads and writes) performance will scale linearly as you add more VHDs."

    If this means what I think it means is that the more disks you add the faster it should go??

  • william.rees.howells (3/13/2014)


    Does anybody know why I see absolutely no performance improvement when I spread my primary file group over 8 separate files on 8 separate disks, as opposed to having the primary file group all in one file on one disk.

    I have set up 2 identical databases, one spread over 8 disks and one on one disk. Each database has a table called DATA and a column called VALUE. Value is NVARCHAR(200). I have filled each table up in both databases with 20,000 rows.

    I then perform a select on each table in each database using CHECKPOINT and DBCC DROPCLEANBUFFERS to ensure I am reading from disk before each query and the execution times are identical in both databases.

    I then ran the same queries against each database using a load testing tool and the batch requests per second on each DB is identical under load.

    Surely the database with data spread over 8 disks should be FAR faster than the single file database as you have the combined reading power of 8 disks as opposed to 2??

    Also, the same is happening for write speeds. When I create the data on both databases, the time it takes is identical on both.

    If you look at this post you will see that BOL says it should be faster with multiple disks.

    Just FYI this is on an Azure virtual machine and each disk is a locally redundant data disk that I have attached to the virtual machine.

    Could somebody shed some light on this and in particular shed light on whether write speeds should increase with multiple disks or just read speeds?

    Any help would be much appreciated. Thanks.

    Generally the technique requires more disks in a RAID array. More spindles in an array can boost performance due to the way the disk controller is able to have all disks read\write the data in a parallel fashion.

    What is the storage setup like for the server in question?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • 8 separate individual disks. I have then split the primary file group up into 8 files and each of those 8 files gets its own individual single disk.

  • is this a virtual machine, which hypervisor platform are using?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • yes this is a virtual machine and it is a large machine (quad core 7gb ram)

  • Perry Whittle (3/13/2014)


    is this a virtual machine, which hypervisor platform are using?

    This is a VM running on Azure, which means, technically, the HyperV hypervisor although things are a little different in Azure.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • william.rees.howells (3/13/2014)


    Does anybody know why I see absolutely no performance improvement when I spread my primary file group over 8 separate files on 8 separate disks, as opposed to having the primary file group all in one file on one disk.

    ...

    Are we just talking about response time? Especially on SQL Azure, most of the response time is probably network latency for a query against a relatively small dataset like 20,000 rows. Like Kevin suggested, try loading a larger dataset, like 100,000 rows or more. You can look at Client Statistics in SSMS and also SET STATISTICS TIME ON to reveal how many miliseconds for compile, CPU, network and total execution.

    Gaining Performance Insight into Windows Azure SQL Database

    http://social.technet.microsoft.com/wiki/contents/articles/1657.gaining-performance-insight-into-windows-azure-sql-database.aspx

    Testing latency between client and SQL Azure via client statistics in SSMS

    http://blogs.solidq.com/PDoshi/Post.aspx?ID=12

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply