database physical design

  • we have a 'performance expert' in working for one of our more independent development areas and one of his recommendations is to have the no of files per filegroup = the number of physical cpu cores, EVEN WHEN THESE ARE ALL ON THE SAME DRIVE. He claims his testing shows a 20% increase in throughput doing this.

    This goes against my experience and expectations so I would be interested in any input anyone has on this subject.

    I doubt this theory even more having seen he has set up multiple log files as well, and I know thats wrong, plus the database he is using for testing has over 100 files and 20+ filegroups, and a database like that would never go into production whilst I have a say in it.

    If you have any experience of multiple files on one disk please post with you views.

    ---------------------------------------------------------------------

  • Just saw a reference to this very thing in SQL Server Magazine that might be of interest.

    http://blogs.msdn.com/psssql/archive/2007/02/21/sql-server-urban-legends-discussed.aspx

  • Todd,

    thanks for the quick reply. The article would seem to confirm what I believed, apart from tempdb, multiple files per filegroup give no performance benefit if on the same drive.

    So I wonder why he claims 20% improvement.

    Any other takers?

    ---------------------------------------------------------------------

  • george sibbald (9/29/2008)


    we have a 'performance expert' in working for one of our more independent development areas and one of his recommendations is to have the no of files per filegroup = the number of physical cpu cores, EVEN WHEN THESE ARE ALL ON THE SAME DRIVE. He claims his testing shows a 20% increase in throughput doing this.

    That is specifically for TempDB, and often will only show real perf improvements if there is existing blocking on the allocation structures in TempDB. It is good practice to have more than 1 tempDB file. I usually start at a number of files = 1/2 the CPUs, split further if I see blocking and move the files onto separate drives if I see an IO bottleneck on the TempDB drive.

    So I wonder why he claims 20% improvement.

    Ask him to set up a test to prove it.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • thanks Gail. I will be asking him for proof because I don't want this set up going into production.

    you ever come across an application database with this many filegroups and files?

    george

    ---------------------------------------------------------------------

  • Not one of that size. I believe the usual recommendation for user databases is a number of files = no more than 1/4-1/2 number of cpus. You definitely don't want multiple log file, they make admin more complex and won't gain you anything. Logs files are used one at a time.

    The biggest I've seen is a 1.2 TB DB with 6 filegroups (and 6 files). Primary with just the most important of the app's tables, one for heavily read tables, one for heavily written, 2 for indexes and 1 for blobs.

    iirc, the PSS guys have written a couple more related blog posts. I'll check later.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • That is specifically for TempDB, and often will only show real perf improvements if there is existing blocking on the allocation structures in TempDB. It is good practice to have more than 1 tempDB file. I usually start at a number of files = 1/2 the CPUs, split further if I see blocking and move the files onto separate drives if I see an IO bottleneck on the TempDB drive.

    I just inherited a server with multiple tempdb files. Is there still a benefit if they're on the same drive ?

  • Only in terms of reducing blocking on the allocation pages of TempDB.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • george sibbald (9/30/2008)


    thanks Gail. I will be asking him for proof because I don't want this set up going into production.

    In fact, ask him for a copy of the test protocols so that you can try it yourself. Even better, post them here so that we can try to reproduce them. I would love a chance to analyze that.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung (9/30/2008)


    george sibbald (9/30/2008)


    thanks Gail. I will be asking him for proof because I don't want this set up going into production.

    In fact, ask him for a copy of the test protocols so that you can try it yourself. Even better, post them here so that we can try to reproduce them. I would love a chance to analyze that.

    will have a go, the guy has gone all defensive already.

    as a complication, the files are on a raid 5 san array.

    Whats your initial take on the physical design, same as mine?

    ---------------------------------------------------------------------

  • Yes, raid 5 is counter-indicated. Best practice is raid 10.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung (10/1/2008)


    Yes, raid 5 is counter-indicated. Best practice is raid 10.

    sorry barry, my fault, I meant the dataabase design, so many filgroups and files on the same drive.

    ---------------------------------------------------------------------

  • I once saw a performance improvement with and OLTP database on RAID-5 when the files were split in two and some of the most commonly used indexes were moved into the separate file. Same RAID array, but slight performance improvement by splitting tables (clustered indexes) away from non-clustered indexes.

    Can't give a percentage increase on that one, just remember that it did improve. Might have been luck, since the split may have moved some of those indexes onto different platters than the tables, just because of the way RAID-5 works.

    Other than that, I've not seen an improvement from file splitting unless it was specifically to separate disks, or was specifically regarding tempdb.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Two very important points:

    1) having multiple files for the log is just plain bad practice. Do not do that.

    2) You said multiple files on the same drive in the OP, but then stated later (I think it was you anyway) that it was a RAID 5 drive on a SAN. These two items are NOT equivalent. A single PHYSICAL drive is one spindle, and it can easily be LESS performing to have lots of files on that one drive due to the head having to thrash back and forth trying to keep up with multiple asynchronous IO requests. However, a RAID 5 SAN volume could have 5, 10 or more PHYSICAL drives behind it, in which case multiple files may well lead to improved IO performance.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply