Files per Filegroup depends on # of CPU?

  • Hello all,

    is there a way to improve SQL Servers IO Performance by saying:

    # of files per filegroup = # of CPUs in the system?

    Is that correct? If so, is it true for every type of file in SQL Server?

    Thanks a lot!

    JM

  • sqlfriend (1/29/2009)


    # of files per filegroup = # of CPUs in the system?

    No. Not at all. That recommendation is give for TempDB and only for TempDB and only if you're seeing blocking on the allocation pages within that database. It is not recommended for normal databases at all.

    For normal databases, have no more that 1/2 the number of files as you have CPUs total. In addition, just having additional files will not help IO. They have to be on separate physical drives before you'll see a large gain in IO performance, and even then that's only if the system was bottlenecked on IOs

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thank you, Gail!

    To clarify: Does "have no more that 1/2 the number of files as you have CPUs total" refer to "per FILEGROUP" or to "no more then 1/2 the number of files of CPU total per DATABASE"?

  • Per database.

    Again, only really useful if you have IO problems and the file are going on separate physical drives

    Why are you thinking 1 file per cpu per filegroup? What was the reason for that thought?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi,

    this database is for a Data Warehouse system typically involving a lot of "bulk io" and queries of that type. Idea is that more then one file per filegroup could make use of the fact that the system has more then one cpu (in the current case 2xquad core) - am I wrong?

    JM

  • sqlfriend (1/30/2009)


    Idea is that more then one file per filegroup could make use of the fact that the system has more then one cpu (in the current case 2xquad core) - am I wrong?

    Yes. It's an old urban legend that SQL uses one thread per file. It is wrong. Google SQL urban legends and you should find a post by the PSS engineers on this.

    The main reasons to add files or filegroups are:

    1) IO bottleneck on the drives. split the DB into multiple files or filegroups to balance the IO over multiple disks

    2) Contention on the allocation pages - typically tempDB, can also happen in user databases is there's a lot (and I mean a LOT) of new data been added every second

    For a datawarehouse, the recommendation is much less than 0.25-0.5 times the no of cores. Check out the IO guide at sqlcat.com

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I understand, thank you. Then asking the other way round: If there is an existing system (having e.g. tables and indexes in different filegroups and dimension tables separated from fact-tables) is there a disadvantage of keeping this structure (e.g. regarding query performance) or would it make sense to move it to the ratio you mentioned (0.25-0.5 the # of Cores)? The system doesn't use partitioning.

  • Depends. Are we talking 4 or 5 files, or 4 or 5 hundred files?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Its about 25 Files separating a few big fact tables from smaller dimension tables + separating data from indexes.

  • Separate filegroups, or all in primary?

    If you have lots of filegroups will small tables, I'd probably try merging them and dropping the empty filegroups, on dev (of course) and testing to make sure that nothing's hindered.

    That's probably the most files I've seen in a database. How big a DB are we talking about here? Are all 25 of those files on separate physical drives?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi, just got an update: It's 14 filegroups with one file each. They are all on the same SAN/LUN based on several physical drives. By the way: Does it make sense to separate data and the accompanying indexes at all?

  • sqlfriend (2/9/2009)


    Hi, just got an update: It's 14 filegroups with one file each. They are all on the same SAN/LUN based on several physical drives.

    If they are all on the same physical array, then you are gaining absolutely nothing by having the separate files. They're using the same IO path and sharing the same spindles.

    Leave it, merge it, up to you. Performance should be the same either way.

    By the way: Does it make sense to separate data and the accompanying indexes at all?

    Maybe. See all the comments that I've already made about splitting files apart.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I understand. Thanks a lot, Gail!

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply