Question about writes to multiple files

  • Hi all,

    I am testing out a blank database created over two physical files on two separate disks with one table called data which has one column called values nvarchar(max).

    I filled the table up with a whole load of data and ran a select * against it. If I run Permon at the same time I can see that the read load has been spread over multiple disks as each of these disks is getting read from in parallel. If I create the same database on a single file and run the same select * again it takes much longer, proving that the read load has been distributed across multiple disks.

    Now moving onto writes, this is where the confusion lies. I understand that SQL server fills files evenly until they need growing, after which it will then fill files individually until they are full in a round robin fashion unless you have trace 1117 turned on. What I don't understand is why the writes aren't distributed out whilst it is filling these file groups.

    I ran an continual insert into my table with go 1000000 to monitor how the files are being filled up. I monitored where SQL server was physically placing the files as they were being inserted by running the following query:

    ;WITH CTE AS

    (SELECT

    sys.fn_PhysLocFormatter (%%physloc%%) col1,

    RIGHT(LEFT(sys.fn_PhysLocFormatter (%%physloc%%),2),1) AS [Physical RID],

    DATAID

    FROM DATA)

    SELECT

    [Physical RID],

    count([Physical RID])

    from cte

    group by [Physical RID]

    order by [Physical RID]

    I could see that it would a thousand or so records into file 1, then a thousand or so into file 2, then a thousand or so into file 1 etc etc. In another words it would hit one disk, then another disk, then back to disk one to fill the file evenly. Is there any way to make SQL Server distribute the writes out in parallel so that both disks are writing in tandem?

    By the looks of it, multiple disks only scale reads, as with writes only one disk is ever written to at once which is annoying. Can somebody confirm that this is correct or if there is a way to harness the write power of multiple disks?

    Any help would be great,

    Thanks

  • Re inserts.

    Once over a few pages (8 or 24, can't recall), SQL allocates dedicated extents to tables. So 8 contiguous pages. So what's happening probably is 8 pages from one file being filled, 8 from the other, repeat.

    If you want writes in parallel, you'll probably need multiple filegroups and different tables in different filegroups. Then writes to table 1 in filegroup 1 can be parallel with table 2 in filegroup 2.

    Probably. Not behaviour I've played much with.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi thanks for your response.

    Could I spread each of those file groups over the 2 disks in the same way I currently have my primary spread over the 2 disks to retain the benefits of the additional read speed as well as gaining the additional write speed?

  • Sure. File in filegroup 1 on disk 1, file in filegroup 2 on disk 2. However a table (or at least a partition) must be in a single filegroup.

    this isn't something you do without a lot of testing and careful design.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi sorry,

    I meant more along the lines of

    FILEGROUP 1 (2 files) -> File 1 Disk 1, File 2 Disk 2

    FILEGROUP 2 (2 files) -> File 1 Disk 1, File 2 Disk 2

    If that makes sense. I ask because when you have one table spread over multiple files the read speeds increase dramatically. I want to retain this as well as allowing parallel writes if that makes sense?

  • Sorry I am being stupid. I have run perfmon on both disks measuring disc bytes write and monitored the activity.

    My endless insert statement doesn't cause any disk activity until the internal checkpoint runs. At this point all the data is written to both discs at the same time. So it looks like the writes do scale out over the discs as well.

    I was under the impression that an insert would be written straight to disk but I guess it doesn't need to do this for safety, as the log file already has the record of the transaction written to disk in case of disaster.

  • Only writes to the log are synchronous. Writes to the data file happen later.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply