FileGroups

Question

Post reply

FileGroups

recycled

SSChasing Mays

Points: 639
More actions
October 16, 2003 at 10:48 am

#66148

Does each file in a filegroup open a new thread for r/w or does the fact that multiple files in a filegroup are on different drives open up a new thread?

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

shahgols SSCrazy Eights Points: 9874 More actions · Answer 1

Could someone please help me with this, I need to set up a server today, asap. Thank you!

Allen Cui-55137 SSC Guru Points: 51650 More actions · Answer 2

If your database is very large and very busy, data has be spreaded into multiple data files, SQL Server will create multiple threads (one per physical file) to access the data. if each file were on its own separate disk or disk array, the performance would even be greater.

Of course there is a point where the additional threads aren't of much use when you max out the server's I/O.

dbadick SSC Journeyman Points: 82 More actions · Answer 3

I am wondering how large is very large? What is the rule of thumb to go to multiple files? Should we consider multiple files if we are on Raid 5 devices with 2 procs?

Thanks.

Jonathan SSC-Insane Points: 20427 More actions · Answer 4

quote:
I am wondering how large is very large? What is the rule of thumb to go to multiple files? Should we consider multiple files if we are on Raid 5 devices with 2 procs?
Thanks.

IMHO the use of multiple filegroups should be very rare. As it's easy enough to create a 3TB array, you seldom need them because there is not enough space for the data in a single array. The other valid reasons for filegroups are when you are distributing read-only data for attachment or when you need to support some complex backup or restore scenario.

It is a false economy to stripe the data by using files on different drives rather than using an array; even an NT logical array will perform better, although you should buy an array controller and let the hardware do it.

It's still possible to find recommendations for placing different tables and/or indexes on separate physical arrays, but just using one large array (with all the data striped across all the drives) will almost always be be faster overall because all tables and indexes will benefit (with no location tuning needed), and there will be no overhead from SQL Server's proportional fill.

E.g., if you have ten drives you could create two physical arrays with five drives or one array with ten drives. Let's use RAID 0 (the fastest, but no fault tolerance) for this example. If you put the critical table on one of the five-drive arrays and its nonclustered indexes on the other five-drive array, then a query using one of those indexes will be using all ten drives, taking advantage of reading simultaneously. But if you use just one file on a ten drive array, all ten drives will also be used simultaneously. For a query using the clustered index (and all noncovered queries eventually use a clustered index if there is one), that index (whose leaf level is the table itself) will only be spread across five drives in the first case, but will utilize twice that many spindles in the second case.

Multiple filegroups also use more processor and storage overhead and are more difficult to administer. I would consider using filegroups only if a single OLTP database will be more than a terabyte in size and is predicted to be disk-bound. In that case, I'd want multiple high-end array controllers for the data in addition to the dedicated controllers for executables, tran log, and tempdb.

--Jonathan