September 29, 2008 at 3:49 pm
We have a 'performance expert' working with one of our more independent development areas, and one of his recommendations is to set the number of files per filegroup equal to the number of physical CPU cores, EVEN WHEN THESE ARE ALL ON THE SAME DRIVE. He claims his testing shows a 20% increase in throughput from doing this.
This goes against my experience and expectations so I would be interested in any input anyone has on this subject.
I doubt this theory even more having seen that he has set up multiple log files as well, which I know is wrong. On top of that, the database he is using for testing has over 100 files and 20+ filegroups, and a database like that would never go into production whilst I have a say in it.
If you have any experience of multiple files on one disk, please post with your views.
---------------------------------------------------------------------
September 29, 2008 at 3:56 pm
Just saw a reference to this very thing in SQL Server Magazine that might be of interest.
http://blogs.msdn.com/psssql/archive/2007/02/21/sql-server-urban-legends-discussed.aspx
September 29, 2008 at 4:27 pm
Todd,
Thanks for the quick reply. The article would seem to confirm what I believed: apart from tempdb, multiple files per filegroup give no performance benefit if they are on the same drive.
So I wonder why he claims 20% improvement.
Any other takers?
---------------------------------------------------------------------
September 30, 2008 at 3:57 am
george sibbald (9/29/2008)
We have a 'performance expert' working with one of our more independent development areas, and one of his recommendations is to set the number of files per filegroup equal to the number of physical CPU cores, EVEN WHEN THESE ARE ALL ON THE SAME DRIVE. He claims his testing shows a 20% increase in throughput from doing this.
That is specifically for tempdb, and it often only shows real performance improvements if there is existing blocking on the allocation structures in tempdb. It is good practice to have more than one tempdb file. I usually start with a number of files = 1/2 the number of CPUs, split further if I see blocking, and move the files onto separate drives if I see an IO bottleneck on the tempdb drive.
So I wonder why he claims 20% improvement.
Ask him to set up a test to prove it.
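For reference, that starting point (files = 1/2 the CPUs) is just a handful of ALTER DATABASE statements. This is only a sketch assuming an 8-core server; the file names, paths and sizes are placeholders you would adjust for your own box:

```sql
-- Sketch: 4 tempdb data files for an 8-core server (files = 1/2 the CPUs).
-- Size all files equally so proportional fill spreads allocations evenly.
-- Paths and sizes below are placeholders, not recommendations.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 256MB);

ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = 'T:\tempdb2.ndf',
              SIZE = 1024MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev3, FILENAME = 'T:\tempdb3.ndf',
              SIZE = 1024MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev4, FILENAME = 'T:\tempdb4.ndf',
              SIZE = 1024MB, FILEGROWTH = 256MB);
```

Tempdb file changes take effect after the next service restart, so plan for one.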
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 30, 2008 at 4:09 am
Thanks Gail. I will be asking him for proof because I don't want this setup going into production.
Have you ever come across an application database with this many filegroups and files?
george
---------------------------------------------------------------------
September 30, 2008 at 4:29 am
Not one of that size. I believe the usual recommendation for user databases is a number of files no more than 1/4 to 1/2 the number of CPUs. You definitely don't want multiple log files; they make admin more complex and won't gain you anything, because log files are used one at a time, not in parallel.
The biggest I've seen is a 1.2 TB DB with 6 filegroups (and 6 files). Primary with just the most important of the app's tables, one for heavily read tables, one for heavily written, 2 for indexes and 1 for blobs.
iirc, the PSS guys have written a couple more related blog posts. I'll check later.
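A layout along those lines can be declared at database-creation time. The sketch below is hypothetical (database name, filegroup names, paths and sizes are all made up for illustration), but it shows the shape of a primary-plus-purpose-built-filegroups design like the one I described:

```sql
-- Hypothetical multi-filegroup layout: PRIMARY for core tables, separate
-- groups for read-heavy tables, write-heavy tables, indexes, and blobs.
-- All names, drive letters and sizes are illustrative only.
CREATE DATABASE BigApp
ON PRIMARY
    (NAME = BigApp_Core,  FILENAME = 'D:\Data\BigApp_Core.mdf',  SIZE = 50GB),
FILEGROUP FG_ReadHeavy
    (NAME = BigApp_Read,  FILENAME = 'E:\Data\BigApp_Read.ndf',  SIZE = 200GB),
FILEGROUP FG_WriteHeavy
    (NAME = BigApp_Write, FILENAME = 'F:\Data\BigApp_Write.ndf', SIZE = 200GB),
FILEGROUP FG_Indexes
    (NAME = BigApp_Ix,    FILENAME = 'G:\Data\BigApp_Ix.ndf',    SIZE = 100GB),
FILEGROUP FG_Blobs
    (NAME = BigApp_Blob,  FILENAME = 'H:\Data\BigApp_Blob.ndf',  SIZE = 300GB)
LOG ON
    (NAME = BigApp_Log,   FILENAME = 'L:\Log\BigApp_Log.ldf',    SIZE = 50GB);

-- A table lands on a filegroup via the ON clause of its clustered index:
-- CREATE TABLE dbo.Orders ( ... ) ON FG_WriteHeavy;
```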
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 30, 2008 at 1:47 pm
That is specifically for tempdb, and it often only shows real performance improvements if there is existing blocking on the allocation structures in tempdb. It is good practice to have more than one tempdb file. I usually start with a number of files = 1/2 the number of CPUs, split further if I see blocking, and move the files onto separate drives if I see an IO bottleneck on the tempdb drive.
I just inherited a server with multiple tempdb files. Is there still a benefit if they're on the same drive?
September 30, 2008 at 2:09 pm
Only in terms of reducing blocking on the allocation pages of TempDB.
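You can check whether that contention actually exists before touching anything. PAGELATCH waits in database 2 (tempdb) against the PFS/GAM/SGAM pages are the telltale; this query is a quick sketch of how to look for them while a workload is running:

```sql
-- Look for latch waits on tempdb allocation pages. The resource is
-- formatted dbid:fileid:pageid; database 2 is tempdb, and pages 1, 2
-- and 3 of each file are the PFS, GAM and SGAM allocation pages.
-- PAGELATCH_* waits on those pages indicate allocation contention
-- that extra (equally sized) data files can relieve.
SELECT  session_id,
        wait_type,
        wait_duration_ms,
        resource_description
FROM    sys.dm_os_waiting_tasks
WHERE   wait_type LIKE 'PAGELATCH%'
  AND   resource_description LIKE '2:%';
```

If this consistently comes back empty under load, adding more tempdb files to the same drive is unlikely to buy you anything.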
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 30, 2008 at 4:36 pm
george sibbald (9/30/2008)
Thanks Gail. I will be asking him for proof because I don't want this setup going into production.
In fact, ask him for a copy of the test protocols so that you can try it yourself. Even better, post them here so that we can try to reproduce them. I would love a chance to analyze that.
-- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc. "Performance is our middle name."
October 1, 2008 at 3:18 am
rbarryyoung (9/30/2008)
george sibbald (9/30/2008)
thanks Gail. I will be asking him for proof because I don't want this setup going into production.
In fact, ask him for a copy of the test protocols so that you can try it yourself. Even better, post them here so that we can try to reproduce them. I would love a chance to analyze that.
I will have a go; the guy has gone all defensive already.
As a complication, the files are on a RAID 5 SAN array.
What's your initial take on the physical design, same as mine?
---------------------------------------------------------------------
October 1, 2008 at 7:30 am
Yes, RAID 5 is contraindicated. Best practice is RAID 10.
-- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc. "Performance is our middle name."
October 1, 2008 at 10:43 am
rbarryyoung (10/1/2008)
Yes, RAID 5 is contraindicated. Best practice is RAID 10.
Sorry Barry, my fault; I meant the database design: so many filegroups and files on the same drive.
---------------------------------------------------------------------
October 1, 2008 at 12:40 pm
I once saw a performance improvement with an OLTP database on RAID 5 when the files were split in two and some of the most commonly used indexes were moved into the separate file. Same RAID array, but a slight performance improvement from splitting tables (clustered indexes) away from non-clustered indexes.
I can't give a percentage increase on that one; I just remember that it did improve. It might have been luck, since the split may have moved some of those indexes onto different platters than the tables, just because of the way RAID 5 works.
Other than that, I've not seen an improvement from file splitting unless it was specifically to separate disks, or was specifically regarding tempdb.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
October 2, 2008 at 10:02 am
Two very important points:
1) having multiple files for the log is just plain bad practice. Do not do that.
2) You said multiple files on the same drive in the OP, but later stated (I think it was you) that it was a RAID 5 volume on a SAN. These two things are NOT equivalent. A single PHYSICAL drive is one spindle, and it can easily perform WORSE with lots of files on it, because the head has to thrash back and forth trying to keep up with multiple asynchronous IO requests. However, a RAID 5 SAN volume could have 5, 10 or more PHYSICAL drives behind it, in which case multiple files may well lead to improved IO performance.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service