December 23, 2008 at 12:43 pm
Lots of great articles on the intertubez about disk alignment and RAID configurations, but I haven't found an answer (or a good way to test, and yes, I know about SQLIO) for a simple scenario:
Suppose I have 4 local physical disks available to me and I'm going to create an OLTP DB (for the sake of argument let's say we're 50/50 on reads and writes, or otherwise average usage). Here are two scenarios that I would consider:
1) Two RAID 1 drives. Create the DB with two data files, i.e. one data file on each drive
2) One RAID 10 drive. Create the DB with one single data file
In the RAID 1 scenario (as I understand it) SQL will round-robin writes between the two data files, thus creating a software equivalent of striping. RAID 10, on the other hand, handles the striping at the hardware level. Those differences aside, I haven't found a good technical explanation - or numbers to back it up - for what's happening under the covers that would make me believe one option is better than the other.
So which of these two scenarios is more ideal and why?
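For concreteness, here's a minimal sketch of how scenario 1 might be created (the database name, sizes, and drive letters D:, E:, and L: are all hypothetical; scenario 2 would be the same statement with a single data file on the RAID 10 volume):

-- Scenario 1 sketch: one data file on each RAID 1 volume (D: and E: are
-- hypothetical). Both files are in the PRIMARY filegroup, so SQL Server
-- spreads writes across them with its proportional fill algorithm.
CREATE DATABASE SalesOLTP
ON PRIMARY
    (NAME = SalesOLTP_Data1, FILENAME = 'D:\SQLData\SalesOLTP_Data1.mdf', SIZE = 10GB),
    (NAME = SalesOLTP_Data2, FILENAME = 'E:\SQLData\SalesOLTP_Data2.ndf', SIZE = 10GB)
LOG ON
    (NAME = SalesOLTP_Log,   FILENAME = 'L:\SQLLogs\SalesOLTP_Log.ldf',   SIZE = 2GB);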
December 23, 2008 at 1:06 pm
I would argue RAID 10 for one reason: you're not 100% sure of the usage and balance among the drives, and with two RAID 1s you can run out of space on one while still having space on the other. With one large RAID 10 that gets handled for you and you get to use all the space.
Other than that, I'm not sure there's a great technical argument for RAID 1 vs. RAID 10.
December 23, 2008 at 2:13 pm
You should worry about your Log file before you add a second data file.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
December 23, 2008 at 2:19 pm
RBarryYoung (12/23/2008)
You should worry about your Log file before you add a second data file.
Ummm....thanks....but kinda not the point of the post. I'm looking for technical reasons why a single data file on RAID 10 is better than two data files on two RAID 1's.
But FWIW I'll stick my log files on a different drive than my data files altogether. :hehe:
December 23, 2008 at 9:52 pm
I doubt that there is much difference from a performance standpoint ... the RAID 1 scenario would presumably require a tiny increase in system resources. RAID 10 would appear to be a little easier to manage inasmuch as you would be dealing with one file rather than two. If I remember correctly, fragmentation statistics aren't accurate for multiple files either.
The above all assumes that the two files are in one filegroup (as that would use the proportional fill algorithm). If one were to split the database into two filegroups (one file per filegroup), then there are other potential advantages to the RAID 1 scenario: filegroup backups, partitioning, etc.
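To illustrate that last point, here's a rough sketch of the one-file-per-filegroup variant (names are hypothetical and continue the earlier example), which opens up explicit object placement and filegroup backups:

-- Hypothetical sketch: the second RAID 1 volume (E:) gets its own
-- filegroup instead of being a second file in PRIMARY.
ALTER DATABASE SalesOLTP ADD FILEGROUP FG2;
ALTER DATABASE SalesOLTP ADD FILE
    (NAME = SalesOLTP_FG2_Data, FILENAME = 'E:\SQLData\SalesOLTP_FG2.ndf', SIZE = 10GB)
    TO FILEGROUP FG2;

-- Objects can now be placed explicitly on that filegroup...
CREATE TABLE dbo.OrderHistory
(
    OrderID   int      NOT NULL,
    OrderDate datetime NOT NULL
) ON FG2;

-- ...and the filegroup can be backed up on its own.
BACKUP DATABASE SalesOLTP
    FILEGROUP = 'FG2'
    TO DISK = 'B:\Backups\SalesOLTP_FG2.bak';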
December 24, 2008 at 2:19 am
Another possible implication of two RAID 1s would be if the two files were in the same filegroup with a massive table that was frequently being scanned. If I remember rightly, in that situation (a table scan against a table spread across more than one file) SQL Server will initiate multiple threads (one per file) and run the scans in parallel.
Mike
December 24, 2008 at 8:19 am
kendal.vandyke (12/23/2008)
I'm looking for technical reasons why a single data file on RAID 10 is better than two data files on two RAID 1's.
Ah, I see, I misunderstood the intent of your question. Sorry...
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
December 24, 2008 at 12:49 pm
I would say the RAID 10 array is best as it offers the best performance and fault tolerance combined into one package. The downside of RAID 10 is the disk cost (number of disks required).
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
December 24, 2008 at 3:37 pm
I would say the RAID 10 array is best as it offers the best performance and fault tolerance combined into one package. The downside of RAID 10 is the disk cost (number of disks required).
Maybe the question was misunderstood. I proposed two scenarios for how to configure 4 disks to hold data. A single RAID 10 with 4 disks is just as fault tolerant as two RAID 1 drives - each can lose 1 disk per pair. Likewise the disk cost is the same in the question I asked.
As for performance, I'm looking for something solid to show that RAID 10 would be better than RAID 1 or vice versa. I was really hoping someone knew enough about what's going on under the covers (e.g. IO paths, threads, etc.) to make it clear.
December 25, 2008 at 3:33 am
kendal.vandyke (12/24/2008)
Maybe the question was misunderstood.
Not at all.
RAID 10 will generally offer a performance boost over RAID 1 due to the striping across mirrored sets. Since SQL Server's writes to the data files are a fairly random affair, this suits SQL Server just fine. As RBarryYoung pointed out, the log file is more important, as it sees sustained serial writes, so the performance requirements are different there. It all depends on how many disks and controllers are used in the config, too. Also, do you have any baseline for the expected reads/writes? If not, it may be worth obtaining one.
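If the system (or a representative test box) is already running, one rough way to see that read/write mix per file is sys.dm_io_virtual_file_stats - a sketch only, and note the numbers are cumulative since the last SQL Server restart:

-- Cumulative read/write activity per database file since the last restart
-- (SQL Server 2005 and later).
SELECT  DB_NAME(vfs.database_id) AS database_name,
        mf.name                  AS logical_file_name,
        vfs.num_of_reads,
        vfs.num_of_writes,
        vfs.num_of_bytes_read,
        vfs.num_of_bytes_written
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
          ON  mf.database_id = vfs.database_id
          AND mf.file_id     = vfs.file_id
ORDER BY database_name, logical_file_name;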
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
December 27, 2008 at 5:03 pm
Perry Whittle (12/25/2008)
kendal.vandyke (12/24/2008)
Maybe the question was misunderstood.
Not at all.
RAID 10 will generally offer a performance boost over RAID 1 due to the striping across mirrored sets. Since SQL Server's writes to the data files are a fairly random affair, this suits SQL Server just fine. As RBarryYoung pointed out, the log file is more important, as it sees sustained serial writes, so the performance requirements are different there. It all depends on how many disks and controllers are used in the config, too. Also, do you have any baseline for the expected reads/writes? If not, it may be worth obtaining one.
Assuming that the system was running, how would you go about systematically getting baselines for reads/writes?
December 28, 2008 at 5:30 am
jlp3630 (12/27/2008)
Assuming that the system was running, how would you go about systematically getting baselines for reads/writes?
As you say, it depends on whether the system is already live. If you're scoping new disks for a current server then you should already have baseline data for disk I/O anyway. If you're scoping for a complete new system then the database vendor would be a good start (if it's a 3rd party app), or try it in a test lab if you have one.
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
December 28, 2008 at 8:57 am
Perry Whittle (12/28/2008)
jlp3630 (12/27/2008)
Assuming that the system was running, how would you go about systematically getting baselines for reads/writes?
As you say, it depends on whether the system is already live. If you're scoping new disks for a current server then you should already have baseline data for disk I/O anyway. If you're scoping for a complete new system then the database vendor would be a good start (if it's a 3rd party app), or try it in a test lab if you have one.
Perry,
Telling him that he should already have baseline metrics didn't really answer his question.
jlp3630, you said that your system was already running... here are some ways you can gather baseline metrics:
1) Set up a performance monitor counter log that writes to a SQL database. You can write ad-hoc queries to look at the raw data (see the query sketch after this list).
2) Use a tool like SQLH2 to gather perfmon metrics on a schedule, then use the reports that come with the tool to look at the numbers the tool collected. You can also use your own queries directly against the SQLH2 repository if you don't like the pre-canned reports.
3) Use a 3rd party tool like SQLSentry Performance Advisor or Idera SQL Diagnostic Manager
#1 & #2 are free but don't have the "polish" of the 3rd party tools.
December 28, 2008 at 9:55 am
I have a 20 GB database and 6 SAS disks.
The configuration is:
two disks work as a span (RAID 1)
another two are mirrors of the first two
another two are global hot spares
I tested it and it works like a charm. When a disk goes out, the controller takes a hot spare and automatically rebuilds the array.