December 3, 2003 at 7:59 am
Given the following constraints, what would be the best configuration for drives for a new SQL Server in a clustered environment?
I have 10 drives.
Here is the proposed configuration:
2 mirrored drives for the quorum
2 mirrored drives for the transaction logs
2 mirrored drives for tempdb
4 RAID 10 drives for the data
Would I gain or lose anything going this route, instead of taking the 2 mirrored drives for tempdb and adding them to the RAID 10 set?
I am configuring this for another company, and I do not have the option of spending any more money.
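For reference, here's roughly how I was planning to lay the files out across those arrays (the drive letters, paths, and database name below are only placeholders, not our real names):

-- Assumed lettering: Q: quorum, L: mirrored pair for logs, T: mirrored pair for tempdb, D: the RAID 10 set
CREATE DATABASE AppDB
ON PRIMARY (NAME = AppDB_Data, FILENAME = 'D:\MSSQL\Data\AppDB_Data.mdf', SIZE = 2000MB)
LOG ON (NAME = AppDB_Log, FILENAME = 'L:\MSSQL\Logs\AppDB_Log.ldf', SIZE = 500MB)

-- Move tempdb onto its own mirrored pair (takes effect when SQL Server restarts)
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\MSSQL\Data\tempdb.mdf')
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\MSSQL\Data\templog.ldf')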
December 9, 2003 at 12:30 pm
The only real answer is "test", but my two cents --
The biggest issue is what the drives are connected to. I've worked with a LOT of RAID controllers where RAID 0+1 and RAID 5 performed equally on write performance. That goes against all the standard advice, but in practice you run into limits on I/O processor speed, CPU -> controller speed, the impact of writeback cache, etc. So RAID 0+1 might or might not buy you anything relative to RAID 5. It actually MIGHT be slower for reads, since RAID 5 will spread any given read request over more spindles.
Also, you might want to look at the RAID controllers in general -- how the disks are divided among them (if there is more than one) can be quite important. Lots of writeback cache is good for tempdb and the transaction logs (and maybe some filegroups, if you write to them a lot).
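If you can get a representative workload onto the box, SQL Server 2000's fn_virtualfilestats will show how the reads and writes actually break down per file, which tells you which array is really doing the work (the database name here is just an example):

DECLARE @db int
SET @db = DB_ID('AppDB')
-- Cumulative physical I/O per file since SQL Server last started; -1 returns all of the database's files
SELECT DbId, FileId, NumberReads, NumberWrites, BytesRead, BytesWritten
FROM ::fn_virtualfilestats(@db, -1)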
How are the drives controlled?
December 9, 2003 at 3:09 pm
quote:
The only real answer is "test", but my two cents -- The biggest issue is what the drives are connected to. I've worked with a LOT of RAID controllers where RAID 0+1 and RAID 5 performed equally on write performance. That goes against all the standard advice, but in practice you run into limits on I/O processor speed, CPU -> controller speed, the impact of writeback cache, etc. So RAID 0+1 might or might not buy you anything relative to RAID 5.
If you're bottlenecked on I/O (typical of writes in an OLTP environment), then you've misconfigured something if RAID 5 is as fast as RAID 0+1 with everything else the same...
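The back-of-the-envelope arithmetic behind that (assuming something like 120 random IOs/sec per spindle, which is only a ballpark figure for drives of this class): a small random write costs 2 physical I/Os on RAID 0+1 (one per mirror) but 4 on RAID 5 (read old data, read old parity, write new data, write new parity). So for a six-drive array:

RAID 0+1: 6 x 120 / 2 = ~360 small random writes/sec
RAID 5: 6 x 120 / 4 = ~180 small random writes/sec

If both arrays top out at the same number, something other than the disks is limiting you.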
quote:
It actually MIGHT be slower for reads, since RAID 5 will spread any given read request over more spindles.
No; good array controllers perform "split seeks" across the mirrored pairs, so, with typical random multi-user I/O, a four-drive RAID 0+1 will be reading from more data blocks than will a four drive RAID 5.
quote:
Also, you might want to look at the RAID controllers in general -- how the disks are divided among them (if there is more than one) can be quite important. Lots of writeback cache is good for tempdb and the transaction logs (and maybe some filegroups, if you write to them a lot).
In tuning an OLTP application, you will usually find some percentage of cache for write (25-75%) will give the best performance. I usually start at 50% on the controller with the data array anyway, and on Bradley's system I assume there's only one array controller, so 50% write may well be best with the tran log array on the same controller.
Unless there is a significant amount of index maintenance using SORT_IN_TEMPDB (more common with DSS than OLTP), it's almost certainly best to just put tempdb on the same (six-drive?) array as the other database's data. That way everything gets a wider stripe, which helps with random I/O.
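(For reference, the SORT_IN_TEMPDB I'm referring to is the SQL Server 2000 index option; the table and index names here are made up:)

-- The intermediate sort runs in tempdb rather than in the user database,
-- which is when tempdb's disk placement starts to matter for index builds
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
WITH SORT_IN_TEMPDB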
Don't use filegroups unless you've got more data than can fit on one logical array, which is certainly not true in this case.
--Jonathan
December 10, 2003 at 5:10 am
quote:
If you're bottlenecked on I/O (typical of writes in an OLTP environment), then you've misconfigured something if RAID 5 is as fast as RAID 0+1 with everything else the same...
Well it depends on what you mean by "misconfigured". Example: I have a Dell PowerVault 650/630 SAN (I think Data General builds them). I configured a quarter of it as RAID 0+1 and the rest as RAID 5, and I've done a lot of work trying to improve its performance.
Our I/O tends to saturate with SQL Server at about 400-600 IOs/sec. It saturates at that point whether the load is read or write, RAID 5 or RAID 0+1. My theory is that either the interconnects are too slow (my bet) or the storage processors are too slow, but either way, in a high-I/O environment we get this nice level graph of operations per second. My conclusion is that RAID 5 updates are faster than some other limit sitting between the CPU and the disks, so using RAID 0+1 on that box is a waste of space. The host is a fast Dell 8540.
I've also seen somewhat more complex cases where slow CPUs in particular applications simply cannot keep a disk subsystem busy. That's less likely with today's hardware and SQL Server, but if you aren't generating more I/O (in particular, more output) than RAID 5 can handle without queuing up, then RAID 0+1 is just costing money. I am not saying they are as fast; rather, if RAID 5 is not your limitation given your typical workload, then RAID 0+1 buys you nothing. For example, if your typical workload doesn't fill the writeback cache to the point where I/O queues, RAID 5 is effectively just as fast.
That's why I started by saying "test".
quote:
No; good array controllers perform "split seeks" across the mirrored pairs, so, with typical random multi-user I/O, a four-drive RAID 0+1 will be reading from more data blocks than will a four drive RAID 5.
Good point. I don't know what I was thinking of. Thanks.
Incidentally, in re-reading this, another thought occurred to me. BradleyB suggested everything would be mirrored, meaning that 2x the usable space is needed. If you are going to do that, and there is only one RAID controller and path in use, I'd argue that you are better off with a single RAID 0+1 set.
Almost by definition, if you break the disks into multiple RAID sets in various pairs, you are going to have more hot spots than if the entire workload were spread across all of them. The normal reasons for breaking things up are specific hardware differences between the sets (separate controllers or paths), a limitation to work around (e.g. max disks in a RAID 5 set), wanting to tune one set differently from another (e.g. RAID type or some other parameter), or perhaps OS limits on how large a volume you can deal with.
If you end up limited by the physical disks (as opposed to controller or other limits), the more evenly you can spread the workload the better, and one big long stripe set is by definition the most even.
December 10, 2003 at 7:56 am
quote:
quote:
If you're bottlenecked on I/O (typical of writes in an OLTP environment), then you've misconfigured something if RAID 5 is as fast as RAID 0+1 with everything else the same... Well it depends on what you mean by "misconfigured". Example: I have a Dell PowerVault 650/630 SAN (I think Data General builds them). I configured a quarter of it as RAID 0+1 and the rest as RAID 5, and I've done a lot of work trying to improve its performance.
Our I/O tends to saturate with SQL Server at about 400-600 IOs/sec. It saturates at that point whether the load is read or write, RAID 5 or RAID 0+1. My theory is that either the interconnects are too slow (my bet) or the storage processors are too slow, but either way, in a high-I/O environment we get this nice level graph of operations per second. My conclusion is that RAID 5 updates are faster than some other limit sitting between the CPU and the disks, so using RAID 0+1 on that box is a waste of space. The host is a fast Dell 8540.
That certainly meets my definition of "misconfigured."
quote:
I've also seen somewhat more complex cases where slow CPUs in particular applications simply cannot keep a disk subsystem busy. That's less likely with today's hardware and SQL Server, but if you aren't generating more I/O (in particular, more output) than RAID 5 can handle without queuing up, then RAID 0+1 is just costing money. I am not saying they are as fast; rather, if RAID 5 is not your limitation given your typical workload, then RAID 0+1 buys you nothing. For example, if your typical workload doesn't fill the writeback cache to the point where I/O queues, RAID 5 is effectively just as fast.
Please note that Bradley only has six drives for all his data... If he's not bottlenecked by disk access, there's something seriously amiss. RAID 0+1 also provides better fault tolerance than RAID 5, so it's not "just costing money" even if we ignore the performance difference. This is a clustered system, so we know that high availability is desired.
quote:
That's why I started by saying "test".
I doubt that Bradley has the resources to properly test this. I felt compelled to post because your advice was, at best, off topic, given his question.
quote:
Incidentally, in re-reading this, another thought occurred to me. BradleyB suggested everything would be mirrored, meaning that 2x the usable space is needed. If you are going to do that, and there is only one RAID controller and path in use, I'd argue that you are better off with a single RAID 0+1 set. Almost by definition, if you break the disks into multiple RAID sets in various pairs, you are going to have more hot spots than if the entire workload were spread across all of them. The normal reasons for breaking things up are specific hardware differences between the sets (separate controllers or paths), a limitation to work around (e.g. max disks in a RAID 5 set), wanting to tune one set differently from another (e.g. RAID type or some other parameter), or perhaps OS limits on how large a volume you can deal with.
If you end up limited by the physical disks (as opposed to controller or other limits), the more evenly you can spread the workload the better, and one big long stripe set is by definition the most even.
No. Tran logs in a small non-replicated OLTP system belong on their own physical RAID 1 pair. There's a noticeable performance hit if you instead mix their sequential I/O into the random I/O of the database (users will actually experience "hiccups" across the entire system when checkpoints are written). We even put the tran logs from separate databases on their own dedicated RAID 1 arrays when optimizing larger systems. Logs should also be on physically different disks from the data for disaster recovery reasons. The quorum should also obviously be on a separate array. This means your advice to use "a single RAID 0+1 set" is essentially the same advice I gave, which you are apparently trying to disagree with.
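To make that concrete with Bradley's drive counts (the drive letters and names below are only placeholders): quorum on its own mirrored pair, tran logs on their own mirrored pair, and data plus tempdb on the remaining six-drive RAID 0+1. If a database's log file ends up on the wrong array, the SQL Server 2000 way to move it is detach, copy the file, and re-attach:

EXEC sp_detach_db 'AppDB'
-- copy AppDB_Log.ldf to the log array at the OS level, then re-attach
EXEC sp_attach_db 'AppDB',
    'D:\MSSQL\Data\AppDB_Data.mdf',
    'L:\MSSQL\Logs\AppDB_Log.ldf'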
--Jonathan
December 10, 2003 at 6:49 pm
Thanks. I appreciate the feedback and education. Can I get more?
I'd welcome your advice on our setup if you're interested, and maybe the discussion will be of interest to Bradley as well. If not, I apologize in advance for hijacking his topic. This is for a server we are buying right now; the workload currently runs on a smaller server.
4 reasonably busy databases, roughly 150G, 20G, 12G, and 12G, plus a few others that are smaller and less busy.
The two largest are the busiest. The largest is read-only, the 20G is probably 90% read, and the others are probably 99% read.
We're buying a Compaq DL740 with 4 processors for now (2.8G/2M), 32G of memory, 6402 controllers (max cache), and two shelves of 14 15K drives each, plus two other 36G drives to mirror for the system disk.
We had planned to put the system and pagefile on that mirror (it's also on a weaker RAID controller). The current system shows the C drive almost completely idle.
We had been thinking of just doing a RAID 5 set with the rest of the space, one on each controller, and spreading the data around.
From your description that sounds bad -- we should mirror a couple of drives for the log files. Would you put all the logs on one mirror set? I'd have to look, but I think we would probably be OK on one 36G mirror - most of these databases are set to truncate the log on checkpoint.
If I need 2 or 3 mirror sets for the logs alone, we're going to need to buy more disks to get enough space. Or I could put a couple more drives on the embedded controller; it is slower, but it is also relatively idle.
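Before deciding, I'll check what the logs actually look like today; something like this should do it (AppDB is just an example name):

-- Current size (MB) and percent used of every database's transaction log
DBCC SQLPERF(LOGSPACE)

-- Physical file names and sizes for one database
USE AppDB
EXEC sp_helpfile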
While we are on the subject -- what's your take on stripe chunk size? We've always made it equal to, or a multiple of, the volume's cluster size (and always a multiple of 8K), on the assumption that allocation clusters then fit neatly within a chunk, so a small I/O doesn't hit two disks because an 8K block got split. Are we on track there?
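For reference, the arithmetic I'm basing that on (just SQL Server's standard sizes, nothing measured on our system): a page is 8K and an extent is 8 pages = 64K, so most of SQL Server's I/O comes in 8K and 64K units.

chunk of 64K or larger: an extent-sized read can usually be satisfied by a single disk (assuming the extent doesn't straddle a chunk boundary)
chunk of 8K: a 64K extent read gets split across up to 8 disks
chunk of 16K or 32K: a 64K extent read still spans 2-4 chunks, and so 2-4 disks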