August 4, 2003 at 4:21 pm
Hi
Well, I'm in the exciting position where I'm buying a new SQL cluster and hopefully some disk storage to match!
What I was wondering is given a choice of two storage subsystems, which would you choose and why? Assume that the SAN costs 50% more (it's a cheap SAN!)
Both options give a similar amount of usable storage (800GB or so) once you take into account the higher utilisation ratio of the SAN and the difference in RAID levels. (A rough sketch of the arithmetic follows the two options below.)
1. Direct-attach SCSI (IBM EXP300)
39 x 36GB 10,000RPM disks in three external disk units
RAID-5 for data and online backups
RAID-1 for log
2 hot spares defined for each array (IBM's rules for clustered arrays)
No controller cache because it doesn't work with clusters!
2. Entry-level SAN (IBM FAStT600)
14 x 73GB drives, 10,000RPM
2Gb/sec fibre-channel connection
All disks RAID-5 protected, one hot spare
256MB controller cache per controller
Options for 2 controllers to provide extra redundancy
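In case anyone wants to sanity-check my "800GB or so" figure, here's a rough sketch of the arithmetic (Python, purely illustrative; the SCSI option's exact array split isn't final, so I've only worked the SAN numbers):

code:
# Back-of-the-envelope usable capacity. Illustrative sketch only;
# it ignores formatting overhead.

def raid5_usable(disks, size_gb):
    # RAID-5 yields (N - 1) disks of space; one disk's worth goes to parity.
    return (disks - 1) * size_gb

def raid1_usable(pairs, size_gb):
    # Each RAID-1 mirrored pair yields one disk's worth of space.
    return pairs * size_gb

# Option 2 (FAStT600): 14 x 73GB drives, one hot spare,
# remaining 13 drives in a single RAID-5 array.
print(raid5_usable(13, 73))  # -> 876 (GB), i.e. "800GB or so"

The SCSI option lands in the same ballpark once its RAID-1 log pair, RAID-5 arrays, and hot spares are carved out of the 39 disks.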
Any replies greatly appreciated!
--
Si Chan
Database Administrator
August 4, 2003 at 4:40 pm
Well, I'd personally go the SAN route: much more room for growth, and setting up the cluster is much, much easier. Very few people support SCSI-attached clusters anymore. I've set up both types and had fewer problems going the SAN route. But I haven't dealt with the IBM solution, only Dell, HP, and Compaq.
Wes
August 4, 2003 at 4:46 pm
Direct attach is cheaper, but I've got an EMC SAN and I'm very happy with it. It's hard to compare apples to apples; the cache on the SAN really makes the difference. You still need sufficient available I/O, though: once you saturate the cache you're I/O limited again. Hopefully that never happens, but it will. I still configure logs on RAID-1; I see no reason to do any differently just because it's on a SAN (until someone convinces me otherwise!).
The why is harder to justify, especially in dollars. What sold us is that the CX400 can handle 4 enclosures; if we outgrow it, we can easily convert in place to the CX600. It's also nice to have options like snapshot and replication. For us it was the very high reliability and the ability to treat storage as a pool (as much as you can with current technology).
SANs are starting to support ATA and SATA drives now, giving you the ability to add cheaper space for data that isn't as performance-critical, such as file storage.
73GB drives are the minimum; I'd look at larger. Enclosures cost money, so get the biggest drives you can. I also like the 15k rpm drives; every little bit helps.
Andy
August 5, 2003 at 4:10 am
Thanks for the reply, Andy. The enclosure I'm looking at holds 14 drives, so 73GB drives will give us around 800GB usable capacity in RAID-5, assuming a single physical array (it should really be at least two arrays, I know!).
quote:
73GB drives are the minimum; I'd look at larger. Enclosures cost money, so get the biggest drives you can. I also like the 15k rpm drives; every little bit helps.
You say that bigger drives are better, but what is the effect on performance of reduced spindle count? Or does this not matter quite as much when you're talking about hundreds of megabytes of cache?
--
Si Chan
Database Administrator
August 5, 2003 at 5:05 am
quote:
Direct attach is cheaper, but I've got an EMC SAN and I'm very happy with it.
Although I have no personal experience with SANs, I thought I'd mention a more or less funny anecdote.
I just had a lunch break with our network admin. They were complaining about their EMC SAN and the third disk dying within ~4 weeks.
Cheers,
Frank
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/
August 5, 2003 at 6:16 am
I'm not advocating fewer spindles, just more space. Remember that disks get slower as they fill up, and that you'll probably want to back up to disk. Cache does offset spindle count to some degree, but I advocate being conservative.
We've lost one disk since our SAN has been up... a year? The hot spare handled it fine. We're using the 15k rpm drives, which run hotter, but either way you can expect a failure. The main thing is to make sure you know when the first one fails.
Andy
August 5, 2003 at 11:41 am
I've got a used EMC 3700 with 128 disks in it. It's slow; that aside, it's a trooper. No disk failures or hardware failures of any kind.
Wes
August 6, 2003 at 4:18 am
Si Chan,
I would opt for the FAStT600.
First of all, it is a Fibre Channel (FC) attached device, and as I see it, it will have a 2Gb/s FC interface. That is considerably faster than SCSI (Ultra160 tops out at 160MB/s, whereas a 2Gb/s FC link is roughly 200MB/s). And the FAStT600 has some controller cache too, even if a limited amount.
Do not forget that once you have an FC disk subsystem, you could start with a simple SAN (Storage Area Network). For that you will need an FC switch; preferably two, so you will be redundant.
From a performance point of view, I would use 15K rpm disks instead of 10K rpm.
Also, if you don't need too much disk space, I would opt for smaller but more numerous disks, so you can distribute your databases across dedicated physical disks.
And with more physical disks you could use RAID-10, which is overall the fastest RAID config (of course RAID-0 is faster, but you have no redundancy!).
Later on, if you run out of disk space, you can add expansion units to the FAStT600.
For background: I am running Sharks (IBM ESS), Compaq MA8000, and SCSI disk arrays for my clusters.
Definitely the SCSI one has the poorest performance.
Bye
Gabor
August 6, 2003 at 4:29 am
Hi
So things have progressed a bit, and one of the ideas we have is a FAStT600 with 8 x 146GB 10,000RPM drives.
Now, considering we need as a minimum:
(i) At least two arrays to separate data and logs
(ii) One hot spare
The only configuration I can reasonably get from this is:
SQL Logs: 2 x 146GB RAID-1 = 146GB
SQL Data: 5 x 146GB RAID-5 = 730GB
Hot Spare: 1 x 146GB
However, I am concerned about the performance of the log array, which effectively has only one spindle's worth of write throughput, although this will obviously be less of an issue if the controller does write caching.
Alternatives might include having 14 x 73GB drives instead (again at 10,000RPM), or using the SCSI solution (39 x 36GB disks).
Comments?
--
Si Chan
Database Administrator
August 6, 2003 at 9:28 am
Sorry that this is off-topic, but I have a RAID question. When you use RAID-5, isn't some percentage of the total space allocated to the checksum or whatever it's called? In your example of 5 disks, isn't 1/5 of the space for redundancy? That would yield only about 584GB of usable space for the data array.
August 6, 2003 at 9:40 am
quote:
When you use RAID-5, isn't some percentage of the total space allocated to the checksum or whatever it's called?
Yes! It's called parity, and it costs one disk's worth of capacity per RAID-5 array, so the usable space is actually only 584GB, not 730GB.
Another option might be to make two RAID-5 arrays, one for data and one for logs, by adding an extra disk to the log array.
This would improve the performance of the log array (roughly 1.5 effective spindles vs 1) at the cost of an extra drive.
This arrangement still theoretically underperforms the 14 x 73GB configuration, though.
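To lay the two layouts side by side, here's a quick capacity-only sketch (Python, illustrative; write performance is a separate question):

code:
# Candidate layouts for the 8 (or 9) x 146GB proposal. Capacities only.

def raid5_usable(disks, size_gb):
    return (disks - 1) * size_gb  # one disk's worth goes to parity

def raid1_usable(pairs, size_gb):
    return pairs * size_gb        # mirrored pair -> one disk usable

SIZE = 146

# Original proposal: 8 drives = 5 data (RAID-5) + 2 log (RAID-1) + 1 spare
print(raid5_usable(5, SIZE), raid1_usable(1, SIZE))  # 584 data, 146 log

# Extra-drive alternative: 9 drives = 5 data (RAID-5) + 3 log (RAID-5) + 1 spare
print(raid5_usable(5, SIZE), raid5_usable(3, SIZE))  # 584 data, 292 log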
--
Si Chan
Database Administrator
August 6, 2003 at 10:13 am
We've got a few SANs here, used for various things. Here's what we've seen:
Three run fairly flawlessly; one has had numerous failures. To be fair, that is the oldest one and uses a different technology: arbitrated loop rather than switched fabric on the fibre. However, it's getting replaced, and when it does go down, it affects a bunch of servers.
I'm not completely sold on SANs as better than direct-attach RAID, but as disk requirements get larger and larger, they do seem to be a better solution.
Steve Jones
August 6, 2003 at 2:42 pm
You may want to stay away from RAID-5 for your transaction logs. You want fast writes for logs, and RAID-5's weakness is writing (parity writes). RAID-1 or RAID-10 is best for transaction logs. FYI: if this is a cluster, MS also recommends having the quorum on its own RAID-1 disks.
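To put rough numbers on that, here's a sketch using the usual write-penalty rule of thumb: RAID-1/RAID-10 cost two back-end writes per logical write, while RAID-5 costs four (read old data, read old parity, write both). The per-disk IOPS figure is an assumed round number for a 10,000rpm drive, not a measured spec:

code:
# Effective random-write IOPS under the standard write-penalty model.

def effective_write_iops(disks, per_disk_iops, penalty):
    # Each logical write turns into `penalty` physical I/Os,
    # spread across the array's spindles.
    return disks * per_disk_iops / penalty

PER_DISK = 100  # assumed ~100 random IOPS for a 10,000rpm disk

print(effective_write_iops(2, PER_DISK, 2))  # RAID-1 pair:    ~100 writes/s
print(effective_write_iops(3, PER_DISK, 4))  # 3-disk RAID-5:   ~75 writes/s
print(effective_write_iops(4, PER_DISK, 2))  # 4-disk RAID-10: ~200 writes/s

Log writes are mostly sequential, so treat these random-write numbers as a conservative floor; the point is the relative RAID-5 penalty.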
August 7, 2003 at 4:51 am
I was surprised to read you can't use controller cache on clusters. Write-back cache makes a huge difference in our database environment (as with anything, yours may vary). But we are not clustered.
Our SAN is an old Dell model, and performance in general fell short of our expectations, but it was MUCH better than no-cache or low-cache controllers. We've since just bought big-cache, high-end direct-attach controllers and been very happy. But we are not clustering.
On our SAN we have never seen any obvious disk speed limitation. I say that because we started with RAID-5 for data and RAID-1 for index and temp/log space, and saw zero difference. We get a pretty flat maximum load, which looks to us like we're hitting either an I/O rate or controller bandwidth limitation (which might not vary with RAID type). So before just assuming RAID-1 or RAID-0+1 is faster than RAID-5, test your particular subsystem. You might not need to waste the space.
PS: I KNOW that RAID-1 and 0+1 are "faster" in terms of the disks, but the ONLY thing you are really concerned about is the performance of the entire subsystem, from the CPU through its PCI (or other) bus, the interconnect to the storage processors, SP speed, and the SPs' interconnect and bus structure to the disks. Whatever is the weakest link is your determining factor, and in our DELL (made by Data General) SAN it is not the disks.
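Agreed on testing. For anyone without a proper I/O benchmark to hand, even a crude timed-write loop gives a first-order comparison between two arrays (Python sketch; the file path is a placeholder, and this is no substitute for a real benchmark tool):

code:
# Crude sequential-write timing. Run once against a file on each array.
import os
import time

def timed_write(path, total_mb=256, block_kb=64):
    block = b"\0" * (block_kb * 1024)
    blocks = total_mb * 1024 // block_kb
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force the data out of the OS cache
    return total_mb / (time.time() - start)

print("%.1f MB/s" % timed_write("testfile.dat"))  # placeholder path
os.remove("testfile.dat")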
August 7, 2003 at 4:57 am
quote:
I was surprised to read you can't use controller cache on clusters.
This limitation only applies where the cache physically sits in the host's RAID controller. In that case, if there is a failover while unwritten data is still in the cache, that data is lost, so write caching must be disabled to prevent this from happening.
A SAN's cache is independent of the hosts (and typically battery-backed), so it is safe to use.
quote:
Whatever is the weakest link is your determining factor, and in our DELL (made by Data General) SAN it is not the disks.
This sounds promising! I'm hoping that I'm not replacing our infrastructure with something that is slower, that's all!
--
Si Chan
Database Administrator