August 8, 2008 at 3:34 am
Hi there,
EDIT: first off; if this is the wrong place please let me know and I'll move elsewhere. I also know you'll all want to say "it depends". 😉
I have noticed that two of our major systems (OLTP and Datawarehouse) has severe I/O bottlenecks and I'd like to make some suggestions to my colleagues. We will be installing an extension to our SAN system in the next 4 weeks or so, so this is a good time to bring the ideas to the table.
I have been searching the web and cam across the two terms BAARF and SAME.
BAARF is a group of people who HATE RAID 5 and all varations of it.www.baarf.com They seem to support the idea of RAID 01 or RAID 10 and mention the term SAME. http://www.miracleas.com/BAARF/oow2000_same.pdf
SAME is an oracle backed idea, saying "use RAID 10, but use just one big RAID 10 setup". It seems a bit radical (but strangely appealing and sensible) to me and was wondering what anyone here thinks to it.
Although the idea is based on oracle and how it works, the ideas would be transferable to MS-SQL Server too.
Could anyone provide me with their thoughts on this? Is the premise of "throw everything on a huge RAID10 and forget about fine-grained partitioning etc" a good thing, or is it really better to carefully select and partition data off into smaller subsets?
I know I don't provide specifics to how our systems are setup and what workloads they have. I am just looking for a general consensus on how much time/energy is/should be put into the consideration for RAID storage for OLTP and Datawarehouse applications, or at least a "quiet you, this SAME thing is a load of b******s, now go back to sleep and let the real men do their work".
Well, thanks for any input you provide.
Regards,
GermanDBA
Regards,
WilliamD
August 8, 2008 at 9:08 am
For most orgaizations, the determination is cost more than anything. It has less to do with a "technology" decision and more of a "business" decision.
Consider the costs of increasing historical data that's rarely queried being stored on the fastest disks on the most expensive SAN technology. Now, if your tables were partitioned, you could potentially store that data on "slower" disks on a cheaper SAN, reducing the spindle counts and the spindle depth for the older data. Yes its still queryable, just slower to read. But its a whole lot cheaper to store, especially when you're storing a ton of data.
You could potentially implement two different SAN technologies to store your data to better manage the storage costs and implement more of a storage "solution" that's cost effective for your business. In the storage management circles the terminology is as follows...
Online : This is your "hot" data or the data that's used more frequently.
Near-Line : This is the "middle ground" for data that's still queried, but not as frequently. For example, your records management policy may dictate you keep records for seven years.
Off-Line : This is the "cold" storage area. The records may not appear in the index anymore, but could potentially be "re-mounted" at a later date.
Usually cost drives the solution that best fits your business model. Yes I know hard drives are cheap, but when you consider the costs of someone managing all those spinning disks, the operational costs to keep them happy, faster disks that fit into expensive SANs DO cost a lot more, and you're storage requirements will never decrease... All those factors make the price tag pretty high.
August 8, 2008 at 9:46 am
SAME is certainly very easy to implement.
I wouldn't use it, for the simple reason that having data and log files on the same array limits your recovery options in the case of disk failure.
My favorite set up is RAID-1/10/01 for data (OLTP), separate RAID-1/10/01 for logs (OLTP), and separate again for tempdb. If the OS and executables can also be separated from these, that's good, but that doesn't matter as much, since the OS is pretty much going to be RAM resident once you boot up.
For OLAP, I've found RAID-5/6 perfectly adequate, since writes tend to be large and infrequent. Raid 10 can be used for OLAP, and can speed reads up somewhat, but it's kind of expensive for a very small gain. If you have a separate SQL Server instance/box for OLAP than for OLTP, you'll still want tempdb on RAID-1/10/01, since that's all about lots of writes, and it will be used for temp tables, CTEs, etc.
So, yeah, SAME would work, but I'd tend away from it for those reasons.
On the other hand, splitting multiple databases across multiple disk arrays manually is a lot of work for a small gain, and using the SAME principles to that extent makes a lot of sense. I just wouldn't keep tempdb and logs on the same spindles as data files, if I can avoid it.
Oh, and backups definitely need to be on different disks. That kind of goes without saying, but in my firm belief in over-communicating, I'll say it anyway.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply