December 30, 2008 at 4:43 am
RBarryYoung (12/29/2008)
No, RAID 1 has twice the Read IO throughput of a single disk.
How can that be, Barry?
RAID 1 can only have two drives, and one is dedicated to mirror the other.
So for reading purposes this is the same as a single drive, is it not?
/Kenneth
December 30, 2008 at 6:10 am
Kenneth Wilhelmsson (12/30/2008)
RBarryYoung (12/29/2008)
No, RAID 1 has twice the Read IO throughput of a single disk.How can that be, Barry?
RAID 1 can only have two drives, and one is dedicated to mirror the other.
So for reading purposes this is the same as a single drive, is it not?
/Kenneth
To the OS the mirrored pair is presented as a single disk, but under the covers the drives in the mirrored pair hold the same data so the array controller can read different pages off each disk at the same time, thus doubling the read throughput.
December 30, 2008 at 6:59 am
Right. What Kendall said. 🙂
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
December 30, 2008 at 7:37 am
I don't want to be rude here guys but talk about the blind leading the blind!! There are a number of excellent white papers available on tuning SQL2005 including the disk subsystem ( from Microsoft ) and there's the excellent sql 2000 performance tuning handbook from ms press which goes into much of what you're incorrectly quoting. ( yes I know it's 2000 but most is still relevant )
with 4 disks you'd best put them to one raid, it's really not enough spindles to make any difference. I've been performing performance benchmarks on various hardware, mostly sans, and although I've not published some results yet I've not been able to produce any significant difference in performance using multiple files and/or filegroups, in some case multiple files produced a performance drop.
There is no "overhead" with raid 1 or raid 10 - it's just you loose 50% of your disks, unlike raid 5 or 6 where there is a serious write overhead.
Multiple files and filegroups are best suited to large databases where there are technical reasons for working with the data that way; each time you add files and filegroups you increase the complexity of the database management.
http://sqlblogcasts.com/blogs/grumpyolddba/
http://www.microsoft.com/technet/prodtechnol/sql/2005/physdbstor.mspx
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
December 30, 2008 at 8:55 am
There are a number of excellent white papers available on tuning SQL2005 including the disk subsystem ( from Microsoft )
I've read the whitepaper but whitepapers tend to be general in nature so I asked a specific question about a setup that I'd bet a lot of people are running.
with 4 disks you'd best put them to one raid, it's really not enough spindles to make any difference. I've been performing performance benchmarks on various hardware, mostly sans, and although I've not published some results yet I've not been able to produce any significant difference in performance using multiple files and/or filegroups, in some case multiple files produced a performance drop.
I've been running some tests myself, and although I haven't published anything yet I did come across something very interesting that no one has mentioned. Long story short, for the test scenario I described (4 local disks) if you use RAID 10 with a 32KB alignment you will get worse performance on 64KB random reads and 8KB random writes than if you use two RAID 1's with a 32KB alignment. Surprisingly, using a 64KB offset on the RAID 10 config brought it up to equivalent performance as two RAID 1's with a 32 KB offset. The only caveat is that in all cases RAID 10 had a lower average latency than RAID 1. Now this was specific to my hardware so mileage may vary. I will try to have something published soon.
There is no "overhead" with raid 1 or raid 10 - it's just you loose 50% of your disks, unlike raid 5 or 6 where there is a serious write overhead.
You're right, the wrong choice of words was used to convey the disk cost involved.
December 30, 2008 at 9:12 am
According to this article http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/08/17/948.aspx
SQL Server will write equal amount of data into data files of the same filegroup, provided that amount of free
space in them is same. Otherwise more data is written to the file with more free space. I could not find an
exact correlation between amount of free space/amount of data written. In real world situation files are going to be different.
Hence, the load on the disk hosting the file with more space will be bigger.
Now, how does RAID0 (RAID 10 in this scenario) work? For reads and writes that are larger than the stripe
size, such as video playback, the disks will be seeking to the same position on each disk, so the seek time of
the array will be the same as that of a single drive. For reads and writes that are smaller than the stripe size,
such as database access, the drives will be able to seek independently. This is provided that stripe size is not
less than the page size (8kb currently;SQL Server engine works on the page level).
WRITING DATA[/u]
When using RAID 10, we have almost ideal situation where data written to one single data file is being automatically
written to two disks at the same time, giving us twice the performance of a single disk. With two datafiles on
separate RAID1 disks this can only be achieved by having equal amount of free space in each file. (I am talking
about files part of the same filegroup, otherwise its not going to work at all)
conclusion[/i]
Although we can reach almost same performance in both configuration, RAID 10 will require less administrative overhead.
READING DATA[/u]
Reading will virtually be same in both configurations. However, for two RAID1 we need to make sure that files have same amount
of free space, otherwise more data will end up on one of the drives.
conclusion[/i]
As for writing, RAID 10 will require less effort.
Considering all of the above, two RAID 1 disk will perform worse.
December 30, 2008 at 10:22 am
interesting links, so basically if it all went horribly wrong and 1 file filled up for some reason all the data would be written to just 1 remaining data file. That would make this pointless in the first place wouldnt it?
as i have already said i'd go with the RAID10 array straight off. Its my understanding (rightly or wrongly) that having multiple files and\or filegroups is only recommended for situations where table or index storage differs. i.e. separating indexes from tables (clustered exceptions obviously) to make rebuilding more efficient or separating large read only tables from the database so your not backing up a 10GB static table every day, etc
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
January 9, 2009 at 1:37 am
There's a bit of bad information in this thread - and as of yet no-one has answered the OP's specific question. Unfortunately I don't have a complete answer either - it's a broad question (eg "it depends"), but here are some points that may help:
[nitpick]
No, RAID 1 has twice the Read IO throughput of a single disk...
This is not necessarily true - though most RAID controllers I've ever used (mainly Dell PERCs of various generations) do pull data from both halves of the mirror if possible, there are controllers out there that don't. Let's assume for argument's sake though that the OP is talking about a modern hardware RAID controller that's capable of both RAID1 and RAID10, and wants to know whether there are any advantages to either config with the same hardware.
[/nitpick]
Leaving aside partitioned tables there is no special performance advantage to splitting a DB into multiple files on different volumes for normal IO (and we're also not talking about multiple filegroups and moving "hot" tables/indexes into different filegroups from the others). However backup and restore operations do benefit from having your DB in multiple files on different volumes as the reader/writer threads are allocated per device (drive letter or mountpoint). The best article on this I've found is from PSS here, with supporting info
So if backup/restore performance are a key criteria then there is an advantage to the OP's option #1 (RAID1). But as Michael Valentine Jones points out you can easily do this with the RAID10 option too.
Regards,
Jacob
January 9, 2009 at 1:51 pm
It's a total nightmare with regard to disk performance, however let's get a few facts set. There is no write gain on a raid 1, it's the same as a single disk, some controllers can read from both sides of the mirror thus improving read performance, but it's unlikely you could measure this and it's probably unlikely with san or nas storage.
4 spindles set in a raid 10 will have far superior write performance then if you set them to a raid 5, but will have slightly less read performance.
As far as multiple files are concerned ( not filegroups ) I've been testing on DAS and 3 different SANs using 16 processor cores and 8 files in the primary filegroup. The tests consist of populating 4 tables each with 1 millions rows of data; 34GB of data including a number of indexes and a text column. Second test has been to update every row in a random order using 2 cursors per table, it's about 30 million seperate io ops. I haven't been able to record any measurable better performance for any test, and in some cases the multiple files have degraded performance. On one san I have aligned partitions with 64kb block size, 128kb stripe, raid 10 dedicated spindles; I also have some 4kb block size non aligned partitions also raid 10 ( don't ask !! ) Again I've not been able to measure any performance difference. This is using about 32GB of data for tests.
To be honest it's not very often a DBA gets to be able to do these types of tests:- what I seem to be seeing is a lack of proof for many of the "recommendations" I have been using for many years.
The only concrete difference I can see is that a lesser amount of spindles set as internal raid 10 will outperform all three SANs for the same test.
I stand by my original comment that 4 spindles does not really warrant any discussion; however this has been a very interesting thread.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
January 10, 2009 at 5:25 am
Not going into any SQL specifics in terms of how it will write data to your data files, filegroups, etc...
RAID 1 will give you 1x write performance and 2x+ read performance.
RAID 10 will give you faster write performance and very fast read performance
Of course, the type (ATA, SATA, SCSI, etc...) and speed of your disks (7200, 10k, 15k) as well as the controller affect this as well. Also, it doesn't take into account controller cache - speed and size. The higher percentage of time you can read from the cache, the faster your IO will be. If you have a cache miss, then it will be back to disk speed. Similarly, the more you can write just to the cache, the faster your writes will be. If you cause a cache flush due to a filled cache, then again you are back to disk speed.
You also haven't mentioned if you are using software or hardware RAID.
RAID 10 is twice the price of RAID1. However, as you've mentioned, in your case - you already have 4 disks and are contemplating 2 x RAID 1.
Your comment about failure isn't strictly correct. You can have two disks fail in your 2 x RAID 1 scenario, but only if the disks fail in the right place. ie 1 in either RAID set.
If you have a decent controller, the money and the disks, I would recommend RAID 10 over 2 x RAID 1. You will get increased read and write speeds over RAID 1 as well as increased fault tolerance.
Of course, logs should be taken into account also.
--
Andrew Hatfield
January 10, 2009 at 5:35 am
I suppose it would have helped had I seen that there are 4 pages of this thread, and read them, prior to posting. my bad.
One thing that hasn't been mentioned, at least with SQL 2005 on Windows 2003, is that each volume will get a separation SQL IO thread. Note volume; not file, not filegroup, not partition, not disk, not raid set. Each volume.
Again, variables are disk speed, size and type. Controller type, number of channels, size and speed of cache.
As always, test. Your mileage may vary.
--
Andrew Hatfield
January 10, 2009 at 9:21 am
colin Leversuch-Roberts (1/9/2009)
On one san I have aligned partitions with 64kb block size, 128kb stripe, raid 10 dedicated spindles; I also have some 4kb block size non aligned partitions also raid 10 ( don't ask !! ) Again I've not been able to measure any performance difference. This is using about 32GB of data for tests...........what I seem to be seeing is a lack of proof for many of the "recommendations" I have been using for many years.
Colin, this too is very much what i have found. For all the recommendations\best practices there are lots of reports of "no measurable gains".
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
January 12, 2009 at 7:54 am
I guess taking this along to its conclusion...
assuming you have a nice big das array of 56 disks...
Assuming you keep 2 back as h/s:
Would you want to go with 27 r1 mirrors, or 1 r10 with 54?
January 12, 2009 at 4:21 pm
RichardB (1/12/2009)
...assuming you have a nice big das array of 56 disks...Assuming you keep 2 back as h/s:
Would you want to go with 27 r1 mirrors, or 1 r10 with 54?
1) If we're still talking about the server hosting only 1 DB in a single filegroup:
a) If backup and restore performance was paramount and you don't care about ease of management, go for 27 RAID1 volumes, otherwise go for 1 RAID10 volume.
2) If however this is a real world server with a number of different DBs and filegroups with different IO patterns:
You may want 27 RAID1 volumes - this allows you to seperate the IO of different DBs or filegroups explicitly. Personally I'd prefer a mix of RAID10 (for the larger tables and to group common objects with similar IO) and RAID1 to break out specific "hot" tables that my typical query patterns use often in conjunction with the other common tables. If I had the luxury of 56 spindles I'd probably have 2 larger RAID10 volumes and a handful of RAID1 volumes with certain filegroups on them, depending on what DBs the server hosted.
Regards,
Jacob
February 18, 2009 at 3:19 pm
FYI I finally got the chance to post the results of my RAID 1\5\10 tests.
The entire series is here: http://kendalvandyke.blogspot.com/2009/02/disk-performance-hands-on-series.html
Part 6, specifically about RAID 10 vs. RAID 1, is here: http://kendalvandyke.blogspot.com/2009/02/disk-performance-hands-on-part-6-raid.html
Enjoy!
Viewing 15 posts - 31 through 45 (of 56 total)
You must be logged in to reply to this topic. Login to reply