August 21, 2012 at 3:19 pm
My data files and tempDB file (single temp file) share the same RAID10 LUN and our SAN controller is saturated/overloaded. When there is extreme load put on the LUN, the server OS will lose connection to the drives and SQL crashes.
It crashes each time I run a db integrity check on my largest db even when there's minimal system activity. The OS lost connection to the SAN during production hours last week, under heavy load.
Can't buy more spindles, but have an opportunity to re-allocate spindles to build a dedicated tempDB LUN, but there's only enough drives for a 250gb'ish LUN at RAID5.
Based on my 365 day report (attached) comparing reads/writes/trans, does RAID5 seem safe for the tempDB? Overall the prod user db gets about 30% more write activity than temp, but temp peaks about 30% higher - rarely.
Should I consider a smaller LUN at RAID10 for the tempDB? It RARELY exceeds 50gb's. My goal is to relieve some pressure on the large RAID10 data LUN by moving the tempDB.
Would appreciate any thoughts or suggestions...
August 22, 2012 at 12:24 pm
After I had an actual conversation with my SAN guy, it looks like I will be able to scrape up enough spindles to run RAID10 at 200gb's for the tempDB.
I would feel more comfortable with a larger LUN, but we'll see how it goes...
August 22, 2012 at 3:20 pm
Raid 10 is a better option if you have enough disk space.The write are done faster on raid 10 beacause unlike raid 5 there's no need to calculate the parity.So the overall system performance would be degraded if you choose Raid 5 for tempdb.
Pooyan
August 22, 2012 at 3:27 pm
pooyan_pdm (8/22/2012)
Raid 10 is a better option if you have enough disk space.The write are done faster on raid 10 beacause unlike raid 5 there's no need to calculate the parity.So the overall system performance would be degraded if you choose Raid 5 for tempdb.
This now depends on your equipment and software as much as anything else. RAID 5 has been better optimized, and current speed tests (particularly on big iron with huge caches) show negligible differences under the majority of circumstances. RAID 10 is still SAFER, as you can handle a double drive loss as long as it's not the same stripe, but the speed difference has significantly dwindled. The older your equipment/software, the more likely pooyan's point will be true, however.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
August 22, 2012 at 4:08 pm
I guess a question on my mind at this point is why anyone would need a 200GB TempDB.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 22, 2012 at 4:25 pm
Jeff Moden (8/22/2012)
I guess a question on my mind at this point is why anyone would need a 200GB TempDB.
It's unfortunately not hard to need it on an overly active system that's utterly out of scope and you're not allowed to fix it. Between table spools for sorts, RCSI, reports being built off the OLTP system, and any other number of things you can slam a TempDB FAR too easily off a half a tera database. I'm staring at one that does it daily.
Vendor DB, of course. To the vendor's credit we ARE two versions behind and they've stated they never expected the system to handle our volume.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
August 22, 2012 at 10:47 pm
Some advice for the OP, if you can take down your SAN by slamming it too hard, you should consider reducing your hba queue depth to limit the number of outstanding io requests. you might take a hit on max throughput, but if the san can't handle it anyway, you're not losing much.
August 23, 2012 at 2:31 pm
SpringTownDBA (8/22/2012)
Some advice for the OP, if you can take down your SAN by slamming it too hard, you should consider reducing your hba queue depth to limit the number of outstanding io requests. you might take a hit on max throughput, but if the san can't handle it anyway, you're not losing much.
Moved it to the new LUN, but I won't know if it helped until this weekend when I run something really I/O intensive.
Will report back...
August 27, 2012 at 2:50 pm
This seems to have remedied the problem. DB integ check finished without crashing for the first time in a long time.
August 28, 2012 at 11:08 am
Thank you for reporting back - did performance improve as well as the crash frequency being reduced?
Why would you need a 200GB tempdb? Index creates and/or rebuilds on 86.2GB-1.2TB tables, of course!*
Check your metrics on the SAN - is it heavily loaded in terms of IOPS, in terms of throughput, or both? If it's IOPS, you've done well to get new spindles also on the SAN. If it's throughput, then a better solution is to provide enough IOPS and space locally, which will give you the tempdb performance you need, as well as free up SAN throughput for the rest of your (and other) uses.
Local SSD's in RAID1 or RAID5 are excellent for tempdb use with modern controllers - and yes, if you benchmark them, you'll see that RAID5 performance and RAID10 performance are pretty close, and if you're using that few spindle disks, either one with a handful of good SATA or SAS SSD's (3 100GB SSD's in RAID5, say) should trounce whatever performance you're getting now.
August 28, 2012 at 12:10 pm
Jeff Moden (8/22/2012)
I guess a question on my mind at this point is why anyone would need a 200GB TempDB.
That would be my first question
Evil Kraig F (8/22/2012)To the vendor's credit we ARE two versions behind and they've stated they never expected the system to handle our volume.
They always give this BS, it's a standard vendor response 😉
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply