RAID Questions

  • I just received our new SQL server hardware:

    DELL 6600

    Quad 1.4 XEON

    4 GIG ram

    4x18gig (15k rpm) drives

    8x69gig (10k rpm) drives

    We currently are on:

    Digital 7100

    Quad 200mhz Pentium Pro

    1 gig ram

    1x36 for os/apps/ftp data files/tempdb

    4x36 RAID1+0 (Mylex controller) for data and logs

    The largest db is 63-65 gig (60 gig with 5% free space) with 256 user tables.

    Usage:

    This database is mainly a data warehouse - we get daily/weekly/monthly raw data files from our mainframes. The data is imported daily/weekly/monthly then MAINLY used for reporting and powering web apps.

    When we get the new raw data we do either of the following processes:

    1. Drop the db table -> create a new table -> import the new data -> build indexes

    This is done for the raw files we get that are FULL extracts of the mainframe files

    2. Drop a daily table -> create a daily table -> import the new data to the daily table -> match keys between daily table and master table if in both delete record from master -> import all data from daily table into the master table

    This process is done on the larger mainframe files where a full dump of them would be over 150,000 records - when we do this the raw data files we get daily contains all the new/updated data in the last 5 days (covers holidays)

    These files are basically about 10% of the current master table size.

    These imports run mainly between 5:30AM and 9:00AM with users starting to use the system around 7:30AM. I am hoping I can find a set-up to help improve the amount of time needed for the imports to run.

    So what would be the best drive set-up:

    Channel 0: 2x18 RAID1 OS and apps

    Channel 0: 2x18 RAID0 tempdb and storage of raw text files used for the imports

    Channel 1: 8x69 RAID5 data+logs

    or

    Channel 1: 4x69 RAID1 logs (do I need that much space for logs??)

    Channel 1: 4x69 RAID5 data

    or

    Channel 1: 4x69 RAID5 logs

    Channel 1: 4x69 RAID5 data

    then span to RAID50

    or

    etc etc

    Thanks,

    Randy

    Edited by - notrub on 01/10/2003 04:55:18 AM

  • What's the size of your log after the 5:30AM and 9:00AM loads, but before the backup hits? That should answer your question on how much space you need.

    At first clance, this is how'd I configure it.

    Channel 0: 2x18 RAID1 OS and apps

    Channel 0: 2x18 RAID0 tempdb and storage of raw text files used for the imports

    Channel 1: 2x69 Raid0 logs

    Channel 1: 6x69 RAID5 data

    John Zacharkan


    John Zacharkan

  • NO RAID 0. Not worth it on a production server. If one drive goes, your server is down. The tempdb should be on RAID 1 or 5 or 10, same for db logs.

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

    http://www.dkranch.net

  • Here is what I got going right now for a test on the server:

    2x18 as RAID1

    2x18 as RAID1

    These are spanned so it is really RAID10

    This is for the os/apps/raw data files

    2x69 as RAID1 for tempdb and ldf files

    5x69 as RAID5 for data files

    1x69 as Global Hot Swap

    Is there any issue with having the tempdb and ldf files on the same set of raid drives??

    Thanks,

    Randy

  • From the security point of view you should have your log files on separate physical disks.

    From the performance pont of view the best is to separate the system databases (master, model, msdb), tempdb, log and data.

    If you have more physical disks you could separate from the data the (nonclustered) indexes as well.

    Even better if you know exactly the usage of your tables you could separate the most used tables on separate disks among with the least used one.

    The basic idea is to use paralell as many spindle as you can



    Bye
    Gabor

  • Steves' absolutely right, not sure where my head was before.

    I like Raid 1 on my logs and backups. But I only recommend this if it's Hardware implemented. Don't use Raid 1 if it's controlled through software. A system software implemented raid 1 setup will eat at CPU/Server and possibly degrading throughput at peak times. Here's a quick break down of Steve's picks, all are good just a matter of $$$ and needs.

    Advantage of 1

    Twice the Read transaction rate of single disks, same Write transaction rate as single disks. 100% redundancy of data means no rebuild is necessary, just a copy to the replacement disk. Transfer rate per block is equal to that of a single disk. Simplest RAID storage subsystem design

    Disadvantages of 1

    Highest disk overhead of all RAID types (100%) - inefficient

    Advantages of 5

    High read tran rate and a medium write tran rate

    Low ratio of ECC (Parity) disks to data disks means high efficiency. It has an excellent aggregate transfer rate.

    Disadvantage of 5

    Very complex controller design. Difficult to rebuild in the event of a disk failure (as compared to RAID level 1). Individual block data transfer rate same as single disk.

    Advantages of Raid 10

    Identical to 1, but with a higher IO performance gain.

    Disadvantages of Raid 10

    Expensive / High overhead, limited scalability.

    but at a much higher cost. Very limited in scalability.

    John Zacharkan


    John Zacharkan

  • You get no bennifits from seperate drives on the same RAID controller and same channel because access for the controller will switch between each RAID on the same channel and thus writes are asynchronous.

    Also, keep in mind the page file will be constantly writting data to the hard drive as well so you do not want to mix data or log file on the drive or controller with it.

    Now if you can afford, RAID 10 for the data/log drive is better as it will have the fewest actual writes and reads to get the data on the drive and off and actions within the same array occurr synchronously across all drives and mirroed to the other set.

    RAID 5 is cheaper thou in that you lose the size of one drive not half the drives. On the DELLs rebuilding a RAID5 drive is a snap (have done twice and we use the PowerEdge servers). The other difference is the way data is written, it will take far more writes to put the data on the drive as the data is written (synchronously) the parrity has to be calculated and the parity written.

    If you have more than one controller and each has more than one channel things change and you plans as suggested by Zac (with RAID1 in place of RAID0 for tempdb and other files) is best.

    But you will not get and throughput bennifit with tweo arrays on the same channel on the same controller.

  • Based on the SQL Performance Document I figured there would/could be a channel conflict but the people who do our buying did not want to listen to me. And just said - its better than what you have -

    I wish I had more channels but I don't so I am looking for the best set-up for what we have -

    one controller

    4x18g

    8x69g

    today I had to fight with the configuration guy to not RAID 0 two of the 18s plus not use all 8 as RAID 5 with no spare since they did not buy any spare drives to have sitting around

    Currently the tempdb and the ldf file of the 60gig DB are on the same 10gig partition of a single 36g drive and I have not had any space issues when the import of new data is running

    The note is the raw data files total 16gig of space - they get backed up to tape each night then overwritten the next day - so I need to have a large partition to store these files and right now I have placed these on a seperate array on a seperate channel so these can be read seperate then the db writes, good idea or not??

  • I think I would consider doing

    RAID5 4x18GB with OS/SWAP/Raw Data

    If not a lot of writes then

    RAID5 8x69 GB Drives Data/Logs (tempdb also)

    if high number of writes and you can afford the lost space

    RAID10 4xGB drives (4xGB mirrored set) same as above.

    If you choose to keep a drive out for safety (I would try to purchase, price isn't that bad), then RAID5 or keep 2 out if RAID10.

    If you config guy gives you flak about RAID5 instead of 2 RAID0 goto your boss with the explination of RAID0 and that this configuration means lose of a drive means server down and if is OS drive then complete rebuild is neccessary.

    Sometimes you have to play the political game. Do make it a written document so that if nothing is done and RAID 0 is implemented you have something to cover your @$$ if it fails.

    Edited by - antares686 on 01/10/2003 6:46:58 PM

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply