Database File Size Even Distribution

  • Hi guys,

    i have a database 400 GB in size but its spread across 7 different data files in 3 seperate volumes. I have just installed a new SAN shelf with terabyte capacity and I want to migrate this database to a new single volume ( with RAID 5). I have been told that you should try and make sure data files (.mdf, .ndf's etc) are all pretty similar in size for best performance. Is there any truth in this? My current data file sizes vary extremely. One is 235 GB , one is 60 GB, another 30 GB etc. Is it possible to redefine the structure of these files and maybe divide the 400 GB database into 4 equal files of 100 GB as oppose to having 7 different size files?

    Any help would be greatly appreciated.

    RM

  • Hello Richard,

    They usually say that having data files on different physical volumes improve performance, so it may be better to have files on 3 volumes, but I did not work with SANs. Usually they recommend to put tables and their frequently used indexes on separate volumes, tables that are frequently joined  on separate volumes, page file on its own volume, transaction log on its own volume, archive big tables by dividing them horisontally so older data that are not used on the regular basis are moved to separate tables.

    Yelena

    Regards,Yelena Varsha

  • You've been told wrong.  The size of the data files is not nearly as important as how often they are accessed, and how.  SAN's are not magic, I/O performance generally comes down to several factors.  The speed of the individual disks, and how many disks the workload is spread across.  There are many aspects to each of those, but that is about it. 

    Placing all of your files on a single raid group in a SAN doesn't sound like such a hot idea to me...  In addition, if you can't spread the files across multiple disks, there is really little point of having multiple data files.

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Database files on a SAN. The benefits of the SAN are the redundancy, the fiber-channel architecture, the ability to use many disk spindles thus distrinbuting I/O more effieciently and the cache. For an initial SAN configuration one creates LUNs -  logical unit of space sliced from a RAID array. To further make things more efficient you can create Meta-LUNs - a logical unit of space sliced from many LUNs. There are optimal blocksizes that can be chosen as well. When it comes to blocksizes remember that SQL reads extents (64kb) and writes pages (8kb).

    So basics aside this is what you want to do. Database data (mdf/ndf) needs to reside on it's own Meta-Lun. In your case, after some monitiring you may want multiple data files distributed among a few Meta-LUNs. Now the transaction log (ldf) should be on it's own Meta-LUN. Finally your SQL backups on their own Meta-LUN. Additionally, your SAN may have a combination of fast disks (SCSI III - 10k & 15k rpm) and slow disks (Serial ATA 10k rpm). If you do have choices of disks by speed (usually an entire shelf of disks are identical), you want the fastest disks to be made into LUNs/Meta-LUNs for transaction logs (SCSI III 15k rpm). The second fastest disks for database data LUNs/Meta-LUNs (SCSI III 10k rpm). Then use the slowest disks for database backups LUNs/Meta-LUNs (Serial ATA 10k rpm).

    We presently have about 15 Tb of SAN space used by our various database servers (22 MS/SQL, 5 Oracle, 2 UDB2) and no performance issues !

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • Rudy,

    Do I understand correctly that when working with SANs  we should ask network Administrators (SAN admin) to create several META-LUNS  on SAN for the databases and then we will treat each Meta-LUN similar to an individual volume when working on the regular server with all rules applicable to Meta-LUN instead of physical volume. I mean for example putting transaction logs on its own Meta-Lun versus putting then on a separate volume when working on the regular server.

    Yelena

    Regards,Yelena Varsha

  • Yes. Exactly.

    G: drive --> Meta-LUN for database data, striped across 3 LUNs

    H: drive --> Meta-LUN for transaction logs. striped across 5 LUNs

    I: drive --> Meta-LUN for database backups, striped across 2 LUNs

     

    However, your SAMN Admin might not be too keen about using Meta-LUNs because it may cause him to reorganize the SAN (poor housekeeping is common where Meta-LUNs are not used).

     

    I almost forgot one of the most important aspects of SAN usage on the local server. You need dual HBA (host bus adaptors, we use Q-Logic) and dual fiber channel runs to your SAN storage switch (usually Brocade) from the server.

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • Yes and no...  This is generally the right idea, but it won't do much good if your various meta-luns are all ultimately created from the same raid arrays (which is a situation that I have seen several times.)

    Like I said, there's nothing magical about SAN's.  I/O performance still comes down to the performance of the disks themselves and how many disks the workload is spread across.  Meta-luns are simply a way of spreading the workload across more disks, but only if they are used, and used correctly. 

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Thanks to both of you. I will store these notes.

    Yelena 

    Regards,Yelena Varsha

  • Can any of you suggest how I can go about redistributing the existing data from one file evenly across multiple files?  We have a 30 gig plus file that we're trying to break up across 4 files...the 3 new files have next to no data in them, the original file has over 30 gigs in it. 

    I swear I saw a dbcc command, a stored proc, or an ALTER database command that did this automatically...

    Anyone know what I'm talking about?  Thanks.

  • BOL has an article that describes how to do this using Enterprise Manager, but not T-SQL.  Look for "switching filegroup" and you should find it.  I did this a while ago and ran a trace to see what was going on behind the scenes.  Basically EM creates a new table on the desired file/filegroup copies the data over drops the old table and renames the new one.

    I think the command you are remembering is the dbcc shrinkfile command with the EMPTYFILE option which allows you to move all data off of one file and into another, but doesn't support just moving one (or certain) table(s).

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Thank you for the prompt response!  I think you're talking about using filegroups, but I'm talking about using multiple data files, not groups at all.

    Anyone have an idea on how to redistribute data evenly across all data files for a database?

  • You don't have the ability to specify which files in a given filegroup your table will be allocated to.  If you have multiple files in a single filegroup SQL Server automatically determines where to allocate any new extents required.  SQL does a proportional fill so that the files that have the most free space will be filled at a greater rate than files that are more full, but SQL will not spread existing data pages evenly across any new files that you create.

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Thanks again.  Yes, I realize that.  I'm not trying to manipulate just one table/index,etc.  I want to balance out the entire database's data files.  You are correct that new data going in just "stripes" so to speak across all available files in the file groups (just PRIMARY here).  That's good for new data, but what about existing lop-sided data?

    What I need to know is what the procedure is to redistribute the existing data evenly.

     

    Thanks.

  • There isn't one.  About the only solution to your problem is to create a new database with the appropriate files and move the data across.  The allocation would then be more even.

    You could also create new tables within the same database and move the data to them, but then when you drop the old tables it would likely cause the files to be out of balance again...

    But what's the problem with uneven files?  I don't know why it should be a concern to you.  If the concern is to spread the IO out across spindles, then you really need to use filegroups and specify where the tables/indexes should go.

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Analysis on which table/indexes should be separated has not yet been done.  Latch contention is high since this is a highly concurrent application (hundreds of concurrent queries).  Multiple datafiles should help with latch contention.

    Thanks for the input.  If anyone else knows of a method, please let me know!

    Thanks.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply