Data Files and Partitioning Advice Needed

  • Hi,

    I will have to partition (and re-partition) a lot of tables in our one TB database. I was just wondering if you thought this would be a good idea or not and if you have any recommendations.

    Background Information:

    SQL Server 2008 R2 64-bit Enterprise Edition

    Partitioning by year, month, and week

    Year partitions will be page compressed

    Month and week partitions will be row compressed

    All tables involved in partitioning have more than 20 million records

    I was thinking of placing the year data files on the RAID 50 array and the month/week data files on the RAID 10 (currently, everything is on RAID 10). If I do this for each table, I could end up with about 500 data files. Should I consider just skipping the week partitions to minimize the number of data files for this database? Do you have any recommendations/advice?

    Thank you in advance!

  • Rats, no takers?

  • do you think this is overhead,you are partitioning year wise,month wise and week wise ,do you delete or archive data after 1 year or what

    Regards,

    Syed Jahanzaib Bin Hassan

    MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA

    My Blog

    http://www.aureus-salah.com

    Regards,
    Syed Jahanzaib Bin Hassan
    BSCS | MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA

    My Blog
    www.aureus-salah.com

  • You don't need separate files for each partition. I'd create a separate filegroup for the archive, RAID 50 data and move the partitions there.

    Your decision to compress things is sound, but the actual choice of row v page should be tested, and you should estimate if it's worth it. however you do it, make sure you have some automated routines that move stuff between partitions regularly so you have rolling movement of data. Be sure you factor in the load/slowdowns for this process into your maintenance schedules.

    Moving to RAID 50 for lower costs makes sense as well to me.

  • Thank you very much for your responses.

    Syed:

    I am planning on placing data in an archive filegroup for each large table. Data will not be deleted (at least not for the time being).

    Steve:

    I understand that I don't need to have separate files for each partition, but won't queries (theoretically) execute faster because there's less data in each data file for SQL Server to search? Or will SQL Server achieve faster results because of the partitioning and not the physical layout?

    Here's what I am thinking about (thanks to your help):

    Partition by year, month, and week.

    Year partitioned data (Archive, 2008, 2009, and so on) will reside in one data file on the RAID 50 array.

    Month and Week partitioned data will reside in a second data file on the RAID 10 array.

    Query execution should be faster due to the partitioning scheme/function so long as the partitioning column is used in the queries.

    So the end result will be two physical data files per table.

    Yes, I have been testing the partitioning on a backup server that has the same configurations as the production server and SAN. So far, I have experienced CPU time increasing due to the compression, logical reads decreasing (again, the compression), and query costs favoring the new partitioning scheme/function. I have yet to test out using only two data files.

    If you can be so kind as to give me better clarity in understanding this topic, I would truly appreciate it. 😀

    EDIT: How do I associate many filegroups to one data file?

  • Filegroups are about manageability and recoverability (enterprise has partial database availability). The file stack drivers don't perform better with more files. You aren't getting any parallelism there. Most databases have 1 or two files (1 MDF, 1 NDF) and perform great.

    The physical separation to different spindles gives you improvement with more files, but unless you are going to have 500 different arrays, not sure that separate files/filegroups help you.

    I would, for the sake of managability, look at access patterns. It's possible that you might want to have a filegroup for each year, and bring those online in a DR situation slowly, but you might just want one filegroup for all years and one for months/weeks.

  • Thank you, Steve!

    I have done a few tests with the different scenarios and have come up with interesting (at least to me) results!

    Test 1:

    Nothing was changed. Table is not partitioned and is located on the [PRIMARY] filegroup. The query's results are:

    CPU time = 7865 ms, elapsed time = 337 ms

    Scan count 25, logical reads 81136

    Query Cost (relative to the batch): 36%

    Test 2:

    Partitioned by year, month, week and created 15 data files/filegroups to accommodate them. Had page compression on year partitions and row compression on month/week partitions.

    CPU time = 6458 ms, elapsed time = 2074 ms

    Scan count 33, logical reads 34072

    Query Cost (relative to the batch): 23%

    Test 3:

    Partitioned by year, month, week and created two data files/filegroups to accommodate them. Had page compression on year partition and row compression on month/week partition.

    CPU time = 4679 ms, elapsed time = 735 ms

    Scan count 36, logical reads 77495

    Query Cost (relative to the batch): 41%

    So it looks like that in this case, I will have to go back to my original plan.

    Manageability-wise, I would like to maintain index maintenance on smaller sets of data to minimize the load on the server as well as DR situations. However, I am just concerned about using so many data files.

    Do you happen to know why Test 3 came out as the worst performing (in this case) test and why Test 2 performed the best? I'm just puzzled now...

  • you can break month wise and set partition function on month of 2011 like JAN,FEB,MAR then it will save these month data,in the next year 2011 no need to add more filegroup because JAN data will be same in the same file,you have to add 12 files

    Multiple files and groups will increase when you have multiple DISK not partition ,SAN drives or mutiple core processors

    if you have these resources then add 12 File Group and 24 Files ,2 for each filegroup

    Regards,

    Syed Jahanzaib Bin Hassan

    MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA

    My Blog

    http://www.aureus-salah.com

    Regards,
    Syed Jahanzaib Bin Hassan
    BSCS | MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA

    My Blog
    www.aureus-salah.com

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply