Partitioned Table Performance

  • I'm a complete newbie to table partitioning, so this question may be silly.

    I was asked to investigate the possibility of table partitioning by a developer.

    I used a 7 gig table with 7 million rows 7 gig size

    I partitioned on the same disk into 16 files (one for each quarter of the year that the data is in including 6 mostly empty files to allow growth over this year and next).

    The table has one clustered index based on a julian date which is stored as a numeric 18,0.

    I have two copies of the table, one partitioned one not. All files stored on the same disk and the tables are identical in everyway.

    I don't know about inserts and deletes, performance of these is not the primary concern, but select performance on the partitioned table is about 20-30% slower than the unpartitioned.

    Why? Is it because all files are on the same disk? is there no gain without seperating the files?

    thanks in advance

    Gareth

  • Is the date range in your query causing you to go against multiple partitions or just one? I would be curious to know if you are seeing parallelism in the execution plan for the partitioned table select.

    Partitioning is great when used in the right scenarios, but it is not going to be faster in every instance. So, there has to be some trial in order to understand how the engine is going to use your scheme based on your basic data retrieval requirements. Additionally, having all the partitions on the same disk may be causing some of those issues if it is trying to read from multiple partitions and it is fighting for spindle time.

    I would take some very close looks at the query plans to see what the engine is doing with the queries you are running.

    There is a function that will return the partition identifier along with each row so that you can see where the data is coming from. I can't seem to think of what it is right now and I don't have my work laptop up so, I will have to post that later if you would like it. I found that helpful when I first started looking at partitioning.

    David

    @SQLTentmaker

    β€œHe is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • select performance on the partitioned table is about 20-30% slower than the unpartitioned.

    Most probably because of your partitioning strategy in hurting the query performance.

    If your query is not using your partition key to limit the number the partitions to be searched you are certainly adding overhead... remember, a partitioned table is nothing but a collection of tables the RDBMS show as a single object... but it isn't.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Thanks for your replies guys.

    After reading your responses I tried a few more things.

    I've tried queries which will only query one partition they are slower.

    I then wrote queries which will hit multiple partitions they take the same amount of time, partitioned or non partitioned.

    In the where clause I've refered only to the field which is the clustered index and which the table is partitioned over.

    I don't know. Should see some gain.... disappointing.

  • Partitioning is a tool. And like all other tools it is only as useful as your knowledge and skill at implementing/using it. There are too many variables in this situation for us to effectively guide you. If you want high-end performance from a high-end system and high-end features, best best is to get a professional to configure these things for you while mentoring you in their usage/admin.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • One simple question.

    The main advantage of partitioning is the parellel processing you gain from it correct?

    If so, then perhaps I need to make my partitions smaller so that multiple partitions are hit by the query.

    I'm also going to try marking the filegroups read only.

    As for hiring a 'professional' that isn't going to happen without the demonstration of a clear benfit. I have to figure this out for myself

  • Gareth (8/5/2008)


    One simple question.

    The main advantage of partitioning is the parellel processing you gain from it correct?

    If so, then perhaps I need to make my partitions smaller so that multiple partitions are hit by the query.

    I'm also going to try marking the filegroups read only.

    As for hiring a 'professional' that isn't going to happen without the demonstration of a clear benfit. I have to figure this out for myself

    1) No, the primary benefit of partitioning is the management advantages. If you do it correctly you CAN see tremendous performance gains. Those will be even better in SQL 2008, which is much better at partition elimination, etc.

    2) Can you explain why you "have to figure it out for yourself"? Consider tracking exactly how many manhours you spend pursuing your solution (as well as how effective it works). The RIGHT consultant will absolutely provide you with a good ROI for an effort such as this - AND you get the benefit of mentoring from him/her and thus learning to be better at interacting with SQL Server in a much more effective manner than you would going your own route. Just my $0.02 worth. I regularly have clients contact me AFTER wasting hours/days for something I address in minutes/hours. And no, I am not bucking for work here - I am overloaded as it is. 😎 Note that I do applaud your drive to become a better SQL Serverite!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • πŸ˜€ I agree with you. I won't be able to gain additional funding for this project and my time is not billable, so its considered already paid for πŸ™‚

    I know what you're saying and I agree with you unfortunatly the money is not in my hands. Thanks for your response.

  • Gareth (7/31/2008)


    I'm a complete newbie to table partitioning, so this question may be silly.

    I was asked to investigate the possibility of table partitioning by a developer.

    I used a 7 gig table with 7 million rows 7 gig size

    I partitioned on the same disk into 16 files (one for each quarter of the year that the data is in including 6 mostly empty files to allow growth over this year and next).

    The table has one clustered index based on a julian date which is stored as a numeric 18,0.

    I have two copies of the table, one partitioned one not. All files stored on the same disk and the tables are identical in everyway.

    I don't know about inserts and deletes, performance of these is not the primary concern, but select performance on the partitioned table is about 20-30% slower than the unpartitioned.

    Why? Is it because all files are on the same disk? is there no gain without seperating the files?

    thanks in advance

    Gareth

    The problem is... it's not yet a partitioned table... just a bunch of tables screwed together with a loose view. In order to be a partition table, you must have a CHECK constraint that limits the date range (or other range) of each table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It's defined using boundary points which split it over the physical files. I've checked and the data is residing on the appropriate physical files. For the specified boundary points.

    The data is partitioned accross different physical files.

  • There are a huge number of rules for the partitioning column and the partitioned table in Books Online... my recommendation would be to make a checklist of what needs to be done according to BOL and double check your table, indexes, etc.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply