variable partitioning scheme question

  • I want to add a partitioning scheme to a few tables in a VLDB. While I have found lots of great articles, blogs, etc. on the topic, I have yet to find anything that addresses my needs.

    The tables all have a clustered index on a datetime column, but I want the partition ranges based on number of days rather than specific dates, i.e.,

    partition 1: 0-7 days

    partition 2: 8-30 days

    partition 3: 31-90 days

    partition 4: 91-120 days

    partition 5: 121-180 days.

    Is this possible? Thanks!

  • Check out the following:

    http://social.msdn.microsoft.com/Forums/en-US/sqldatawarehousing/thread/9b9f5812-400e-42aa-95b6-eacc03381272

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Read the post but don't see how it applies.

    The persisted column used in the post is based on a deterministic value, whereas a number of days range from today is not.

  • I believe that you want to partition on a computed column which is based on your clustered index.

    So if you create your computed column, update it, and create a partition based on the computed column, I do not see why that would not work for you?

    Regards.

  • No, not directly. You'd have to run ALTER PARTITION statements on a regular basis to move the boundaries. Considering your desired partition scheme, I'd probably do it on weekends.

    Check out these urls:

    http://stackoverflow.com/questions/5140566/partition-table-based-on-variable-value-sql-server-2008

    And a lead in from there to a blog post on partitioning, specifically this quote:

    http://shannonlowder.com/2010/08/partitioning/

    Every partition function you will define requires a name, a data type, and at least one boundary point. I'd like to point out one thing: you can't define a partition function on a text, ntext, image, varbinary(max), timestamp, xml, varchar(max), or user-defined datatypes. Oh, and if you use a computed column, that column will need to be persisted in order to partition on it. Since the computed column has to be persisted, that means it has to be deterministic too (no variable values allowed).

    GETDATE() counts as a variable value and is non-deterministic.
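
    A minimal sketch of that restriction, assuming a throwaway table dbo.AgeBucketDemo with a datetime column EventDate (both names are made up for illustration, and GO is the SSMS/sqlcmd batch separator). The first ALTER should succeed, because a non-persisted computed column may reference GETDATE(). The second is expected to be rejected, because the expression is non-deterministic and so cannot be persisted, which in turn rules it out as a partitioning column.

    ```sql
    -- Hypothetical demo table; all names here are illustrative only.
    CREATE TABLE dbo.AgeBucketDemo
    (
        EventDate datetime NOT NULL
    );
    GO

    -- Allowed: a plain (non-persisted) computed column is evaluated at query time.
    ALTER TABLE dbo.AgeBucketDemo
    ADD AgeBucket AS
        (CASE
            WHEN DATEDIFF(day, EventDate, GETDATE()) BETWEEN 0 AND 7  THEN 1
            WHEN DATEDIFF(day, EventDate, GETDATE()) BETWEEN 8 AND 30 THEN 2
            ELSE 3
         END);
    GO

    -- Expected to fail: GETDATE() makes the expression non-deterministic,
    -- so SQL Server refuses to persist it.
    ALTER TABLE dbo.AgeBucketDemo
    ADD AgeBucketPersisted AS
        (CASE
            WHEN DATEDIFF(day, EventDate, GETDATE()) BETWEEN 0 AND 7 THEN 1
            ELSE 2
         END) PERSISTED;
    GO
    ```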


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Thanks for the help, but I would have to be updating hundreds of millions of rows each day (or even once a week) - ugly!

  • duncfair (5/5/2011)


    Thanks for the help, but I would have to be updating hundreds of millions of rows each day (or even once a week) - ugly!

    The way you were looking at doing it meant second-by-second shifts of the existing data, which is even worse. 🙂 (EDIT: With GETDATE() that is)

    Another option is to simply use shifting partitions: break up the older data once, then add a new partition week by week. I'll have to find some sample code if you want me to show you how to do it; it's been a while since I really played with that.


  • Not second by second, a computed column that bucketed rows with a case statement like:

    when datediff(d, datecolumn, getdate()) between 0 and 7 then 1

    when datediff(d, datecolumn, getdate()) between 8 and 14 then 2

    when datediff(d, datecolumn, getdate()) between 15 and 60 then 3, etc.

    Still, shifting partitions seems to be the only way to go.

    Thanks!

  • duncfair (5/5/2011)


    Still, shifting partitions seems to be the only way to go.

    I'm still re-wrapping my head around partitions atm, but this here looks like a good way to deal with sliding partitions:

    http://www.sqlskills.com/resources/whitepapers/partitioning%20in%20sql%20server%202005%20beta%20ii.htm#_Toc79339965

    I can't speak for the accuracy/ease of use until I can de-frag my brain. 🙂


  • duncfair (5/5/2011)


    Not second by second, a computed column that bucketed rows with a case statement like:

    when datediff(d, datecolumn, getdate()) between 0 and 7 then 1

    when datediff(d, datecolumn, getdate()) between 8 and 14 then 2

    when datediff(d, datecolumn, getdate()) between 15 and 60 then 3, etc.

    Still, shifting partitions seems to be the only way to go.

    That's right, as Craig says, there's no way to do exactly what you are after here automatically. That's not just SQL Server being awkward: there's no mechanism to move rows between partitions based on time passing. Any such mechanism would have to 'watch the clock' and magically spring into life and move rows around. If you think about it, the practicalities of that would make it a performance nightmare.

    But it's not all bad news. The question to ask is: why do you want to do it this way? If it is to optimize performance (a bit) or make data management easier, what is stopping you adopting a more traditional (sliding window) solution? Partition the table by date (not time), so that queries over any date range hit only the minimum number of single-date partitions. In all the scenarios I can think of, this partitioning scheme gives you all the benefits you are after, while keeping maintenance and overhead to a minimum. For older data, you can use larger 'archive' partitions, perhaps stored on slower (= cheaper) storage, assuming your users are much less likely to query old data.

    You can read more about sliding window partitioning in these links:

    Implementing Sliding Window Partitioning

    SQL Server 2005 Partitioning - Kimberly Tripp (PDF)

    The Data Loading Performance Guide

    Partitioned Table and Index Strategies Using SQL Server 2008 (Word document)

    edit: fixed link
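
    A rough sketch of the date-based sliding window described above, not a production script. Every object name (pfDaily, psDaily, dbo.EventLog, EventDate, Payload) and every boundary date is an assumption chosen for illustration, and everything is placed on [PRIMARY] only to keep the example short.

    ```sql
    -- One boundary per calendar day. RANGE RIGHT means each boundary value
    -- is the first value of the partition to its right.
    CREATE PARTITION FUNCTION pfDaily (datetime)
    AS RANGE RIGHT FOR VALUES ('20110501', '20110502', '20110503');
    GO

    -- Map every partition to one filegroup (a real system would likely
    -- spread partitions across several filegroups).
    CREATE PARTITION SCHEME psDaily
    AS PARTITION pfDaily ALL TO ([PRIMARY]);
    GO

    CREATE TABLE dbo.EventLog
    (
        EventDate datetime     NOT NULL,
        Payload   varchar(100) NULL
    );
    GO

    -- Building the clustered index on the scheme partitions the table by date.
    CREATE CLUSTERED INDEX cix_EventLog_EventDate
        ON dbo.EventLog (EventDate)
        ON psDaily (EventDate);
    GO

    -- Daily slide, run from a scheduled job:
    -- 1) name the filegroup that will hold the next new partition,
    -- 2) add a boundary for the new day,
    -- 3) remove the oldest boundary.
    ALTER PARTITION SCHEME psDaily NEXT USED [PRIMARY];
    ALTER PARTITION FUNCTION pfDaily() SPLIT RANGE ('20110504');
    ALTER PARTITION FUNCTION pfDaily() MERGE RANGE ('20110501');
    GO
    ```

    Only the boundaries move; no row is updated as time passes. Note that merging a boundary between partitions that still contain rows causes data movement, so the oldest partition is normally switched out or archived before the merge.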

  • ^ Listen to this man, he's one of your best bets for partitioning help 'round these parts! :w00t:


  • We've had to do the same thing here (the date ranges were a bit different, though). We ended up doing what was suggested above: a partition a day, with jobs taking care of the archiving and the sliding.

    The application needed to be able to query specific ranges and the dev didn't want to change it, so we just used an aligned view that we recreated each morning. That way the application could query the view and not have to worry about the underlying partitions.

    Older partitions were moved to a bigger bucket with fewer indexes, as they were less used, and after a few years they were dropped.
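
    One possible reading of "a view recreated each morning", sketched under assumptions: the post does not show the actual view, so this is only a guess at the idea. It assumes a table dbo.EventLog (EventDate datetime, Payload varchar(100)) already exists, uses the date type (SQL Server 2008 or later), and the view name dbo.EventLog_Last7Days is hypothetical. The view is rebuilt with literal date boundaries, so the application always queries the same name while the window moves underneath it.

    ```sql
    -- Compute today's window as yyyymmdd strings (style 112).
    DECLARE @today date    = CAST(GETDATE() AS date);
    DECLARE @from  char(8) = CONVERT(char(8), DATEADD(day, -7, @today), 112);
    DECLARE @to    char(8) = CONVERT(char(8), DATEADD(day, 1, @today), 112);

    -- Drop yesterday's version of the view, if there is one.
    IF OBJECT_ID(N'dbo.EventLog_Last7Days', N'V') IS NOT NULL
        DROP VIEW dbo.EventLog_Last7Days;

    -- CREATE VIEW must be the only statement in its batch, so it is run
    -- through dynamic SQL with the boundaries baked in as literals.
    DECLARE @sql nvarchar(max) =
        N'CREATE VIEW dbo.EventLog_Last7Days AS
          SELECT EventDate, Payload
          FROM dbo.EventLog
          WHERE EventDate >= ''' + @from + N'''
            AND EventDate <  ''' + @to + N''';';

    EXEC sys.sp_executesql @sql;
    ```

    Scheduling this after the morning partition slide would keep the view lined up with the daily partitions; similar views could cover the 8-30 day and older ranges.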

  • Oliiii (5/6/2011)


    The application needed to be able to query specific ranges and the dev didn't want to change it, so we just used an aligned view that we recreated each morning. That way the application could query the view and not have to worry about the underlying partitions.

    Excellent suggestion!
