May 5, 2011 at 9:24 am
I want to add a partitioning scheme to a a few tables in a vldb and while I have found lots of great articles, blogs, etc on the topic, I have yet to find anything that addresses my needs.
The tables all have a clustered index on a datetime column, but I want the partitition range based on numbers of days rather than specific dates, i.e.,
partition 1: 0-7 days
partition 2: 8-30 days
partition 3: 31-90 days
partition 4: 91-120 days
partition 5: 120-180 days.
Is this possible? Thanks!
May 5, 2011 at 11:16 am
Check out the following:
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
May 5, 2011 at 4:00 pm
Read the post but don't see how it applies.
The persisted column used in the post is based on a deterministic value, whereas a number of days range from today is not.
May 5, 2011 at 4:14 pm
I believe that you want to partition on a computed column which is based on your clustered Index.
So if you create you computed column, update it and create a partition based on the computed column I do not see why that would work for you?
Regards.
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
May 5, 2011 at 4:15 pm
No, not directly. You'd have to run Alter Partitions on a regular basis. Considering your desired partition scheme, I'd probably do it on weekends.
Check out these urls:
http://stackoverflow.com/questions/5140566/partition-table-based-on-variable-value-sql-server-2008
And a lead in from there to a blog post on partitioning, specifically this quote:
http://shannonlowder.com/2010/08/partitioning/
Every partition function you will define requires a name, a data type, and at least one boundary point. Iβd like to point out one thing: you canβt define a partition function on a text, ntext, image, varbinary(max), timestamp, xml, varchar(max), or user-defined datatypes. Oh, and if you use a computed column, that column will need to be persisted in order to partition on it. Since the computed column has to be persisted, that means it has to be deterministic too (no variable values allowed).
GETDATE() counts as a variable value and is non-deterministic.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
May 5, 2011 at 4:18 pm
Thanks for the help, but I would have to be updating hundreds of millions of rows each day (or even once a week) - ugly!
May 5, 2011 at 4:20 pm
duncfair (5/5/2011)
Thanks for the help, but I would have to be updating hundreds of millions of rows each day (or even once a week) - ugly!
The way you were looking at doing it was second by second shifts of the existing data, even worse. π (EDIT: With GETDATE() that is)
Another option is to simply do shifting partitions. Simply start going to week to week partition inclusions, after breaking up the older data. I'll have to find some sample code if you want me to show you how to do it, been a while since I really played with that.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
May 5, 2011 at 4:36 pm
Not second by second, a computed column that bucketed rows with a case statement like:
when datediff(d, datecolumn, getdate()) between 0 and 7 then 1
when datediff(d, datecolumn, getdate()) between 8 and 14 then 2
when datediff(d, datecolumn, getdate()) between 15 and 60 then 3, etc.
Still, shifting partitions seems to be the only way to go.
Thanks!
May 5, 2011 at 4:57 pm
duncfair (5/5/2011)
Still, shifting partitions seems to be the only way to go.
I'm still re-wrapping my head around partitions atm, but this here looks like a good way to deal with sliding partitions:
I can't speak for the accuracy/ease of use until I can de-frag my brain. π
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
May 5, 2011 at 5:10 pm
duncfair (5/5/2011)
Not second by second, a computed column that bucketed rows with a case statement like:when datediff(d, datecolumn, getdate()) between 0 and 7 then 1
when datediff(d, datecolumn, getdate()) between 8 and 14 then 2
when datediff(d, datecolumn, getdate()) between 15 and 60 then 3, etc.
Still, shifting partitions seems to be the only way to go.
That's right, as Craig says, there's no way to do exactly what you are after here automatically. That's not just SQL Server being awkward: there's no mechanism to move rows between partitions based on time passing. Any such mechanism would have to 'watch the clock' and magically spring into life and move rows around. If you think about it, the practicalities of that would make it a performance nightmare.
But it's not all bad news. The question to ask is: why do you want to do it this way? If it is to optimize performance (a bit) or make data management easier, what is stopping you adopting a more traditional (sliding window) solution? Partition the table by date (not time). So queries over any date range will just hit the minimum number of single-date partitions. In all the scenarios I can think of, this partitioning scheme gives you all the benefits you are after, while keeping maintenance and overhead to a minimum. For older data, you can use larger 'archive' partitions - perhaps stored on slower (= cheaper) storage - assuming your users are much less likely to query old data.
You can read more about sliding window partitioning in these links:
Implementing Sliding Window Partitioning
SQL Server 2005 Partitioning - Kimberly Tripp (PDF)
The Data Loading Performance Guide
Partitioned Table and Index Strategies Using SQL Server 2008 (Word document)
edit: fixed link
May 5, 2011 at 5:15 pm
^ Listen to this man, he's one of your best bets for partitioning help 'round these parts! :w00t:
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
May 6, 2011 at 1:43 am
We've had to do the same thing here (date ranges were a bit different though), we ended up doing what was suggested above, a partition a day with jobs taking care of the archiving and the sliding.
The application needed to be able to query specific ranges and the dev didn't want to change it so we just used an aligned view we recreated each morning, that way the application could query the view and not have to worry about the underlying partitions.
Older partitions were moved to bigger bucket with less indexes as they were less used and after a few years they were dropped.
May 6, 2011 at 9:12 am
Oliiii (5/6/2011)
The application needed to be able to query specific ranges and the dev didn't want to change it so we just used an aligned view we recreated each morning, that way the application could query the view and not have to worry about the underlying partitions.
Excellent suggestion!
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply