May 22, 2012 at 10:02 am
Hi
We are planning to partition a table based on a column containing year.There will be 3 partitions: one for current year, one for previous year and one for remaining years(containing data which is usually not queried).
Now we plan to keep all partitions in same filegroup on same disk as we don't hae option for additional disk.
My question is, will it boost the performance of my queries considering that most of the queries are fired on current year data.
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
May 22, 2012 at 10:07 am
No. Partitioning is not a performance enhancer, it is to make maintenance easier and more manageable. There is no reason for you to partition this data.
Jared
CE - Microsoft
May 22, 2012 at 10:31 am
S_Kumar_S (5/22/2012)
one for remaining years(containing data which is usually not queried).
Not ideal as merging partitions would be an IO intensive operation, you would be better off keeping the extra partitions.
S_Kumar_S (5/22/2012)
Hiwill it boost the performance of my queries considering that most of the queries are fired on current year data.
Not normally no.
Do you have any intention of archiving this data at any point? If so partitioning would be an ideal solution and enable you to use a sliding window (add a partition, archive another).
May 22, 2012 at 11:06 am
Yes. The concept is called 'partition elimination'. Especially when scanning the table for a date range, the optimizer will realize it only has to scan the current year partition.
May 22, 2012 at 11:39 am
andersg98 (5/22/2012)
Yes. The concept is called 'partition elimination'. Especially when scanning the table for a date range, the optimizer will realize it only has to scan the current year partition.
I find that more often than not, this is not the case. There are so many other factors that can affect the performance that I find partition elimination to be negligible. If there are no other reasons outside of partition elimination to implement partitioning, I wouldn't expect to see a benefit.
Jared
CE - Microsoft
May 22, 2012 at 2:02 pm
andersg98 (5/22/2012)
Yes. The concept is called 'partition elimination'. Especially when scanning the table for a date range, the optimizer will realize it only has to scan the current year partition.
That's valid if you're converting a full table scan into a scan of one or more partitions. However for that to be the case it means that there are no useful indexes for that query and a better approach than partitioning would be to evaluate and add/modify indexes.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 22, 2012 at 2:47 pm
GilaMonster (5/22/2012)
andersg98 (5/22/2012)
Yes. The concept is called 'partition elimination'. Especially when scanning the table for a date range, the optimizer will realize it only has to scan the current year partition.That's valid if you're converting a full table scan into a scan of one or more partitions. However for that to be the case it means that there are no useful indexes for that query and a better approach than partitioning would be to evaluate and add/modify indexes.
I'll second this. The best way to think of partitioning for optimization is as if you were changing the leading edge of an existing index. You would be as well served simply adjusting the index as you would partitioning.
Now, if you were manipulating entire partitions simultaneously or archiving or some such, then it gets you performance boosts.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
May 22, 2012 at 3:13 pm
Partitioning could indirectly help performance, as follows:
-- make it possible/faster to rebuild/reorg current yr's data only;
-- each partition can be separately specified as uncompressed/compressed (SQL 2008+ only).
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 22, 2012 at 3:23 pm
The below link shud be a good place to start
http://msdn.microsoft.com/en-us/library/ms177411%28SQL.105%29.aspx
Partitioning is best for performance when placed on multiple drives with a suitable raid level.
else the best way to save IO is to use compression etc.
May 23, 2012 at 8:39 am
The OP asked if his plan to partition the table could improve performance and I stated a fact about what he was proposing. Was I wrong?
I agree that looking at indexes would provide as good or better performance boost to the queries but that was not the question.
May 23, 2012 at 8:55 am
andersg98 (5/23/2012)
The OP asked if his plan to partition the table could improve performance and I stated a fact about what he was proposing. Was I wrong?I agree that looking at indexes would provide as good or better performance boost to the queries but that was not the question.
Knowing what we know from the OP:
1. 3 partitions
2. All in same filegroup and disk
I would say that it is not a fact that partitioning would increase performance simply by partition elimination. It might, but it us not a fact. Also, being that there are no other reasons other than performance mentioned by the OP to partition, I recommend against it without more information.
Jared
CE - Microsoft
May 23, 2012 at 10:32 am
It will help specially if you can provide the parttitoning columns. Not just this eliminates the partition but it might help in other indexes as well. This will reduce the index height for other indexes( this is based on data) thus using those indexes will be cheaper. But all of this depends how well you write your queries.If all of them somehow eliminates partition then there will be performanc eenhancement. Similarly the index scans would take around 1/3rd IO and cpu as it has to read 1/3rd data based on parttiton eliminatioon.
If you can not use the partition elmination.. Then i guess it depends...
drop table mytable_part
go
drop partition scheme ps_mytable
go
drop partition function pf_mytable
go
create partition function pf_mytable(int)
as range left for values (10,20,30,40)
go
create partition scheme ps_mytable
as partition pf_mytable all to ( [primary])
go
create table mytable_part (id int, mycol char(10) not null default 'mycol',uselesscol char(200) not null default 'uselesscol')
on ps_mytable(id)
go
/*
select * from sys.partition_functions
select * from sys.partition_range_values
select * from sys.partitions
*/
go
drop table mytable_unpart
go
create table mytable_unpart (id int, mycol char(10) not null default 'mycol',uselesscol char(200) not null default 'uselesscol')
go
insert into mytable_unpart with(tablockx) (id,mycol)
select NTILE(50) over( order by cnt),CONVERT(varchar,cnt%200000) from Nums where cnt <= 100000
go
create clustered index mytable_unpart_id on mytable_unpart(id)
go
create nonclustered index mytable_unpart_mycol on mytable_unpart(mycol)
go
--select id,COUNT(*) from mytable_unpart group by id
insert into mytable_part with(tablockx) (id,mycol)
select NTILE(50) over( order by cnt),CONVERT(varchar,cnt%200000) from Nums where cnt <= 100000
go
create clustered index mytable_part_id on mytable_part(id)
go
create nonclustered index mytable_part_mycol on mytable_part(mycol)
go
select * from sys.dm_db_index_physical_stats(db_id(),object_id('dbo.mytable_unpart','U'),null,null,null)
go
select * from sys.dm_db_index_physical_stats(db_id(),object_id('dbo.mytable_part','U'),null,null,null)
go
select MAX(mycol) from mytable_unpart where id <= 10
go
select MAX(mycol) from mytable_part where id <= 10
go
dbcc show_statistics (mytable_part,mytable_part_mycol)
go
select MAX(uselesscol) from mytable_unpart where mycol = CONVERT(char(10),12942)
go
select MAX(uselesscol) from mytable_part where mycol = CONVERT(char(10),12942)
go
select MAX(uselesscol) from mytable_unpart where mycol = CONVERT(char(10),12942) and id<=10
go
select MAX(uselesscol) from mytable_part where mycol = CONVERT(char(10),12942) and id <= 10
go
GulliMeel
Finding top n Worst Performing queries[/url]
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw[/url]
May 23, 2012 at 10:46 am
I believe that partition elimination does not guarantee any performance enhancement. Sure, it "might," but it might not. What if the other partitions are empty? What if the amount of data in each partition is skewed significantly? In this case, what if a report needs to be run (even if only rarely) on last year's data?
When deciding on partitioning, you have to start by asking what is gained. If the only answer is "possible performance gain," to me that is the same as "nothing." If a potential performance gain is a potential bonus to helping the management of the data, then sure.
In this case, we know only 3 things. That the data will be partitioned on year, there will only be 3 partitions with 1 being the current year, and these partitions will all reside on 1 filegroup on 1 disk. Given that information it is certainly not a fact that partitioning will improve performance of anything. It is more likely to hurt performance of those queries that do not properly reference the partitioned table or that reference years other than the current one than it is to increase performance of the current year. Again, with the information given...
Jared
CE - Microsoft
May 23, 2012 at 11:24 am
In this case, what if a report needs to be run (even if only rarely) on last year's data?
I think here partition elimination will sure be used and will help in performance gain if he can provide the parttion key column.
Emty partition and too much skewed data is more of design issue related ot partitioning. Properly designed partitoning will provide you performance definetly.But if you are just thinking that creating 10 partition on a table randomly will provide you performance then i doubt that. But again proper design and planning is key to successs anywhere not just the partitioning.
If you are going to query a particular setof data much more than any other set then surely partition will help.
If we take this given case, partitioning depends on the usual access pattern of data say if the queries are accessing
90% current year data , 5 % prev year data, 3 % 2 years before data and 2% is where he will be using all the data.
I would suggest that go for partitoning specially if you could change your queries so that partition key column is used as sarg.
But in the end the best answer is "It depends". Unless he is more specific with the amount of data for each year,type of queries against this data.What % of queries will use what data etc...
GulliMeel
Finding top n Worst Performing queries[/url]
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw[/url]
May 23, 2012 at 11:29 am
Gullimeel (5/23/2012)
Properly designed partitoning will provide you performance definetly.
Not necessarily. Properly done indexing can provide just about the same benefit. Partitioning may give you somewhere around an additional 1% performance improvement over just indexing, but is that really worth it?
p.s. That's based on testing, not a wild guess
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 15 (of 38 total)
You must be logged in to reply to this topic. Login to reply