March 1, 2009 at 7:31 am
Hi all,
We're using SQL Server 2005 Standard Edition right now. Some of the tables are 200-300 million rows. I'd like to partition one of these tables based on ID (fewer than 100 distinct IDs). The issue is we only have Standard Edition. Any thoughts on how to do so-called 'poor man's partitioning'?
Also, not all of the data in some of the other large tables is 'active' at the moment, but it could become active again in the future. Business rules make a record 'inactive' after a year, but if the customer shows activity, it's made 'active' again. So, I would like to separate the inactive data from the active data. Is this possible somehow in MSSQL 2005 Standard Edition?
I'm looking to do these steps because I'd like to speed up specific queries that need to join between these large tables.
Thanks so much.
March 1, 2009 at 6:45 pm
Can you create a partitioned view?
No idea if this is still available in SQL Server 2005 or how it's supported by Standard Edition.
Note that if this is still on one disk you will probably get worse performance than by using indexes, as a balanced tree is probably better than what you can build manually.
Cursors never.
DTS - only when needed and never to control.
March 1, 2009 at 9:43 pm
How about horizontal partitioning, distributed partitioned views, or federated servers? I don't necessarily recommend any of them, but if you must split things apart...
March 2, 2009 at 3:27 am
I would go with partitioned views. They are very effective for partition elimination in queries and are also great for data management. You'll need to plan your tables carefully and choose your constraints. There are some rules to follow if you want your view to be updatable. Altering the view to achieve the sliding window can easily be scripted; see the sketch below.
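Something along these lines, for example (just a sketch; the table and column names are made up, with one member table per year):

-- Each member table carries a CHECK constraint on the partitioning
-- column so the optimizer can skip tables that can't hold the rows.
CREATE TABLE dbo.Orders2008 (
    OrderID   int      NOT NULL,
    OrderDate datetime NOT NULL
        CHECK (OrderDate >= '20080101' AND OrderDate < '20090101'),
    CONSTRAINT PK_Orders2008 PRIMARY KEY (OrderID, OrderDate)
)
CREATE TABLE dbo.Orders2009 (
    OrderID   int      NOT NULL,
    OrderDate datetime NOT NULL
        CHECK (OrderDate >= '20090101' AND OrderDate < '20100101'),
    CONSTRAINT PK_Orders2009 PRIMARY KEY (OrderID, OrderDate)
)
GO
CREATE VIEW dbo.Orders
AS
SELECT OrderID, OrderDate FROM dbo.Orders2008
UNION ALL
SELECT OrderID, OrderDate FROM dbo.Orders2009
GO

Sliding the window is then just an ALTER VIEW that drops the oldest member table from the UNION ALL and adds the newest one. Note the partitioning column (OrderDate here) is part of each primary key, which is one of the rules for the view to be updatable.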
thanks
SQL_EXPAT
March 2, 2009 at 10:54 am
In addition to the above I would look at putting the tables on separate file groups and possibly separating out the indexes from the data.
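Something like this, for instance (a sketch only; the database name, file path, and size are placeholders):

-- Add a second filegroup and a file to hold the separated data
ALTER DATABASE MyDb ADD FILEGROUP FG_Inactive
GO
ALTER DATABASE MyDb ADD FILE (
    NAME = N'MyDb_Inactive',
    FILENAME = N'D:\Data\MyDb_Inactive.ndf',
    SIZE = 1GB
) TO FILEGROUP FG_Inactive
GO
-- Put the inactive table on the new filegroup
CREATE TABLE dbo.CustomerInactive (
    CustomerID int      NOT NULL PRIMARY KEY,
    LastOrder  datetime NULL
) ON FG_Inactive
GO
-- Nonclustered indexes can live on a different filegroup than the data
CREATE NONCLUSTERED INDEX IX_CustomerInactive_LastOrder
    ON dbo.CustomerInactive (LastOrder) ON [PRIMARY]
GO

On a single disk this mainly buys you manageability; the I/O benefit comes when the filegroups sit on separate physical disks.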
March 2, 2009 at 1:42 pm
Thanks all for the replies. I would be using the partitioned view with tables on the same disk. I've also read that using separate filegroups on the same disk gives no benefit. In your experience, does it actually hurt to have separate tables in separate filegroups on the same disks?
Thanks.
March 2, 2009 at 9:25 pm
It won't be any different from your current performance if all of the data is on the same disk; you might see some benefit from splitting it over multiple disks IF you are accessing the data from both tables. I presume that your queries will be specifying active only for the most part.
If you do go with partitioned views, make sure you apply the appropriate check constraints to whichever field determines if a record is active or not as the optimizer uses these constraints to determine where to look for the data.
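For the active/inactive split that could look something like this (hypothetical table names, assuming an IsActive flag is the partitioning column):

-- WITH CHECK validates existing rows, so the optimizer trusts the
-- constraints and only reads the table that can hold the rows.
ALTER TABLE dbo.CustomerActive WITH CHECK
    ADD CONSTRAINT CK_CustomerActive CHECK (IsActive = 1)
ALTER TABLE dbo.CustomerInactive WITH CHECK
    ADD CONSTRAINT CK_CustomerInactive CHECK (IsActive = 0)
GO

A query against the view with WHERE IsActive = 1 should then touch only CustomerActive.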
March 3, 2009 at 2:38 am
http://weblogs.sqlteam.com/peterl/archive/2007/09/24/Horizontal-partitioning.aspx
N 56°04'39.16"
E 12°55'05.25"
March 10, 2009 at 3:16 pm
Hi All,
I've been playing with a partitioned view I created all day long. After the initial setup, the query execution plans looked great, but actual query time was just about the same as when I wasn't using the partitioned view. This leads me not to trust the estimated query cost in the estimated plan. Does anyone have any insight into why this is?
Thanks!
March 10, 2009 at 3:28 pm
Any response would simply be a guess. We don't have enough information to really tell you anything. We don't know how the data is partitioned and indexed, or what the queries look like that you are running. It would help to have this information as well as the actual execution plans.
March 10, 2009 at 3:51 pm
Lynn,
You are absolutely right, my apologies. What I should have asked was 'Has anyone had a similar experience?' Now, here are the details:
CREATE TABLE t1(
[col1] [int] NOT NULL,
[col2] [datetime] NULL,
[col3] [int] NOT NULL,
CONSTRAINT [PK_t1] PRIMARY KEY CLUSTERED
(
[col1] ASC,
[col3] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[t1] WITH CHECK ADD CONSTRAINT [CK_t1] CHECK (([col1]=(84)))
GO
ALTER TABLE [dbo].[t1] CHECK CONSTRAINT [CK_t1]
GO
CREATE VIEW [dbo].[view] AS
select col1,col2,col3 from t1
UNION ALL
select col1,col2,col3 from t2
UNION ALL
select col1,col2,col3 from t3
Each t(i) table has exactly the same create table statement. The original table I'm getting the data from has the same create table statement as well. The original table, call it OT, has 427,000,000 records while t1 has 446,261 records.
Now, when I show the estimated query plans for
select col3
from OT
where col1 = 84
select col3
from [view]
where col1 = 84
I see the first one has a cost of 68% while the second shows 32%. Yet when I run each query several times (5+), I'm seeing very similar execution times (within a couple of seconds of each other).
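For what it's worth, comparing the actual I/O and CPU rather than the estimated cost percentages looks like this (a sketch using the same names as above):

-- Actual reads and CPU time per statement, instead of estimated plan cost
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO
SELECT col3 FROM OT WHERE col1 = 84
SELECT col3 FROM [view] WHERE col1 = 84
GO
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO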
Thanks.
March 11, 2009 at 8:40 am
What are the constraints on t2 and t3? If you only added a constraint to t1, then it's not doing any kind of partitioning, as the view will need to look in all three base tables for the data when col1 = 84, and in t2 and t3 when col1 <> 84.
March 11, 2009 at 8:46 am
For t2, col1 = 85. For t3, col1 = 86. So the code looks like:
ALTER TABLE [dbo].[t2] WITH CHECK ADD CONSTRAINT [CK_t2] CHECK (([col1]=(85)))
ALTER TABLE [dbo].[t2] CHECK CONSTRAINT [CK_t2]
ALTER TABLE [dbo].[t3] WITH CHECK ADD CONSTRAINT [CK_t3] CHECK (([col1]=(86)))
ALTER TABLE [dbo].[t3] CHECK CONSTRAINT [CK_t3]
March 11, 2009 at 8:52 am
Your primary key must consist of
1) The unique column
2) The column used in the check constraint
As seen in the comments for the link I posted earlier.
N 56°04'39.16"
E 12°55'05.25"
March 11, 2009 at 8:57 am
The primary key is on col1 and col3. The check constraint is on col1, and col3 is unique in t1. Now, a value in col3 in t1 can also exist in col3 of t2. Obviously the col1 value will be different between the two tables, though.
Thanks