January 24, 2012 at 6:59 pm
....
Hello gurus,
I need your comments on the table partitioning we implemented in our environment.
we have a pk column (INT), Partitioning column(Datetime), fkcolumn (INT), few other varchar columns
We created a clustered index on a Pk and partitioning column. Reason: our partitioning column is updated freequently. so inorder to avoid page splits, we created a clustered index on pk and partitioning column. please comment on this approach. Also, let me know if i have any misconception here.
When I try to alteer table to create a non clustered primary key on a fkcolumn it is complaining that partitioning column should be part of the non clustered pk. As per my understanding, every non-clustered index will and should have the clustered index column/s as part of the non clustered index. If that understanding is correct, then, it should not complain for the partitioning column not being part of the non clustered pk.
Please clarify my misconception.
Thanks
Jagan K
January 24, 2012 at 7:19 pm
jvkondapalli (1/24/2012)
....Hello gurus,
I need your comments on the table partitioning we implemented in our environment.
we have a pk column (INT), Partitioning column(Datetime), fkcolumn (INT), few other varchar columns
We created a clustered index on a Pk and partitioning column. Reason: our partitioning column is updated freequently. so inorder to avoid page splits, we created a clustered index on pk and partitioning column. please comment on this approach. Also, let me know if i have any misconception here.
When I try to alteer table to create a non clustered primary key on a fkcolumn it is complaining that partitioning column should be part of the non clustered pk. As per my understanding, every non-clustered index will and should have the clustered index column/s as part of the non clustered index. If that understanding is correct, then, it should not complain for the partitioning column not being part of the non clustered pk.
Please clarify my misconception.
jvkondapalli, what is the relationship between your PK column (presumably an ID) and your partitioning column?
Do you expect data from the past to show up later? (Meaning, a date say one month back being assigned a new ID as the current top ID +1?)
January 24, 2012 at 8:28 pm
Revenant,
....
Hello gurus,
I need your comments on the table partitioning we implemented in our environment.
we have a pk column (INT), Partitioning column(Datetime), fkcolumn (INT), few other varchar columns
We created a clustered index on a Pk and partitioning column. Reason: our partitioning column is updated freequently. so inorder to avoid page splits, we created a clustered index on pk and partitioning column. please comment on this approach. Also, let me know if i have any misconception here.
When I try to alteer table to create a non clustered primary key on a fkcolumn it is complaining that partitioning column should be part of the non clustered pk. As per my understanding, every non-clustered index will and should have the clustered index column/s as part of the non clustered index. If that understanding is correct, then, it should not complain for the partitioning column not being part of the non clustered pk.
Please clarify my misconception.
jvkondapalli, what is the relationship between your PK column (presumably an ID) and your partitioning column?
Do you expect data from the past to show up later? (Meaning, a date say one month back being assigned a new ID as the current top ID +1?)
1. PK is a running number and partitioning column will get updated when that record is processed. Practically, in our system, that same record could
be processed a month later. This is the reason why we are having 3 months of data. on the 4th month, we will switch out the oldest parttion and
merge it.
2. I dont want that data to show up with the new ID. the running number should be same.
Thanks
Jagan K
January 25, 2012 at 3:11 pm
jvkondapalli (1/24/2012)
Revenant,....
Hello gurus,
I need your comments on the table partitioning we implemented in our environment.
we have a pk column (INT), Partitioning column(Datetime), fkcolumn (INT), few other varchar columns
We created a clustered index on a Pk and partitioning column. Reason: our partitioning column is updated freequently. so inorder to avoid page splits, we created a clustered index on pk and partitioning column. please comment on this approach. Also, let me know if i have any misconception here.
When I try to alteer table to create a non clustered primary key on a fkcolumn it is complaining that partitioning column should be part of the non clustered pk. As per my understanding, every non-clustered index will and should have the clustered index column/s as part of the non clustered index. If that understanding is correct, then, it should not complain for the partitioning column not being part of the non clustered pk.
Please clarify my misconception.
jvkondapalli, what is the relationship between your PK column (presumably an ID) and your partitioning column?
Do you expect data from the past to show up later? (Meaning, a date say one month back being assigned a new ID as the current top ID +1?)
1. PK is a running number and partitioning column will get updated when that record is processed. Practically, in our system, that same record could
be processed a month later. This is the reason why we are having 3 months of data. on the 4th month, we will switch out the oldest parttion and
merge it.
2. I dont want that data to show up with the new ID. the running number should be same.
Any thoughts from the experts? your valuable suggestions/comments are deeply appreciated.
Thanks
Jagan K
Thanks
Jagan K
January 25, 2012 at 3:17 pm
I think the first question is "does having the primary key and clustered index on an identity column and the partitioned column of datetime eliminate page splits when new data is entered?"
From my understanding, I don't see how this prevents page splits. (Feel free to jump in gurus) In fact, I think with frequent updates on the partitioned column it will create MORE page splits. I see no need to have this column as part of the key as it does not describe any uniqueness at this point of my understanding.
Jared
CE - Microsoft
January 25, 2012 at 3:36 pm
Per my understanding, partitioning column should either be part of the clustered index or the Primary key and it should not be null. Else, it will reside left most partition.
Again, I might be wrong, please comment.
Thanks
Jagan K
January 25, 2012 at 3:45 pm
The problem is you are saying that that column gets updated frequently. So every time you update that value, it has to move the data. That may be to another page and even another file. I would hesitate partitioning on a column that gets frequent updates.
Jared
CE - Microsoft
January 25, 2012 at 6:49 pm
This script demonstrates some concepts to answer some of your questions:
USE tempdb
GO
CREATE PARTITION FUNCTION PF(date) AS
RANGE RIGHT FOR VALUES ('2012-02-01', '2012-03-01', '2012-04-01');
GO
CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY]);
GO
-- Partitioned heap
CREATE TABLE dbo.Heap
(
pk integer IDENTITY NOT NULL,
the_date date NOT NULL,
fk integer NULL,
data varchar(50) NULL
) ON PS (the_date);
GO
-- Sample data
INSERT dbo.Heap
(the_date, fk, data)
VALUES
('2012-01-04', 1001, 'a'),
('2012-01-08', 1002, 'b'),
('2012-02-21', 1003, 'c'),
('2012-03-07', 1004, 'd'),
('2012-03-15', 1005, 'e'),
('2012-03-24', 1006, 'f');
GO
-- Show partitions and row counts
SELECT
partition_id = $PARTITION.PF(the_date),
row_count = COUNT_BIG(*)
FROM dbo.Heap AS h
GROUP BY
$PARTITION.PF(the_date)
GO
-- Error: Partition columns for a *unique* index must be
-- a subset of the index *key* (included column does not count)
CREATE TABLE dbo.Clustered1
(
pk integer IDENTITY NOT NULL,
the_date date NOT NULL,
fk integer NULL,
data varchar(50) NULL,
CONSTRAINT [PKC dbo.Clustered1 pk P:the_date]
PRIMARY KEY (pk) ON PS (the_date)
);
GO
-- Same error, for non-clustered *unique* index (PK)
CREATE TABLE dbo.Clustered2
(
pk integer IDENTITY NOT NULL,
the_date date NOT NULL,
fk integer NULL,
data varchar(50) NULL,
CONSTRAINT [PKN dbo.Clustered2 pk P:the_date]
PRIMARY KEY NONCLUSTERED (pk) ON PS (the_date)
);
GO
-- Start as an ordinary heap with a nonclustered PK
CREATE TABLE dbo.Clustered3
(
pk integer IDENTITY NOT NULL,
the_date date NOT NULL,
fk integer NULL,
data varchar(50) NULL,
CONSTRAINT [PKN dbo.Clustered3 pk]
PRIMARY KEY NONCLUSTERED (pk)
);
GO
-- Partitioned non-unique clustered index is allowed
-- Note: the existing nonclustered PK stays NOT partitioned
CREATE CLUSTERED INDEX [CX dbo.Clustered3 fk P:the_date]
ON dbo.Clustered3 (fk) ON PS(the_date);
GO
DROP TABLE dbo.Heap;
DROP TABLE dbo.Clustered2;
DROP TABLE dbo.Clustered3
DROP PARTITION SCHEME PS;
DROP PARTITION FUNCTION PF;
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply