July 12, 2023 at 5:41 pm
Hi all,
I'm trying to understand partition keys and how to create them. Did some googling but the samples I found it still doesn't click to me. So I have a requirement in which my user specifically requests a partition key on a date as this is what will be in the where clause of user queries. In the past I've typically used indexes on date fields like this and I've never worked with partition keys before so trying to learn them and implement them. So a little preliminary info; The records in the tables I'm working with are 500 million + (size most likely being the reason they want partition key on this date). Here are my fields (For simplistic sake I just included four fields):
ClaimLineID (PK bigint)
ClaimHeaderID (FK bigint)
ClaimLineAmount (money)
ServiceStartDate (date)
I learn best by seeing in action so I'm wondering if someone could throw a quick create table statement with these 4 fields where ServiceStartDate is being implemented as a partition key. Would also be great to see it as an alter table statement as well. This will help me to understand how to create these a new table as well as modify existing table to add it. Any help provided would be greatly appreciated. Thanks in advance!
Strick
July 12, 2023 at 6:42 pm
Why do you want to partition the table? Partitioning a table is for speeding up backup, not speeding up your queries.
(Where's the "SQL Howlers" article that Gail Shaw wrote about that?)
July 12, 2023 at 7:37 pm
This should give you some idea. With that partitioning, you can no longer have ClaimLineID as a pk (or even a stand-alone unique index/constraint).
If you can use data compression, you almost certainly should for that many rows (i.e., uncomment "DATA_COMPRESSION = "). If you have (max) types in the table, you should consider forcing them to LOB pages (by using sp_tableoption to set 'max value types out of row' to 1).
SET ANSI_NULLS ON;
SET QUOTED_IDENTIFIER ON;
GO
--DROP TABLE dbo.Services;
--DROP PARTITION SCHEME ServiceStartDatePartitionScheme;
--DROP PARTITION FUNCTION ServiceStartDatePartitionFunction;
GO
CREATE PARTITION FUNCTION ServiceStartDatePartitionFunction ( date )
AS RANGE LEFT
FOR VALUES (
'20200101', '20200102', '20200103', '20200104', '20200105', '20200106',
'20200107', '20200108', '20200109', '20200110', '20200111', '20200112',
'20210101', '20210102', '20210103', '20210104', '20210105', '20210106',
'20210107', '20210108', '20210109', '20210110', '20210111', '20210112',
'20220101', '20220102', '20220103', '20220104', '20220105', '20220106',
'20220107', '20220108', '20220109', '20220110', '20220111', '20220112',
'20230101', '20230102', '20230103', '20230104', '20230105', '20230106',
'20230107', '20230108', '20230109', '20230110', '20230111', '20230112' );
GO
CREATE PARTITION SCHEME ServiceStartDatePartitionScheme
AS PARTITION ServiceStartDatePartitionFunction
ALL TO ( [PRIMARY] );
GO
CREATE TABLE dbo.Services (
ServiceStartDate date NOT NULL,
ClaimLineID bigint NOT NULL,
ClaimHeaderID bigint NOT NULL,
ClaimLineAmount money NULL,
INDEX Services__CL UNIQUE CLUSTERED ( ServiceStartDate, ClaimLineID ) WITH ( /*DATA_COMPRESSION = PAGE,*/ FILLFACTOR = 98 ) ON ServiceStartDatePartitionScheme ( ServiceStartDate )
/* , CONSTRAINT Services__PK PRIMARY KEY NONCLUSTERED ( ClaimLineId ) */ );
GO
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 13, 2023 at 1:30 pm
Hello and thanks for your response.
My requirement is for a specific field not a table. They are asking for a partition key on the field ServiceStartDate. I'm very ignorant to partition keys and it is very new to me so I don't know why it is requesting this specifically so I'm not in a place to rebuttal why it may or may not be a good idea.
The reading up on it via googling it; it appears be like indexing that field but better for very large tables but maybe its not and I'm just trying to learn it so I can implement it.
July 13, 2023 at 2:11 pm
Hi Scott,
Thanks for your response. I am using your sample to learn this and am working through setting this up in our test environment. A few questions.
Thank you so much for all your help!
Strick
July 13, 2023 at 3:06 pm
Clustering by the date will solve the performance issues. Partitioning will aid performance only if you need to rebuild the table, in which case you would only need to rebuild the current partition rather than the whole table. Similarly, theoretically you could back up only the current partition and not all the old partitions that aren't changing, thus also reducing overhead somewhat for backups.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 13, 2023 at 5:33 pm
Gail's piece: https://www.red-gate.com/simple-talk/databases/sql-server/database-administration-sql-server/gail-shaws-sql-server-howlers/
You can't partition a column. This is for tables. A client asking for this doesn't really understand what they want.
July 13, 2023 at 7:34 pm
By far the best performance payoff for this is creating the best clustered index on the table, as coded below. You would likely not even need to partition then.
Based on the PK comment you made, I'm assuming you have a clustered PK on an identity column. Which is a so-called "standard practice", but that is actually the biggest myth in table design. Defaulting to a clustered identity is usually WRONG, WRONG, **SO WRONG**.
Here are the general commands to adjust the indexes, if my guess above about the PK is correct.
ALTER TABLE DROP CONSTRAINT [PK_yourtablename] ON dbo.yourtablename;
CREATE UNIQUE CLUSTERED INDEX yourtablename__CL ( ServiceStartDate, ClaimLineID ) ON dbo.yourtablename WITH ( /*DATA_COMPRESSION = PAGE,*/ FILLFACTOR = 98, SORT_IN_TEMPDB = ON );
ALTER TABLE ADD CONSTRAINT yourtablename__PK PRIMARY KEY ( ClaimLineID ) WITH ( /*DATA_COMPRESSION = ROW,*/ FILLFACTOR = 99, SORT_IN_TEMPDB = ON );
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 13, 2023 at 9:04 pm
Hi Scott,
Sorry, actually there was a typo. I can see how the typo would create your assumption. This is a Claim Line table so the ClaimHeaderID and ClaimLineID as actually a composite PK.
ClaimHeaderID is still a FK back to a ClaimHeader table. So if should have read like this:
ClaimHeaderID (PK FK bigint)
ClaimLineID (PK bigint)
ClaimLineAmount (money)
ServiceStartDate (date)
As far as whether I should be using index vs the partition request I'm inclined to agree with you since I've worked with indexes on dates in large tables my whole career and haven't ever come across where someone wanted a partition key. Unfortunately I'm kinda stuck implementing the requirement. So I'm inclined to figure out how to implement a partition key per the request.
July 13, 2023 at 9:06 pm
There is no way to implement the request. There is no partition key on a non partitioned table
July 13, 2023 at 9:47 pm
But you're going to have to include the date in the clustering key to do partitioning anyway, you might as well change the clustering first. And that will be less work than having to partition the table too.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 13, 2023 at 9:47 pm
Hi Steve,
So even if though the index approach should be the way I go, why wouldn't Scott's earlier code not work for me if I at least implement it that way?
July 13, 2023 at 9:52 pm
You very likely already have a clustered index on the table, on ( ClaimHeaderID, ClaimLineID ). (That's a vastly better clus key than an identity column, btw.)
You can only have one clustered index on a table. So you will have to drop the current index; create the new clustered index (on ServiceStartDate, ClaimHeaderID, ClaimLineID); finally recreate the PK as nonclustered.
Best way to do that: (1) script out the CREATE for the existing index, but change it to NONCLUSTERED; (2) drop the index; (3) create the new clus index; recreate the old index as nonclus.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply