Partitining an existing table

Question

Partitining an existing table

Viewing 15 posts - 16 through 30 (of 46 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004434 More actions · Answer 1

I read the article, probably I dont have a high IQ, could you please let me know what I've done wrong and what I should do?

Reading just one article probably won't get it for such an advanced topic. And, it has nothing to do with a high IQ or not... just lot's of reading and experimentation... for example...

Hooman Shamsborhan (12/30/2008)
but the fact is that there is no need for channel to be a part of the PK.

... is what you say... but Books Online clearly states...

[font="Arial Black"]When partitioning a unique index (clustered or nonclustered), the partitioning column must be chosen from among those used in the unique index key. [/font]

Isn't a "Primary Key" the ultimate in "Unique Indexes". 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Hooman Shamsborhan Old Hand Points: 396 More actions · Answer 2

[font="Verdana"]

Thanks. This is what I did finally, I created a table and make a PK with 2 PKs it doesnt affect the uniquness of the data the only thing is that make the size of the table much bigger to have another clustered index, there are 100 million records at the most in that table with 80 fields you can imagine that another PK means what!!!!any way I did it.

Now another problem, I got some issues with ALTER TABLE ... SWITCH could you please have a look at this thread as well and guide me in that regards as well?

http://www.sqlservercentral.com/Forums/Topic627801-146-1.aspx?Update=1

[/font]

Jeff Moden SSC Guru Points: 1004434 More actions · Answer 3

I'm not sure what you're talking about. A clustered index IS the data and adds almost nothing to the size of the table.

My feeling is that you've not done the necessary research by reading ALL of the information on the subject in Books Online. Until you've actually done that and tested the methods on a smaller test table that you can actually afford to lose, my recommendation would be to stop messing with production data until you do.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Johan Bijnens SSC Guru Points: 135380 More actions · Answer 4

Jeff Moden (1/1/2009)
I'm not sure what you're talking about. A clustered index IS the data and adds almost nothing to the size of the table.
My feeling is that you've not done the necessary research by reading ALL of the information on the subject in Books Online. Until you've actually done that and tested the methods on a smaller test table that you can actually afford to lose, my recommendation would be to stop messing with production data until you do.

I second that !

Partitioning can be a huge benifit, but it may not give you the optimal result if designed badly, indexes missplaced,.. !

It is also best know if your outer partitions will contain data at runtime or not ! (support it by constraints!)

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 5

Actually partitioning done incorrectly can CAUSE performance problems. In addition to BOL there are quite a number of web-based resources to guide someone's study of this topic.

Still say the best bet would have been to engage an expert way back when (weeks ago now??). That would have gotten your problem solved optimally as well as gotten you mentored up on the how-tos and where-fors of table partitioning.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

TRACEY-320982 SSChampion Points: 13653 More actions · Answer 6

Is there a benefit of doing a partition on a table to the different filegroups and this still be on the same disk i.e F

I want to visit this soon but i thought partition data you need to move data to a separate disk i.e G

Is there a performance gain doing on same disk.

Cheers

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 7

TRACEY (1/4/2009)
Is there a benefit of doing a partition on a table to the different filegroups and this still be on the same disk i.e F
I want to visit this soon but i thought partition data you need to move data to a separate disk i.e G
Is there a performance gain doing on same disk.
Cheers

You did not give enough information. What, EXACTLY, is disk F?? Is is a single 7200rpm SATA drive inside the server or is it a 100-spindle 15Krpm SCSI dedicated LUN on a high-end fiber channel SAN, or something in between? You can easily overload a single (or few) drives with too many files (disk seek and access times add up to cause a performance degredation). Testing is what will reveal the answer for you. Check out SQLIO.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

TRACEY-320982 SSChampion Points: 13653 More actions · Answer 8

Here the data is on same disk F...so i was wondering if doing partition would infact give any performance gain being on the same DISK.

ALTER DATABASE PartitionTestBIG

ADD FILE

(

NAME = PartitionTestBIG_CBO,

FILENAME = 'F:\SQL SERVER 2005 Data files\PredatorDB\PartitionTestBIG_CBO.ndf',

SIZE = 5MB,

FILEGROWTH = 200MB

) TO FILEGROUP FG_CBO;

GO

ALTER DATABASE PartitionTestBIG

ADD FILE

(

NAME = PartitionTestBIG_CHEQ,

FILENAME = 'F:\SQL SERVER 2005 Data files\PredatorDB\PartitionTestBIG_CHEQ.ndf',

SIZE = 5MB,

FILEGROWTH = 200MB

) TO FILEGROUP FG_CHEQ;

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 9

Tracey, it seems as if you did not read my post carefully. What matters is the number of spindles making up drive F. How much additional IO throughput can that/those spindle(s) provide? Also did you investigate SQLIO, which is the only way you can know DEFINITIVELY if adding more files will cause better or worse performance. We simply cannot answer your question without more information and/or testing.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

matt stockham SSCrazy Eights Points: 9942 More actions · Answer 10

TRACEY (1/4/2009)
Is there a benefit of doing a partition on a table to the different filegroups and this still be on the same disk i.e F
I want to visit this soon but i thought partition data you need to move data to a separate disk i.e G
Is there a performance gain doing on same disk.
Cheers

If you just want a general answer, then it's yes ... and no.

Internally the individual partitions are treated as separate indexes. Queries against multiple partitions may take longer as they now have to scan/seek from multiple indexes. Queries against a single partition may be quicker (even if all the filegroups/files are on the same physical drive) as they are pulling from a smaller set of data.

It all depends on your data and how it is accessed. I wouldn't recommend implementing it without a thorough understanding of how partitioning works, why it can provide performance benefits in some circumstances, and how the data is referenced. Better to spend the time on improving badly written queries and ensuring appropriate maintained indexes.

pbowman-543344 Mr or Mrs. 500 Points: 556 More actions · Answer 11

Ooohkaaay. I think I just ran into the scenario where the wrong partitioning can destroy your performance. Shame the documentation isn't a little more explicit on what NOT to do partitioning wise.

I have a family of tables that I need to partition. They represent "message" data, i.e. records that once they are received & processed, have served their purpose and are then historical record. There is a parent NotificationSequence table and then a child (DataTable)Notification table for each data table updated via Notifications - the child tables include a FK to the NotificationSequence table. All tables have bigint IDs as primary keys.

I partitioned by range on NotificationSequenceID...

create partition function PFN_Notification (bigint)

as range right

for values (

1000000000

, 2000000000

, 3000000000

, 4000000000

, 5000000000

, 6000000000

);

go

create partition schema PSH_Notification

as partition PFN_Notification

to (

[FGARCH01]

, [FGARCH01]

, [PRIMARY]

);

go

...then partitioned the parent table...

IF EXISTS (SELECT * FROM dbo.sysindexes

WHERE id = OBJECT_ID(N'[dbo].[NotificationSequence]')

AND name = N'PK128')

ALTER TABLE [dbo].[NotificationSequence]

DROP CONSTRAINT [PK128]

ALTER TABLE [dbo].[NotificationSequence]

ADD CONSTRAINT [PK128] PRIMARY KEY CLUSTERED

(

[NotificationSequenceId] ASC

)WITH FILLFACTOR = 90

ON PSH_Notification ([NotificationSequenceId])

...which took a while, but worked fine. Then to the child table...

IF EXISTS (SELECT *

FROM dbo.sysindexes

WHERE id = OBJECT_ID(N'[dbo].[GBEOrderNotification]')

AND name = N'PK458')

ALTER TABLE [dbo].[GBEOrderNotification] DROP CONSTRAINT [PK458]

ALTER TABLE [dbo].[GBEOrderNotification]

ADD CONSTRAINT [PK458] PRIMARY KEY CLUSTERED

(

[GBEOrderNotificationId] ASC,

[NotificationSequenceId] ASC

) WITH FILLFACTOR = 90

ON PSH_Notification ([NotificationSequenceId])

-- 6:02:11 hrs to run

... Note I had to add in the partition field into the cluster key.

PROBLEM:

select max(GBEOrderNotificationID)

from dbo.GBEOrderNotification

which was instantaneous (as you'd expect), now takes 20 minutes. Whoops!:w00t:

Is this general? Can you only partition on your existing key without destroying performance? Is there a good whitepaper on how NOT to partition (neither BOL, nor Kalen's paper linked above, is really explicit about this)?

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 12

>>Ooohkaaay. I think I just ran into the scenario where the wrong partitioning can destroy your performance. Shame the documentation isn't a little more explicit on what NOT to do partitioning wise.

There are a nearly infinite variety of schemas/queries out there so I don't think you can have documentation on what NOT to do. There are general best practice guidelines out there I bet, but even those will be inappropriate some of the time for certain scenarios.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

matt stockham SSCrazy Eights Points: 9942 More actions · Answer 13

Not sure why it would take 20 minutes, we can look at the execution plan if you want to post it (.sqlplan format please). I can only assume that it relates to having to search through every partition, because the table is partitioned by NotificationSequenceId.

pbowman-543344 Mr or Mrs. 500 Points: 556 More actions · Answer 14

Well the plan is the clustered index scan you'd expect (with a stream aggregate thrown in for partitioning).

However, now the query runs fine. Even after dropcleanbuffers. Yet it was consistently 20 mins at least four times in a row this morning. D'oh! Looks like whatever issue I was having this morning was not related to the partitioning :ermm:. Sorry for wasting your time guys!

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 15

1) did the original table have an index on the field you are doing a max on and if so does that index still exist?

2) if an index doesn't exist on max field then it makes sense it it taking forever to get the max. You have a clustered index with that field and the partitioning field. That would mean a complete table scan of all partitions to get all the values to the max, right? I expect the plan will show a scan of every partition.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service