Performance Risk\Impact of Partitioning

  • ScottPletcher (11/25/2015)


    I still believe the best way to improve performance is to remove inactive rows from the main table/partition. If needed, a partitioned view can be used to see both active and deleted rows. Sergiy prefers to put the active flag in the clustering key which is insane and unnecessary overhead. A few sample, stand-alone queries don't demonstrate the true idiocy of that approach

    You're free to believe in what you want to believe.

    But it's your religious beliefs, nothing to do with reality.

    Until you can provide a scientific proof for it.

    Idiocy is to insist on an approach having no confirmation of its validity.

    Or dismissing an approach having a proof of its validity right in front of your eyes.

    No much difference from other kind of religious fanatism.

    You realize the folly when multiple users start hitting it at the same time and the endless deadlocks start. Then you're forced into RCSI and its associated overhead, which on very busy tables can be huge. RCSI itself is useful, but it shouldn't be a forced fallback because of poor index design.

    I do not need to realise anything.

    As I said - I'm one of those users hitting a 108 mil rows transactional table with queries.

    Other 20 users in Accounting do the same on a regular basis.

    How many online users read and update data in it every minute - hard to tell.

    So far - no complaints. All queries completed in timely manner.

    That's a fact.

    _____________
    Code for TallyGenerator

  • Welsh Corgi (11/25/2015)


    I thought that since I was partition by BeginDate I had to include that in the Clustered Index?

    If you think about partitioning by [BeginDate] you should have at least an index on it.

    For a start...

    It's better to have it as a clustered index, but it's only if you're closed for the long weekend and you can afford a downtime for reconstructing the table with all its indexes.

    For now - create an simple 1 column non-clustered index on [BeginDate] and try these 2 queries:

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT TOP 2000 * from dbo.xactControlPoint T (NOLOCK)

    WHERE beginDate BETWEEN '2014-11-01' AND '2014-12-01' -- = '45409114'

    SELECT TOP 2000 *

    FROM dbo.xactControlPoint T

    INNER JOIN (

    SELECT MIN(xactControlPointID) FromID, MAX(xactControlPointID) ToID

    FROM dbo.xactControlPoint WITH (INDEX ([IX_xactControlPoint_beginDate]))

    WHERE beginDate BETWEEN '2014-11-01' AND '2014-12-01'

    ) F ON T.xactControlPointID BETWEEN F.FromID AND F.ToID

    WHERE beginDate BETWEEN '2014-11-01' AND '2014-12-01'

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    I used (nolock) on the first query not to put you into a trouble for making the table inaccessible for others.

    You surely won't need (nolock) for the second one.

    _____________
    Code for TallyGenerator

  • Thank you very much for your help.

    I'm looking to partition by year.

    My current archive code looks up the year in a lookup table.

    I want to mimic that logic in a partitioning scenario.

    Thanks again.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Try the query.

    Use it as a pattern for actual queries in your stored procedures.

    There is a good chance that after seeing the actual performance of ckustering the word "partition" would be considered inappropriate in your work environment.

    _____________
    Code for TallyGenerator

  • I'm sorry, which query?

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • 2 posts above

    _____________
    Code for TallyGenerator

  • Sergiy (11/25/2015)


    Welsh Corgi (11/25/2015)


    The following is the structure of the table:

    CREATE TABLE [dbo].[xactControlPoint](

    [xactControlPointID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,

    [xactControlPointTypeID] [int] NOT NULL,

    [call_id] [int] NOT NULL,

    [contr_id] [int] NULL,

    [xactTransactionIDValue] [varchar](50) NULL,

    [beginDate] [datetime] NOT NULL,

    [userName] [varchar](250) NULL,

    [notes] [varchar](2000) NULL,

    [reservationFlg] [bit] NULL,

    [rowUpdateDate] [datetime] NULL,

    [rowUpdateID] [int] NULL,

    [Target_Followup_Date] [datetime] NULL,

    CONSTRAINT [PK_xactControlPoint] PRIMARY KEY CLUSTERED

    (

    [xactControlPointID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [FG1]

    ) ON [FG1]

    The following are the indexes:

    index_nameindex_descriptionindex_keys

    IDX__xactControlPoint__Callid_xactControlPointTypeIDnonclustered located on FG2call_id, xactControlPointTypeID

    IN_CallIDnonclustered located on FG2call_id

    IN_ContrIDnonclustered located on FG2contr_id

    IX_ControlPointTypeIDnonclustered located on FG1xactControlPointTypeID

    IX_xactControlPoint__K3_K4nonclustered located on FG2call_id, contr_id

    PK_xactControlPointclustered, unique, primary key located on FG1xactControlPointID

    Besides adding the begin date to the Primary Key, what other indexes should be created?

    Answering the question about indexes without seeing the most used/typical queries is pretty much shooting in the dark.

    What?? That's ridiculous. The actual stats tell me far more than just looking at raw query code.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Sergiy (11/25/2015)


    Welsh Corgi (11/25/2015)


    Scott,

    Please see attach Excel file.

    Unfortunately I had a hard time interpreting what you were telling me to do.

    Thank you for all of your you help.

    The stats look not so bad, actually.

    On the table you posted you have mostly seeks, with relatively low number of scans.

    The only screaming problem - huge number of lookups.

    Which means - table clustering is not aligned with data range selection in queries .

    Can you post some queries which cause problems?

    You now have a perfect test case to decide between our methods. I think the "analysis" above is extraordinarily superficial and ultimately downright wrong. You can compare to my index recommendations earlier -- which I'll explain further when I can -- and decide for yourself which approach is more accurate.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Welsh Corgi (11/25/2015)


    Scott,

    Please see attach Excel file.

    Unfortunately I had a hard time interpreting what you were telling me to do.

    Thank you for all of your you help.

    Hmm, in the spreadsheet I saw, both sheets contain exactly the same info (specifically the index usage stats, but no index missing stats).

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Welsh Corgi (11/26/2015)


    Thank you very much for your help.

    I'm looking to partition by year.

    My current archive code looks up the year in a lookup table.

    I want to mimic that logic in a partitioning scenario.

    Thanks again.

    Why do you want to partition by year? (begin year?) You never query that way, at least from what I remember of your existing indexes and the missing index stats. You can make your lead clustered key date, of course, but it will be a lot more overhead overall. Since you never query by that, all your reads will have to be come via nonclus indexes. That's definitely not good for overall performance.

    Edit: Again, is it possible that call_id would be a reasonable "pseudo-date" that you could partition on? Call_id would make a very suitable lead key for the clustering index based on the stats posted.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Welsh Corgi (11/25/2015)


    The following example baffles me about Indexes Where it uses a date file in the indexes and it talks about aligned indexes.

    It does not work but it is similar to what I want to do but archive by year.

    http://sqlfool.com/2008/11/102/

    http://sqlfool.com/2008/11/102/">

    http://sqlfool.com/2008/11/102/

    I found https://msdn.microsoft.com/en-us/library/aa964122(SQL.90).aspx (even though old) and http://download.microsoft.com/download/D/B/D/DBDE7972-1EB9-470A-BA18-58849DB3EB3B/PartTableAndIndexStrat.docx to be enlightening (even though much longer to read :). Alignment is covered in more detail by them.

    What http://sqlfool.com/2008/11/102/ does not cover well (IMHO) is the sliding window scenario. The left-most partition of the sliding window needs to be empty, which for [datetime] a useful partitioning value is '1753-01-01 00:00:00.000' (instead of a CHECK constraint).

    When partitioning a table, the primary key's partitioning column can be in *any* ordinal position within the soon-to-be-composite primary key. If the partitioning column is used by no query, put the partitioning column in the *last* ordinal position. Whether indexes should also be aligned to a partition depends upon conditions broadly described by https://msdn.microsoft.com/en-us/library/ms190787.aspx. But ... when indexes are *not* aligned, the time spent switching the clustered index (table) to an archive partition will include time spent ghosting rows|pages|allocation pages. That additional time is rooted an IO cost - it will introduce an exclusive lock durations (on non-aligned indexes) *during* a partition switch. But if all indexes are aligned, a switch involves nothing more than a brief elevations Schema Modification locks (during which members of sys.partitions are updated). Because the partitioning column must be a key in the primary key, foreign keys also need to be aligned (for fastest switches), unless you can afford exclusive lock time upon the children (to perform a traditional delete/truncate). Even if pre-deleting is possible the FK will need to be checked during the switch. If the children cannot be quickly (painlessly) pre-deleted, foreign keys will have to be disabled and re-enabled during the switch (with the foreign keys being switched at the "same" time). If you want the re-enabled foreign keys to be trusted by the optimizer, WITH CHECK CHECK would have to be used, which can be loosely called "a postponed IO cost".

    I considered implementing partition switching for my environments. Its biggest table also concerns transactions, with relationships similar to those within xactControlPoint. XactControlPoint appears to be more normalized than my table (I am jealous:)). What turned me away from partition switching was the need to manage foreign keys. I was not very concerned about the performance impact that of slightly (at best) increased index depths (due to addition of partitioning column) and somewhat larger index sizes (due to a wider clustered index key). But I was concerned with the performance impact of pre-deleting children, or, the impact of waiting for FK checks during a switch. My foreign key concerns are also rooted in my not being in control of FK/PK references (the database is continuously being updated by ~100 developers).

    My consideration of sliding windows partitioning was not driven by a need to increase delete performance or improve query performance. It was instead driven by my need to not significantly impact *existing* performance. And that consideration was dictated by my employer's concern about the cost of leased disk space (sigh). If being driven by a need to increase *current* performance, a B-Tree's depth resilience (to a switch or a delete) is situational == YMMV == test :), but partitioning tables are not likely to be the droids you are looking for:Whistling:.

  • SoHelpMeCodd (11/27/2015)


    Welsh Corgi (11/25/2015)


    The following example baffles me about Indexes Where it uses a date file in the indexes and it talks about aligned indexes.

    It does not work but it is similar to what I want to do but archive by year.

    http://sqlfool.com/2008/11/102/

    http://sqlfool.com/2008/11/102/">

    http://sqlfool.com/2008/11/102/

    I found https://msdn.microsoft.com/en-us/library/aa964122(SQL.90).aspx (even though old) and http://download.microsoft.com/download/D/B/D/DBDE7972-1EB9-470A-BA18-58849DB3EB3B/PartTableAndIndexStrat.docx to be enlightening (even though much longer to read :). Alignment is covered in more detail by them.

    What http://sqlfool.com/2008/11/102/ does not cover well (IMHO) is the sliding window scenario. The left-most partition of the sliding window needs to be empty, which for [datetime] a useful partitioning value is '1753-01-01 00:00:00.000' (instead of a CHECK constraint).

    When partitioning a table, the primary key's partitioning column can be in *any* ordinal position within the soon-to-be-composite primary key. If the partitioning column is used by no query, put the partitioning column in the *last* ordinal position. Whether indexes should also be aligned to a partition depends upon conditions broadly described by https://msdn.microsoft.com/en-us/library/ms190787.aspx. But ... when indexes are *not* aligned, the time spent switching the clustered index (table) to an archive partition will include time spent ghosting rows|pages|allocation pages. That additional time is rooted an IO cost - it will introduce an exclusive lock durations (on non-aligned indexes) *during* a partition switch. But if all indexes are aligned, a switch involves nothing more than a brief elevations Schema Modification locks (during which members of sys.partitions are updated). Because the partitioning column must be a key in the primary key, foreign keys also need to be aligned (for fastest switches), unless you can afford exclusive lock time upon the children (to perform a traditional delete/truncate). Even if pre-deleting is possible the FK will need to be checked during the switch. If the children cannot be quickly (painlessly) pre-deleted, foreign keys will have to be disabled and re-enabled during the switch (with the foreign keys being switched at the "same" time). If you want the re-enabled foreign keys to be trusted by the optimizer, WITH CHECK CHECK would have to be used, which can be loosely called "a postponed IO cost".

    I considered implementing partition switching for my environments. Its biggest table also concerns transactions, with relationships similar to those within xactControlPoint. XactControlPoint appears to be more normalized than my table (I am jealous:)). What turned me away from partition switching was the need to manage foreign keys. I was not very concerned about the performance impact that of slightly (at best) increased index depths (due to addition of partitioning column) and somewhat larger index sizes (due to a wider clustered index key). But I was concerned with the performance impact of pre-deleting children, or, the impact of waiting for FK checks during a switch. My foreign key concerns are also rooted in my not being in control of FK/PK references (the database is continuously being updated by ~100 developers).

    My consideration of sliding windows partitioning was not driven by a need to increase delete performance or improve query performance. It was instead driven by my need to not significantly impact *existing* performance. And that consideration was dictated by my employer's concern about the cost of leased disk space (sigh). If being driven by a need to increase *current* performance, a B-Tree's depth resilience (to a switch or a delete) is situational == YMMV == test :), but partitioning tables are not likely to be the droids you are looking for:Whistling:.

    I am against partitioning in this situation.

    I have archive procedure that is very complex.

    How can I convince my VP that partitioning is not the way to go?

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Lots of work:) Easy to show partitioning is far more popular for DW systems than OLTP systems (literature search reveals this). Easy to highlight the pitfalls sprinkled within https://msdn.microsoft.com/en-us/library/ms190787(v=sql.110).aspx. Be ready to run SET STATISTICS IO ON SET STATISTICS TIME ON SET STATISTICS PROFILE ON before running a simple example, both with and without partitioning, using realistic data and one or two realistic statements. Another non-partitioned test would be to simply cluster the partitioning column (as a composite primary key), don't partition, and benchmark it. Excerpt & highlight the contrasting results (including plans' partition filtering). Point out a more complex example that would need to be managed, (how many tables' foreign keys would need to be disabled?). Explain that developers would need to gain partitioning knowledge in order to write efficient statements. Benchmark disabling and enabling foreign keys that reference the transactional table, and explain the risks. Contrast sys.dm_db_index_physical_stats & sys.dm_db_partition_stats, and compare Lock Escalation events.

  • Welsh Corgi (11/27/2015)


    I am against partitioning in this situation.

    I have archive procedure that is very complex.

    How can I convince my VP that partitioning is not the way to go?

    First - you need to answer - the way to go from where and to where?

    Why do you need to go anywhere?

    What is the problem you're trying to rectify by the move?

    When you the problem clearly formulated you meed to collect possible solutions.

    Then you evaluate pros and contras for each of them and choose the most appropriate For the task.

    So, what is the problem(s) which caused the discussion about partitioning?

    If it's unsatisfactory query performance then you need to look into the clustering of the tables.

    In the following topic I showed how correct clustering makes partitioning irrelevant:

    http://www.sqlservercentral.com/Forums/FindPost1738471.aspx

    If it's not (only) performance issue then other factors must be taken into consideration.

    _____________
    Code for TallyGenerator

  • If it helps any, I have personally avoided table partitioning for many years even in Standard edition with very large and wide tables in the hundreds-of-millions. Proper indexing, understanding my users, dedicated I/O, proper placement, etc have helped me excel with great performance with very little resources.

    Now that my tables are getting to 4 billion records, other options needed to be considered, especially with loaded backdated data into a billion record table clustered on time and users making huge reads with their queries. Partitioned views and table partitioning became that option taken to help in those situations, but not just for increased performance.

    However, keep in mind, in my case this is a DW (OLAP) environment and not transactional (OLTP).

Viewing 15 posts - 31 through 45 (of 62 total)

You must be logged in to reply to this topic. Login to reply