How to calculate potential IO benefits of changes

  • That's exactly the point.  There are far too many variables for discussion here.  I haven't specifically asked "how do I tune this query?"  I've been asking "how do I evaluate the benefits of different options without building them?"

    The solution to our problem probably involves process re-engineering rather than tuning a specific query.  But with multiple options to re-engineer how do choose which one will give us the biggest benefit?

  • Jez-448386 wrote:

    This is the query with the plan (names have been changed):

    DECLARE @DateColumn  DATE = '2024-07-01';

    DELETE s
    FROM [Schema].TableName s
    INNER JOIN #WorkingTable q
    ON q.Column1 = s.Column1
    AND q.Column2 = s.Column2
    WHERE s.DateColumn = @DateColumn



    StmtText
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    |--Table Delete(OBJECT:([DBName].[Schema].[TableName] AS ), OBJECT:([DBName].[Schema].[TableName].[UNQ_Schema_TableName_Column1_Key_Column2_Key_DateColumn] AS ))
    |--Sort(DISTINCT ORDER BY:([Bmk1000] ASC))
    |--Nested Loops(Inner Join, OUTER REFERENCES:(
    .[Column1_Key],
    .[Column2_Key]))
    |--Table Scan(OBJECT:([tempdb].[dbo].[#WorkingTable] AS
    ))
    |--Index Seek(OBJECT:([DBName].[Schema].[TableName].[UNQ_Schema_TableName_Column1_Key_Column2_Key_DateColumn] AS ), SEEK:([PtnId1001]=RangePartitionNew([@DateColumn],(1),'2020-01-01','2020-01-02',......

    Schema.TableName is partitioned by date.  There are typically 30 million rows per date partition.

    #WorkingTable contains 500k records. We receive a continual set of records throughout the day (60 million or more) and run batches of 500k.

    We know that this step consumes far more resources than anything else (11% CPU, 29% logical IO, 15% physical IO).  We believe that the way in which we process data from start to finish is flawed and can be improved.  This is just one step of many in that process.

    We have several ideas for how the start to finish process could be re-engineered.  What we don't know, and this is the original question, is how to do we estimate which would give us the biggest benefit?  We don't want to spend 2 weeks or more developing something which doesn't give us much benefit.

    Thanks

    I need the graphical version of the Actual Execution Plan, please... especially after seeing the index information for the date.  I could be wrong but it doesn't look like it's doing any partition elimination like it should.

    Just a hunch, though... add an OPTION(RECOMPILE) to the DELETE code that you provided and let's see what happens.

    And, I still need the execution plan code in a file so that I can call it graphically and check the gazintas.  Providing just a graphic won't do.  See the article at the second line in my signature line below to find out how to do that if you don't know how.

     

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You're right—there are many variables to consider when evaluating different options without directly implementing them. Instead of focusing on tuning a single query, the approach should involve a broader process re-engineering perspective. To decide on the most beneficial option, consider using performance modeling, scenario analysis, or prototyping with a smaller dataset. This way, you can compare potential outcomes and make an informed decision before committing to full-scale implementation.

  • Jeff Moden wrote:

    Jez-448386 wrote:

    This is the query with the plan (names have been changed):

    DECLARE @DateColumn  DATE = '2024-07-01';

    DELETE s
    FROM [Schema].TableName s
    INNER JOIN #WorkingTable q
    ON q.Column1 = s.Column1
    AND q.Column2 = s.Column2
    WHERE s.DateColumn = @DateColumn



    StmtText
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    |--Table Delete(OBJECT:([DBName].[Schema].[TableName] AS ), OBJECT:([DBName].[Schema].[TableName].[UNQ_Schema_TableName_Column1_Key_Column2_Key_DateColumn] AS ))
    |--Sort(DISTINCT ORDER BY:([Bmk1000] ASC))
    |--Nested Loops(Inner Join, OUTER REFERENCES:(
    .[Column1_Key],
    .[Column2_Key]))
    |--Table Scan(OBJECT:([tempdb].[dbo].[#WorkingTable] AS
    ))
    |--Index Seek(OBJECT:([DBName].[Schema].[TableName].[UNQ_Schema_TableName_Column1_Key_Column2_Key_DateColumn] AS ), SEEK:([PtnId1001]=RangePartitionNew([@DateColumn],(1),'2020-01-01','2020-01-02',......

    Schema.TableName is partitioned by date.  There are typically 30 million rows per date partition.

    #WorkingTable contains 500k records. We receive a continual set of records throughout the day (60 million or more) and run batches of 500k.

    We know that this step consumes far more resources than anything else (11% CPU, 29% logical IO, 15% physical IO).  We believe that the way in which we process data from start to finish is flawed and can be improved.  This is just one step of many in that process.

    We have several ideas for how the start to finish process could be re-engineered.  What we don't know, and this is the original question, is how to do we estimate which would give us the biggest benefit?  We don't want to spend 2 weeks or more developing something which doesn't give us much benefit.

    Thanks

    I need the graphical version of the Actual Execution Plan, please... especially after seeing the index information for the date.  I could be wrong but it doesn't look like it's doing any partition elimination like it should.

    Just a hunch, though... add an OPTION(RECOMPILE) to the DELETE code that you provided and let's see what happens.

    And, I still need the execution plan code in a file so that I can call it graphically and check the gazintas.  Providing just a graphic won't do.  See the article at the second line in my signature line below to find out how to do that if you don't know how.

    it is doing partition elimination -- "SEEK:([PtnId1001]=RangePartitionNew([@DateColumn],(1),'2020-01-01','2020-01-02',... " but I agree graphical is what we need for full info.

  • Attached is the SQL query plan.

    There are two main queries.  The first uses a date parameter and the second doesn't.  The second query is much higher cost that the first showing that it is using partition elimination.

    • This reply was modified 4 months, 1 week ago by  Jez-448386.
  • @pitterson.  We are looking at re-engineering the process but are trying to estimate the benefits.

    What do you mean by performance modeling and scenario analysis?  What tools and techniques do you use?

    We are prototyping a couple of ideas both of which could take a week to complete.  And there are several more ideas that we could investigate.  What we are trying to do is to estimate what are the benefits of each idea so we can focus on those.

  • Jez-448386 wrote:

    Attached is the SQL query plan.

    There are two main queries.  The first uses a date parameter and the second doesn't.  The second query is much higher cost that the first showing that it is using partition elimination.

    no attachement - you should use https://www.brentozar.com/pastetheplan/ and give us the link to it.

  • frederico_fonseca wrote:

    I know  you are after how to estimate X - that is not really a good way to go as there are way to many unknowns, and without looking at the ACTUAL explain plans (graphical ones, not the text you gave us above) you won't find out where the big issue are and where biggest changes can be achieved, potentially with minor code changes.

    but updates slower than deletes and both are a lot slower than inserts, specially if inserts can be bulklogged so approaches that benefit inserts with a combination of truncate/table switch are always first in my mind - even if not always possible or beneficial.

    regarding merge - for volumes I avoid it - see this article, and read well the articles mentioned at the end of it - https://www.mssqltips.com/sqlservertip/7590/sql-merge-performance-vs-insert-update-delete/%5B/quote%5D

    On the subject of Updates being slower than Deletes, I have to say "It Depends" on how many indexes you have and how many dependencies there are on the table.  I have seen it where trying to Delete a single row from a table took nearly 6 minutes while updating a thousand rows was sub-second.

    On this "which will be fast thing", the answer is there's really no way to predict it.  You have to actually measure something.  So, try to stand up a copy of the given scenario and test it.  If you can't do that, then build some test tables with a shedload of rows and test that.  If you don't think you need to do that, see my 6 minutes v.s. sub-second example above.

    And we still need the Actual Execution plan for the Delete.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • posting the query plan as a zip.

    Also: https://www.brentozar.com/pastetheplan/?id=r12r1TEo0

     

    Attachments:
    You must be logged in to view attached files.
  • And deleting 500k records takes ~15 seconds which I don't think is too bad.  Inserts are similar

    We have three working ideas:

    • reduce the number of rows going through the process
    • target the deletes to specific partitions
    • re-engineer to process whole days of records at once rather than records as they arrive

    The first two will still run the delete statement but to a lesser degree.  The third one could eliminate it altogether as we could partition switch into the big table but this requires lots of other work to partition align other tables and maintain them if records change date (which they can).

    Confidentiality prevents me from sharing anything more than this.

    So how do we judge what the benefit of these three options are?  Without building them.  Gut instinct says that the third option is the best but it is the highest cost and highest risk.  Is gut instinct the only way to choose?

     

     

  • @pitterson's response looked like it was ChatGPT-generated (especially given it was  a new user)

  • Jez-448386 wrote:

    posting the query plan as a zip.

    Also: https://www.brentozar.com/pastetheplan/?id=r12r1TEo0

    Neither of those are an "Actual" execution plan.  They're both "Estimated".  While that's close, it misses certain nuances.  It does appear that partition elimination is working for the first delete.  It also doesn't appear to have the problem that I thought it may have, which is other tables referencing the table being deleted from without an index on the referencing table column.

    Jez-448386 wrote:

    So how do we judge what the benefit of these three options are? Without building them. Gut instinct says that the third option is the best but it is the highest cost and highest risk. Is gut instinct the only way to choose?

    Perhaps a different "gut instinct" might be applicable here, especially since your partitions are not static for any given day...

    Since you're not actually taking advantage of partitioning because of the changing dates and the partitioning is done by date, I'd be tempted to experiment with a 4th possibility.  Turn the table back into a monolith and put the CI on Col1 and Col2 which, according to the name of the index you have on those columns, forms a unique set of rows for those two columns.

    And, in case you're scratching your head there, it's a pretty well know fact that partitioning isn't done for code performance... it's done for being able to do a piecemeal restore (doesn't look like you're setup for that, though), save on backups of static data by partition (you change dates on the data in an unpredictable fashion so not static), and partition elimination during queries is still slower the proper monolithic indexing.

    Shifting gears a big, what is Col1 based on?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The next step of the process aggregates the data by date hence partitioning by date.  That runs 18 - 20 times per day.  We also purge data after 3-4 months which we do by truncating the old partition.

    If we remove the partitions and cluster on COL1 it will have an adverse effect of the aggregation and the purge.  How do we estimate the IO impact of this?

    We have lots of ideas - we just don't know how to quantify the impact.

    Also, having one large table will cause us problems with our weekly index maintenance jobs.  Our database is in a HA group and the index maintenance job (rebuilds only no reorganizations) regularly fails with transaction log full messages.  That's a different problem.

    Col1 is BIGINT which is a surrogate key of a VARCHAR(255) column containing a transaction identifier.

    We have an idea to create a non partition aligned index on Col1 to help optimize the delete.  This will keep the date partition to support the aggregation step but we will have to purge the rows in batches rather than truncating a partition.  Again, without building it, how do we estimate the benefits so that we can compare that with the benefit of other ideas.

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply