Query optimization help

  • Hi Experts,

    Need some help tuning a specific query.

    We have a query that normally completes in a few seconds but has been running for 5-10 minutes. After re-running it multiple times, it came back down to 5-7 seconds. It's a HEAP table, created by a 3rd party tool, and I found 88% fragmentation on the HEAP. I am attaching the query, the XML plan, the .sqlplan captured in Management Studio, and the SET STATISTICS IO output.

    Please suggest tuning tips for this specific query.

    Thanks,

    Sam

  • Hi Sam,

    It doesn't look like there's much room to maneuver here. I don't have experience with heap fragmentation, but defragmenting does sound promising, and I'd love to hear whether it solves the performance issue.
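
    If you do try it, a heap can be rebuilt in place. A minimal sketch, assuming the table name from your attachments; rebuilding a heap this size is an expensive, fully logged operation, so test it in a maintenance window:

    -- Rebuild the heap to remove fragmentation and forwarded records.
    -- Any non-clustered indexes on the heap get rebuilt too, so plan for the time.
    ALTER TABLE dbo.C_B_ACCOUNT_HXRF REBUILD;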

    Here are my thoughts:

    This table has a lot of columns. If you don't need all of them, explicitly list the ones you do need to reduce the time taken to return data to the client application.
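
    For example, if the app only needs a handful of columns (the list below is just illustrative, reusing the column names mentioned elsewhere in this thread):

    -- Name only the columns the application actually consumes,
    -- instead of SELECT * pulling every column off every row.
    SELECT PKEY_sRC_OBJECT, ROWID_SYSTEM, HIST_CREATE_DATE
    FROM dbo.C_B_ACCOUNT_HXRF
    WHERE ROWID_SYSTEM = 'SAP';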

    This is a heap, there are no unique constraints, and you're asking for all columns, so SQL will need to scan all rows and retrieve all columns. I don't think a non-clustered index to support the WHERE predicate alone would help.

    You'd be able to do a lot more to improve this specific query if the table had a clustered index, but it comes from a third-party tool, so I don't think that option is directly available to you. If you were feeling ambitious and there is a support plan in place, you could email the third party and ask whether you can add a clustered index to that table. There's a lot of risk there, even if they do allow it, but that would be the best approach for the read performance of this query.
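
    If they did approve it, the change itself is small; the key column here is purely hypothetical, since the right choice would have to come out of that conversation with the vendor:

    -- Hypothetical clustered index; the key column is a guess and would
    -- need the vendor's input on how the table is actually used.
    CREATE CLUSTERED INDEX CIX_C_B_ACCOUNT_HXRF
    ON dbo.C_B_ACCOUNT_HXRF (HIST_CREATE_DATE);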

    You could add a non-clustered index on the columns (PKEY_sRC_OBJECT, ROWID_SYSTEM, HIST_CREATE_DATE) and INCLUDE everything else, but you would be changing the table schema in doing so, and if the vendor/third party tries to drop or alter a column referenced in your non-clustered index, they'll get an error stating that the index depends on it. There's also a lot of data in that table, so the index may cause performance issues on INSERTs/UPDATEs/DELETEs.
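
    As a sketch (the INCLUDE list is a placeholder; HIST_UPDATE_DATE is a made-up name standing in for the rest of the columns, which I can't see from here):

    -- Covering index sketch; swap the placeholder INCLUDE column for the
    -- table's remaining columns. This roughly doubles the data footprint.
    CREATE NONCLUSTERED INDEX IX_C_B_ACCOUNT_HXRF_Covering
    ON dbo.C_B_ACCOUNT_HXRF (PKEY_sRC_OBJECT, ROWID_SYSTEM, HIST_CREATE_DATE)
    INCLUDE (HIST_UPDATE_DATE /* ...plus the remaining columns... */);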

    Your sort doesn't look to be a significant factor; it operates on the rows already retrieved by the table scan, 71 here.

    It takes 5-7 seconds to run after a few tries, but the first time this is run it takes 5-10 minutes. All the reads shown in your STATISTICS IO output are logical (read from memory, not disk). Based on this, I'm guessing SQL is reading all of the required data into cache on that first slow run. If you wanted to reduce that 5-10 minute wait, you could consider whether it is possible to improve disk performance, or increase the RAM. You could support this request by verifying that the relevant wait types are occurring while the query is executing.
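
    For example, while the slow first run is executing, something like this shows what it's waiting on; heavy PAGEIOLATCH_* waits there would support the disk/memory theory:

    -- What is the running session waiting on right now?
    -- session_id 52 is a placeholder; use the slow query's actual SPID.
    SELECT session_id, status, wait_type, wait_time, last_wait_type
    FROM sys.dm_exec_requests
    WHERE session_id = 52;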

    Does this query need to return live data, or can it be slightly stale? You could consider creating a copy of this table with a clustered primary key, copying the rows added since the last HIST_CREATE_DATE into that copy on a schedule, and querying the copy.
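
    A rough sketch of that load, assuming HIST_CREATE_DATE reliably marks new rows; the reporting table name is hypothetical, and you'd seed it with a one-time full copy first:

    -- Incremental load sketch; dbo.C_B_ACCOUNT_HXRF_Reporting is a
    -- hypothetical copy with the same columns plus a clustered primary key.
    DECLARE @LastLoaded datetime;

    SELECT @LastLoaded = MAX(HIST_CREATE_DATE)
    FROM dbo.C_B_ACCOUNT_HXRF_Reporting;

    INSERT INTO dbo.C_B_ACCOUNT_HXRF_Reporting
    SELECT *
    FROM dbo.C_B_ACCOUNT_HXRF
    WHERE HIST_CREATE_DATE > @LastLoaded;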

    Best,

    Andrew

  • My first observation is that there is no index to support this query even if the "table" were in good shape.  You could try the following indexes to see if one of them helps...

    -- Filtered index covering just the 'SAP' rows.
    CREATE INDEX By_PKEY_sRC_OBJECT_Composite01
    ON dbo.C_B_ACCOUNT_HXRF (PKEY_sRC_OBJECT)
    WHERE ROWID_SYSTEM = 'SAP'
    ;
    -- Same filtered index, but also carrying HIST_CREATE_DATE so the
    -- query can read that column from the index instead of the heap.
    CREATE INDEX By_PKEY_sRC_OBJECT_Composite02
    ON dbo.C_B_ACCOUNT_HXRF (PKEY_sRC_OBJECT)
    INCLUDE (HIST_CREATE_DATE)
    WHERE ROWID_SYSTEM = 'SAP'
    ;

    Not sure either will help, but the answer is always "No" unless you try, and the SELECT * in the query may prevent their use.  You might try an index usage hint (imperative) to go along with them.  Not normally a great idea, but sometimes you've gotta do what you've gotta do with 3rd party junk like this "table".
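
    If you do go the hint route, it looks like this; the SELECT list and predicate below are just stand-ins for whatever your actual query does:

    -- Table hint forcing the second index; last-resort stuff, and if the
    -- query's WHERE doesn't match the index filter, you'll get an error.
    SELECT *
    FROM dbo.C_B_ACCOUNT_HXRF WITH (INDEX (By_PKEY_sRC_OBJECT_Composite02))
    WHERE ROWID_SYSTEM = 'SAP';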

    According to the statistics output you provided, the table has 1,589,904 pages.  That translates to roughly 12.4 GB (1,589,904 pages / 128 pages per MB = 12,421 MB) for the total "table" size (since this was a scan).  The XML also states that the average row width is 28,639 bytes, and not all of that will be "Row Overflow" (stored out of row, similar to a large LOB).  So there are probably somewhere around 449 thousand rows in this table, but I'm not sure because I don't actually know the page density of the Heap pages.
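
    You can confirm those numbers straight from the DMVs instead of my back-of-the-napkin math; assuming the table name from your scripts:

    -- Page and row counts for the heap (index_id = 0 means the heap itself);
    -- 128 pages per MB because each page is 8 KB.
    SELECT used_page_count,
           used_page_count / 128 AS size_mb,
           row_count
    FROM sys.dm_db_partition_stats
    WHERE object_id = OBJECT_ID('dbo.C_B_ACCOUNT_HXRF')
      AND index_id  = 0;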

    Obviously, if the indexes don't actually help, you should drop them after testing.

    I can't make a recommendation for what a decent Clustered Index might be to prevent Page Density or Fragmentation problems because I don't know how this table is actually used.  For example, based on the name of the leading column (HIST_CREATE_DATE) in the Heap and the presence of certain start and end date type columns, it looks like this "table" should be an "INSERT/APPEND ONLY" type of table.

    With that in mind, can you verify whether it is an "INSERT/APPEND ONLY" type of history table or whether UPDATEs occur on this table?  If there are UPDATEs on the table, are they "ExpAnsive" on any of the variable-width columns?

    Also, since it's a 3rd party "table", are we allowed to add a Clustered Index and, perhaps, force some of the "Row Overflow" junk out of row so that queries that must scan will run a whole lot faster?
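
    For what it's worth, the out-of-row piece is just a table option, but be aware it only applies to the MAX data types, so whether it buys anything depends on the actual column definitions:

    -- Pushes varchar(max)/nvarchar(max)/varbinary(max)/xml values off the
    -- in-row pages for NEW and UPDATED rows only; existing rows don't move
    -- until updated, and varchar(n) row-overflow data isn't affected.
    EXEC sp_tableoption 'dbo.C_B_ACCOUNT_HXRF',
         'large value types out of row', 1;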

    BTW... you did a freakin' awesome job on providing the contents of the ZIP file.  It's the most complete post I've seen in the last 2 decades.  I wish everyone would get into the habit of doing that when they need help with a performance problem.  It saves on asking a bazillion questions.  Very well done, Sam!

    Just as a suggestion, if you add the page_count, avg_page_space_used_in_percent, record_count, avg_record_size_in_bytes, and (for Heaps) forwarded_record_count to the "fragmentation" graphic in the future, that would also help.
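
    All of those columns come from sys.dm_db_index_physical_stats in SAMPLED or DETAILED mode (LIMITED mode returns NULLs for most of them):

    -- DETAILED mode populates the record and page-space columns, but it
    -- reads the entire heap, so run this off-hours on a table this size.
    SELECT page_count,
           avg_page_space_used_in_percent,
           record_count,
           avg_record_size_in_bytes,
           forwarded_record_count
    FROM sys.dm_db_index_physical_stats
         (DB_ID(), OBJECT_ID('dbo.C_B_ACCOUNT_HXRF'), 0, NULL, 'DETAILED');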

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff.

  • Thanks Andrew. I will work with the developer.
