Tuning a CPU-intensive remote query

  • TheSQLGuru (5/3/2015)


    sql-lover (5/3/2015)


    TheSQLGuru (5/1/2015)


    sql-lover (5/1/2015)


    TheSQLGuru (5/1/2015)


    I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from the local box.

    Thanks for the reply.

    Maybe I am not following you, but the query is already inside a sproc. The sproc is called on Server2; it executes, connects to Server1 (where the table resides), and gives the result back. Are you saying that the SELECT statement should reside on Server1? If that's the case, that's another no-no. This is actually a major, major code redesign and simply is not possible.

    The code should reside, and execute, on the server where the table is. Can't see why it is a "major, major code redesign" to simply call a sproc that inside itself calls a sproc on a different server. That isn't a change to the calling mechanism at all. But if you can't do it for some reason then you can't.
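
    For what it's worth, a minimal sketch of the mechanism being described here, with hypothetical object names and assumed parameter types (the SELECT itself is the one from the original post; everything else is illustrative):

    -- On Server1 (where MyTable lives), a small sproc that does the SELECT locally:
    CREATE PROCEDURE dbo.usp_GetLatestMetric      -- hypothetical name
        @ZipCode varchar(10),                     -- assumed data types
        @DOB     date
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT TOP 1 @DOB, col2, col3
        FROM dbo.MyTable
        WHERE col4 = @ZipCode
          AND col1 = @DOB
        ORDER BY MetricDate DESC;
    END
    GO

    -- On Server2, inside the existing sproc, the 4-part SELECT becomes a remote EXEC
    -- (the linked server needs RPC Out enabled for this):
    INSERT INTO #t
    EXEC Server1.MyDatabase.dbo.usp_GetLatestMetric @ZipCode, @DOB;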

    Thousands of stored procedures to change, lots of dependencies. It is a major change. I know the app, you don't.

    I don't care about your app or its thousands of sprocs, nor do I need to know a thing about it other than it calls a sproc that does a remote server SELECT. I was addressing your singular request to address performance in that single sproc. And my solution CAN be done without a single change to anything other than the inside of the called sproc and making a simple sproc on the remote server that is called by that initial sproc. My solution carries a high likelihood of success - I have seen it over and over in almost 20 years of working with SQL Server. No redesign of any form required.

    Now, if you have tons of other cases where you have this same execution mechanism, those can also be addressed on an as-needed basis, again without any change to the underlying calling app.

    Signing off before we get too contentious. I truly wish you the best of luck in solving your performance problem. 🙂

    And if you don't care about the application design and what I stated or explained in my thread, then why did you answer? You seem to get easily offended when people disagree with you. Not the 1st time though (with others), and not the 1st time with me either. So how about considering "grey" instead of black or white when making suggestions to other people?

    I am telling you that, based on the app design, which I know and you don't (not trying to be offensive, just stating it as a fact), your suggestion requires a lot of code change, all over the place. I'm the DBA and I support the app; you don't.

    Also, if you read my previous post, I stated that I am looking (if possible) for a suggestion that requires no TSQL code change. Yours requires a TSQL code change.

    Granted, it may be that a TSQL code change will fix the issue for good (the majority of performance problems in MSSQL are in the TSQL code, or are faster to improve via code changes), but I am not interested in such a solution because my management team won't approve it, so I would basically be wasting my time pursuing one. What I am looking for, though, is an index suggestion, or even linked server tweaks (like those mentioned before), which require no TSQL code change.

    So thanks, but your suggestion does require a lot of code change and it's not the best solution for me.

  • There really isn't a way to fix this problem without a code change. The issue is that ALL the data is being brought across the pipe to filter down to a top 1. The BEST solution is to have the TOP executed on the remote server, which means either creating a stored procedure on the remote server or using OPENQUERY instead of 4-part naming (a sketch of the OPENQUERY approach follows the list below). Other possible solutions that might be faster are:

    1. Remove the TOP from the query, put the results in a temp table, and then query the temp table for the top 1.

    2. Try a CTE-based solution using ROW_NUMBER() OVER (ORDER BY MetricDate DESC). I don't think this will be faster, but it might be worth a try.
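
    For illustration, a rough sketch of the OPENQUERY variant (the remote-sproc variant is sketched earlier in the thread). OPENQUERY only accepts a literal string, so the parameter values below are placeholders; in practice the statement would be built as dynamic SQL, or EXEC (...) AT Server1 used instead:

    -- The TOP 1 and the WHERE run on Server1, so only one narrow row
    -- crosses the linked server. Sample literal values stand in for
    -- @ZipCode and @DOB.
    INSERT INTO #t
    SELECT @DOB, col2, col3
    FROM OPENQUERY(Server1,
        'SELECT TOP (1) col2, col3
         FROM MyDatabase.dbo.MyTable
         WHERE col4 = ''12345''
           AND col1 = ''1980-01-01''
         ORDER BY MetricDate DESC');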

    The closest you can come without a code change, and I'd consider index changes code changes, is to create a narrow covering index for this specific query, which you have already sort of tried. I'd create this index:

    CREATE INDEX IX_table_cols ON dbo.MyTable (col4, col1, MetricDate DESC) INCLUDE (col2, col3);

    The key differences in the index I'm suggesting are:

    1. Your new index doesn't have col1 in the key and you are using that in your WHERE clause, so you would need a lookup and then a filter. I might even make this the leading column if it is more selective than col4, which I think it might be.

    2. Your new index doesn't have MetricDate DESC, so the query still has to read the entire index to satisfy the TOP.

  • Jack Corbett (5/4/2015)


    There really isn't a way to fix this problem without a code change. The issue is that ALL the data is being brought across the pipe to filter down to a top 1. The BEST solution is to have the TOP executed on the remote server which means either creating a stored procedure on the remote server or using OPENQUERY instead of 4-part naming. Other possible solutions that might be faster are:

    1. Remove the TOP from the query, put the results in a temp table, and then query the temp table for the top 1.

    2. Try a CTE-based solution using ROW_NUMBER() OVER (ORDER BY MetricDate DESC). I don't think this will be faster, but it might be worth a try.

    The closest you can come without a code change, and I'd consider index changes code changes, is to create a narrow covering index for this specific query, which you have already sort of tried. I'd create this index:

    CREATE INDEX IX_table_cols ON dbo.MyTable (col4, col1, MetricDate DESC) INCLUDE (col2, col3);

    The key differences in the index I'm suggesting are:

    1. Your new index doesn't have col1 in the key and you are using that in your WHERE clause, so you would need a lookup and then a filter. I might even make this the leading column if it is more selective than col4, which I think it might be.

    2. Your new index doesn't have MetricDate DESC, so the query still has to read the entire index to satisfy the TOP.

    Thanks

    I'm about to deploy the new indexes this week (the final ones). After a lot of testing I'm close to picking one of these two, which is basically what you are suggesting.

    Option #1

    Key on col4 and MetricDate, with the other two columns as included columns

    or

    Option #2

    Key on col4, with the other two columns as included columns

    Option #1 keeps the TOP operator at zero cost, while #2 increases its cost to 57%. What's good, and obvious since this is a very simple query, is that with option #2 the Index Seek cost goes down to 43%. So I may pick this 2nd option as the final index. If the actual Index Seek cost is really low, I am assuming the impact of the sort operator will be minimal, compared to keeping it low but passing that "saving" to the Index Seek (option #1).
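
    For reference, a DDL sketch of those two candidates, using the table and column names from the thread (the index names are invented):

    -- Option #1: col4 + MetricDate in the key, so the TOP 1 ... ORDER BY
    -- MetricDate DESC can be answered without a sort.
    CREATE NONCLUSTERED INDEX IX_MyTable_col4_MetricDate   -- hypothetical name
        ON dbo.MyTable (col4, MetricDate DESC)
        INCLUDE (col2, col3);

    -- Option #2: col4 alone in the key; the matching rows must be sorted
    -- to satisfy the ORDER BY, which is where the extra TOP/Sort cost
    -- shows up. MetricDate is assumed to still live in the index (here as
    -- an INCLUDE) so the sort can be done without key lookups.
    CREATE NONCLUSTERED INDEX IX_MyTable_col4               -- hypothetical name
        ON dbo.MyTable (col4)
        INCLUDE (col2, col3, MetricDate);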

  • Sorry I didn't get to this sooner, but here is an untested shot of what you need.

    I am posting it as an attachment as I seem to be having issues posting it directly in this post.

  • sql-lover (5/1/2015)


    Hi,

    I am tuning the following CPU-intensive query, which is also a remote query inside a given stored procedure:

    INSERT INTO #t
    SELECT TOP 1 @DOB, col2, col3
    FROM Server1.MyDatabase.dbo.MyTable
    WHERE col4 = @ZipCode
      AND col1 = @DOB
    ORDER BY MetricDate DESC

    Below is the execution plan for that particular TSQL statement inside the stored procedure:

    And below is the "local" plan, when running that piece alone on the local server, which would be Server1:

    Now, there is a nasty index on MyTable that I am trying to improve; it is basically how I found out about this particular stored procedure, while checking that index. It is a very fat one; almost all columns are part of the index. This is leftover old work from when we did not have a DBA.

    Anyway... the keys for that fat index are:

    col4, col1, MetricDate, col2, col3 (in that order) ... with included columns col5, col6 (which are not used in this particular statement). And there is also a clustered index on col1.
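
    Rendered as DDL (the index names here are invented; only the column layout comes from the description above):

    -- The clustered index on col1:
    CREATE CLUSTERED INDEX CIX_MyTable_col1        -- hypothetical name
        ON dbo.MyTable (col1);

    -- The existing "fat" nonclustered index as described:
    CREATE NONCLUSTERED INDEX IX_MyTable_fat       -- hypothetical name
        ON dbo.MyTable (col4, col1, MetricDate, col2, col3)
        INCLUDE (col5, col6);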

    I managed to change the index in my lab to:

    col4, MetricDate, with col2 and col3 as included columns. Below is the new execution plan if that query is run locally:

    My question is ... what else can I do (without altering the SQL code) to reduce CPU utilization? It seems that the new index helps, but now the TOP operator cost went up, although it seems to run fast enough: less than a second locally.

    When running remotely, or inside the actual sproc, the average execution time is around 9 seconds, but I've seen clients at 20, and CPU is about 90% or more of that cost. It does not sound like much, but this query/stored procedure is common across hundreds of clients, which share the same schema. It hits MyTable, which lives in what is essentially our central database, the one that keeps metadata for the clients.

    All SQL boxes run on VMware and reside on the same subnet.

    I am curious. What are the relative costs of those index seeks? Just gonna skip past that remote query for now because you can't do much with that anyway.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • You keep dropping col1 from your index for some reason...

    Another workaround for the ineffective TOP would be rewriting the queries as:

    SELECT ... FROM Server1.MyDatabase.LinkedServer.MyTable (a separate schema instead of dbo), and creating views in that schema which filter the top rows only.
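
    If I'm reading the suggestion right, it means creating an extra schema on Server1 (here literally named LinkedServer, as in the 4-part name above) whose views already boil the table down, for example to the latest row per (col4, col1) pair, so much less data crosses the wire. A loose sketch of that interpretation; note that a plain view cannot accept @ZipCode/@DOB as parameters, so the per-key TOP-1 filtering has to be baked into the view itself:

    -- On Server1: schema plus a view that keeps only the most recent row
    -- per (col4, col1) combination. Names and partitioning are one
    -- interpretation of the suggestion, not code posted in the thread.
    CREATE SCHEMA LinkedServer;
    GO
    CREATE VIEW LinkedServer.MyTable
    AS
    SELECT col1, col2, col3, col4, MetricDate
    FROM (
        SELECT col1, col2, col3, col4, MetricDate,
               ROW_NUMBER() OVER (PARTITION BY col4, col1
                                  ORDER BY MetricDate DESC) AS rn
        FROM dbo.MyTable
    ) AS latest
    WHERE rn = 1;
    GO

    -- On Server2 the statement keeps its shape; only the schema changes:
    -- SELECT @DOB, col2, col3
    -- FROM   Server1.MyDatabase.LinkedServer.MyTable
    -- WHERE  col4 = @ZipCode AND col1 = @DOB;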

  • SQLRNNR (5/4/2015)


    I am curious. What are the relative costs of those index seeks? Just gonna skip past that remote query for now because you can't do much with that anyway.

    So funny... I Googled for a proper way to rename (or not rename) sprocs and found an article by you, lol ... about the effect of sp_rename on the system catalog. Different issue though.

    Anyway, in the original plan the cost is 100% of the plan (for that particular statement). With the new index, it dropped to 40% or so, passing the remaining cost to the TOP operator.

    By the way, I can't remove the TOP because of the query's logic. I need the most recent record after getting the set; the TOP accomplishes that.

  • sql-lover (5/4/2015)


    SQLRNNR (5/4/2015)


    I am curious. What are the relative costs of those index seeks? Just gonna skip past that remote query for now because you can't do much with that anyway.

    So funny... I Googled for a proper way to rename (or not rename) sprocs and found an article by you, lol ... about the effect of sp_rename on the system catalog. Different issue though.

    Anyway, in the original plan the cost is 100% of the plan (for that particular statement). With the new index, it dropped to 40% or so, passing the remaining cost to the TOP operator.

    By the way, I can't remove the TOP because of the query's logic. I need the most recent record after getting the set; the TOP accomplishes that.

    What I was looking for was the cost of the operators (and the entire plan) from before and after rather than the relative cost (or percent). That way I could compare before and after costs to see the impact.

    If you hover over the operator it will give you a numeric cost. This is useful for comparing before and after tuning, but only on the server in question, because those values could mean something entirely different on a different box.
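
    One generic way to capture absolute before/after numbers (a measurement sketch, not something posted in the thread) is to run the statement directly on Server1 with statistics switched on and compare CPU time, elapsed time, and logical reads under each index. It measures actual execution cost rather than the optimizer's estimate, but it serves the same comparison purpose:

    -- Compare the Messages-tab output across the two index versions.
    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    DECLARE @ZipCode varchar(10) = '12345',    -- assumed types and sample values
            @DOB     date        = '1980-01-01';

    SELECT TOP 1 @DOB, col2, col3
    FROM dbo.MyTable                           -- run locally on Server1
    WHERE col4 = @ZipCode
      AND col1 = @DOB
    ORDER BY MetricDate DESC;

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;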

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events
