Tuning CPU intensive remote query

  • Hi,

    I am tuning the following CPU intensive query, which is also a remote query for a given store procedure:

    INSERT INTO #t

    SELECT top 1 @DOB, col2, col3

    FROM Server1.MyDatabase.dbo.MyTable

    WHERE col4 = @ZipCode

    AND col1 = @DOB

    ORDER BY MetricDate DESC

    Below is the execution plan for that particular TSQL statement inside the store procedure:

    And below is the "local" plan, when running that piece alone, in the local server, which would be Server1:

    Now, there is a nasty Index on MyTable that I am trying to improve; it is basically how I found about this particular store procedure, while checking that Index. It is a very fat one, almost all columns are part of the Index. This is a leftover or old work when we did not have a DBA.

    Anyway...the keys for that fat Index are:

    col4, col1, MetricDate , col2, col3 (that order) ... with included columns on col5, col6 (which are not being used on this particular statement). And there is also a Cluster Index on col1.

    I managed to change the Index on my lab to:

    col4, MetricDate with col2 and col3 as Included columns. Below is the new execution plan if that query is run locally:

    My question is ... what else can I do (without altering the SQL code) to reduce CPU utilization? It seems that the new Index helps, but now the TOP operator cost went up, although, it seems to run fast enough, less than a second locally.

    When running remotely or inside the actual sproc, The avg execution time is around 9 seconds but I've seen clients at 20. And CPU cost is about 90% of more of that. It does not sound like much, but this query/store procedure is common on hundreds of clients, which shared same schema. It runs or hits MyTable which is kind of our central database which keeps metadata for the clients.

    All SQL boxes run on VMware and reside on same subnet.

  • What is the security context the linked server is using? Unless that user has pretty elevated permissions it can't access the statistics in order to find the best execution plan (apparently fixed in 2012 SP1). See Tom LaRock's post about Linked Server performance[/url].

  • sql-lover (5/1/2015)


    Hi,

    I am tuning the following CPU intensive query, which is also a remote query for a given store procedure:

    INSERT INTO #t

    SELECT top 1 @DOB, col2, col3

    FROM Server1.MyDatabase.dbo.MyTable

    WHERE col4 = @ZipCode

    AND col1 = @DOB

    ORDER BY MetricDate DESC

    Below is the execution plan for that particular TSQL statement inside the store procedure:

    And below is the "local" plan, when running that piece alone, in the local server, which would be Server1:

    Now, there is a nasty Index on MyTable that I am trying to improve; it is basically how I found about this particular store procedure, while checking that Index. It is a very fat one, almost all columns are part of the Index. This is a leftover or old work when we did not have a DBA.

    Anyway...the keys for that fat Index are:

    col4, col1, MetricDate , col2, col3 (that order) ... with included columns on col5, col6 (which are not being used on this particular statement). And there is also a Cluster Index on col1.

    I managed to change the Index on my lab to:

    col4, MetricDate with col2 and col3 as Included columns. Below is the new execution plan if that query is run locally:

    My question is ... what else can I do (without altering the SQL code) to reduce CPU utilization? It seems that the new Index helps, but now the TOP operator cost went up, although, it seems to run fast enough, less than a second locally.

    When running remotely or inside the actual sproc, The avg execution time is around 9 seconds but I've seen clients at 20. And CPU cost is about 90% of more of that. It does not sound like much, but this query/store procedure is common on hundreds of clients, which shared same schema. It runs or hits MyTable which is kind of our central database which keeps metadata for the clients.

    All SQL boxes run on VMware and reside on same subnet.

    Give the following a try:

    INSERT INTO #t

    SELECT DOB, col2, col3 FROM OPENQUERY([Server1],'

    SELECT top 1 @DOB, col2, col3

    FROM MyDatabase.dbo.MyTable

    WHERE col4 = @ZipCode

    AND col1 = @DOB

    ORDER BY MetricDate DESC')dt(DOB, col2, col3);

  • Jack Corbett (5/1/2015)


    What is the security context the linked server is using? Unless that user has pretty elevated permissions it can't access the statistics in order to find the best execution plan (apparently fixed in 2012 SP1). See Tom LaRock's post about Linked Server performance[/url].

    Scratch that one....

  • Lynn Pettis (5/1/2015)


    Give the following a try:

    INSERT INTO #t

    SELECT DOB, col2, col3 FROM OPENQUERY([Server1],'

    SELECT top 1 @DOB, col2, col3

    FROM MyDatabase.dbo.MyTable

    WHERE col4 = @ZipCode

    AND col1 = @DOB

    ORDER BY MetricDate DESC')dt(DOB, col2, col3);

    Well... that's changing the code, which I am trying to avoid. But I can check on my lab... what about the variables? should I declare those inside or outside?

  • sql-lover (5/1/2015)


    Lynn Pettis (5/1/2015)


    Give the following a try:

    INSERT INTO #t

    SELECT DOB, col2, col3 FROM OPENQUERY([Server1],'

    SELECT top 1 @DOB, col2, col3

    FROM MyDatabase.dbo.MyTable

    WHERE col4 = @ZipCode

    AND col1 = @DOB

    ORDER BY MetricDate DESC')dt(DOB, col2, col3);

    Well... that's changing the code, which I am trying to avoid. But I can check on my lab... what about the variables? should I declare those inside or outside?

    Actually, as is it won't work. I forgot about the variable. It has to be made dynamic and I can't remember how to make this work off the top of my head. I'll need a little time to work it, but it can be done.

    I'll work it when I come back from the lab as I need to test some code I am working on.

  • I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (5/1/2015)


    I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.

    Thanks for reply.

    Maybe I am not following you, but the query is already inside an sproc. The sproc is called on Server2, then is executed, connect to Server1 (where the table resides) and give the result back. Are you saying that the select statement should reside on Server1? If that's the case, that's another no, no. This is actually a major, major code redesign and simply is not possible.

  • Lynn Pettis (5/1/2015)


    sql-lover (5/1/2015)


    Lynn Pettis (5/1/2015)


    Give the following a try:

    INSERT INTO #t

    SELECT DOB, col2, col3 FROM OPENQUERY([Server1],'

    SELECT top 1 @DOB, col2, col3

    FROM MyDatabase.dbo.MyTable

    WHERE col4 = @ZipCode

    AND col1 = @DOB

    ORDER BY MetricDate DESC')dt(DOB, col2, col3);

    Well... that's changing the code, which I am trying to avoid. But I can check on my lab... what about the variables? should I declare those inside or outside?

    Actually, as is it won't work. I forgot about the variable. It has to be made dynamic and I can't remember how to make this work off the top of my head. I'll need a little time to work it, but it can be done.

    I'll work it when I come back from the lab as I need to test some code I am working on.

    I am rusty on that too, maybe this can help me a bit: http://stackoverflow.com/questions/3378496/including-parameters-in-openquery but is another TSQL code change... but I can test it ...

  • I've started to think that is because MyTable is a "hot spot". Hundreds of clients hitting the same table, to get a different set and return that back to Server2, Server3, etc, from Server1, is resource intensive.

  • sql-lover (5/1/2015)


    TheSQLGuru (5/1/2015)


    I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.

    Thanks for reply.

    Maybe I am not following you, but the query is already inside an sproc. The sproc is called on Server2, then is executed, connect to Server1 (where the table resides) and give the result back. Are you saying that the select statement should reside on Server1? If that's the case, that's another no, no. This is actually a major, major code redesign and simply is not possible.

    The code should reside, and execute, on the server where the table is. Can't see why it is a "major, major code redesign" to simply call a sproc that inside itself calls a sproc on a different server. That isn't a change to the calling mechanism at all. But if you can't do it for some reason then you can't.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Have you checked the plan that is being used on Server1 when the query is run remotely like that from Server2?

    If it is taking 20 seconds, you should be able to view the execution plan using sp_whoisactive (simplest way I know - other than if you have your own script) on Server1.

    So many times I have seen a simple query turn into a nightmare just because it involves a linked server, I had one the other day where the "local" plan would have been a quick index seek, but for some reason, when remoted it turned into a full table scan and it wasn't permissions. Fixing it was easy, but did require code changes.

    I think usually the problems come from collation mismatches and/or one or more predicates being off-index, and I can't think of a single one I've had that could be fixed without code or schema changes.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • TheSQLGuru (5/1/2015)


    sql-lover (5/1/2015)


    TheSQLGuru (5/1/2015)


    I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.

    Thanks for reply.

    Maybe I am not following you, but the query is already inside an sproc. The sproc is called on Server2, then is executed, connect to Server1 (where the table resides) and give the result back. Are you saying that the select statement should reside on Server1? If that's the case, that's another no, no. This is actually a major, major code redesign and simply is not possible.

    The code should reside, and execute, on the server where the table is. Can't see why it is a "major, major code redesign" to simply call a sproc that inside itself calls a sproc on a different server. That isn't a change to the calling mechanism at all. But if you can't do it for some reason then you can't.

    Thousands of store procedures to change, lot of dependencies. It is a major change. I know the app, you don't.

  • sql-lover (5/3/2015)


    TheSQLGuru (5/1/2015)


    sql-lover (5/1/2015)


    TheSQLGuru (5/1/2015)


    I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.

    Thanks for reply.

    Maybe I am not following you, but the query is already inside an sproc. The sproc is called on Server2, then is executed, connect to Server1 (where the table resides) and give the result back. Are you saying that the select statement should reside on Server1? If that's the case, that's another no, no. This is actually a major, major code redesign and simply is not possible.

    The code should reside, and execute, on the server where the table is. Can't see why it is a "major, major code redesign" to simply call a sproc that inside itself calls a sproc on a different server. That isn't a change to the calling mechanism at all. But if you can't do it for some reason then you can't.

    Thousands of store procedures to change, lot of dependencies. It is a major change. I know the app, you don't.

    I don't care about your app or it's thousands of sprocs, nor do I need to know a thing about it other than it calls a sproc that does a remote server SELECT. I was addressing your singular request to address performance in that single sproc. And my solution CAN be done without a single change to anything other than the inside of the called sproc and making a simple sproc on the remote server that is called by that initial sproc. My solution carries a high likelyhood of success - I have seen it over and over in almost 20 years of working with SQL Server. No redesign of any form required.

    Now, if you have tons of other cases where have this same execution mechanism those can also be addressed on an as-needed basis, again without any change to the underlying calling app.

    Signing off before we get too contentious. I truly wish you the best of luck in solving your performance problem. 🙂

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • you probably need to rewrite the query and remove "TOP" clause - it executed on local server, not remote, so you pulling entire resultset between boxes.

    simplest way

    Declare @MetricDate smalldatetime

    Select @MetricDate = MAX(MetricDate) from Server1.MyDatabase.dbo.MyTable where col4 = @ZipCol AND col1 = @DOB

    SELECT @DOB, col2, col3

    FROM Server1.MyDatabase.dbo.MyTable

    WHERE col4 = @ZipCode

    AND col1 = @DOB

    AND col5 = @MetricDate

    best way - OPENQUERY per suggestion above

    index should be by col4,col1,MetricDate - your modification misses col1

  • Viewing 15 posts - 1 through 15 (of 22 total)

    You must be logged in to reply to this topic. Login to reply