Tuning CPU intensive remote query

Question

Tuning CPU intensive remote query

sql-lover

SSCoach

Points: 18530
More actions
May 1, 2015 at 12:37 pm

#303586

Hi,
I am tuning the following CPU intensive query, which is also a remote query for a given store procedure:
INSERT INTO #t
SELECT top 1 @DOB, col2, col3
FROM Server1.MyDatabase.dbo.MyTable
WHERE col4 = @ZipCode
AND col1 = @DOB
ORDER BY MetricDate DESC
Below is the execution plan for that particular TSQL statement inside the store procedure:
And below is the "local" plan, when running that piece alone, in the local server, which would be Server1:
Now, there is a nasty Index on MyTable that I am trying to improve; it is basically how I found about this particular store procedure, while checking that Index. It is a very fat one, almost all columns are part of the Index. This is a leftover or old work when we did not have a DBA.
Anyway...the keys for that fat Index are:
col4, col1, MetricDate , col2, col3 (that order) ... with included columns on col5, col6 (which are not being used on this particular statement). And there is also a Cluster Index on col1.
I managed to change the Index on my lab to:
col4, MetricDate with col2 and col3 as Included columns. Below is the new execution plan if that query is run locally:
My question is ... what else can I do (without altering the SQL code) to reduce CPU utilization? It seems that the new Index helps, but now the TOP operator cost went up, although, it seems to run fast enough, less than a second locally.
When running remotely or inside the actual sproc, The avg execution time is around 9 seconds but I've seen clients at 20. And CPU cost is about 90% of more of that. It does not sound like much, but this query/store procedure is common on hundreds of clients, which shared same schema. It runs or hits MyTable which is kind of our central database which keeps metadata for the clients.
All SQL boxes run on VMware and reside on same subnet.
Forum Etiquette: How to post Reporting Services problems

[/url]

Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
[/url]

How to Post Performance Problems - by Gail Shaw
[/url]

Jack Corbett SSC Guru Points: 184394 More actions · Answer 1

What is the security context the linked server is using? Unless that user has pretty elevated permissions it can't access the statistics in order to find the best execution plan (apparently fixed in 2012 SP1). See Tom LaRock's post about Linked Server performance[/url].

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 2

sql-lover (5/1/2015)
Hi,
I am tuning the following CPU intensive query, which is also a remote query for a given store procedure:
INSERT INTO #t
SELECT top 1 @DOB, col2, col3
FROM Server1.MyDatabase.dbo.MyTable
WHERE col4 = @ZipCode
AND col1 = @DOB
ORDER BY MetricDate DESC
Below is the execution plan for that particular TSQL statement inside the store procedure:
And below is the "local" plan, when running that piece alone, in the local server, which would be Server1:
Now, there is a nasty Index on MyTable that I am trying to improve; it is basically how I found about this particular store procedure, while checking that Index. It is a very fat one, almost all columns are part of the Index. This is a leftover or old work when we did not have a DBA.
Anyway...the keys for that fat Index are:
col4, col1, MetricDate , col2, col3 (that order) ... with included columns on col5, col6 (which are not being used on this particular statement). And there is also a Cluster Index on col1.
I managed to change the Index on my lab to:
col4, MetricDate with col2 and col3 as Included columns. Below is the new execution plan if that query is run locally:
My question is ... what else can I do (without altering the SQL code) to reduce CPU utilization? It seems that the new Index helps, but now the TOP operator cost went up, although, it seems to run fast enough, less than a second locally.
When running remotely or inside the actual sproc, The avg execution time is around 9 seconds but I've seen clients at 20. And CPU cost is about 90% of more of that. It does not sound like much, but this query/store procedure is common on hundreds of clients, which shared same schema. It runs or hits MyTable which is kind of our central database which keeps metadata for the clients.
All SQL boxes run on VMware and reside on same subnet.

Give the following a try:

INSERT INTO #t

SELECT DOB, col2, col3 FROM OPENQUERY([Server1],'

SELECT top 1 @DOB, col2, col3

FROM MyDatabase.dbo.MyTable

WHERE col4 = @ZipCode

AND col1 = @DOB

ORDER BY MetricDate DESC')dt(DOB, col2, col3);

sql-lover SSCoach Points: 18530 More actions · Answer 3

Jack Corbett (5/1/2015)
What is the security context the linked server is using? Unless that user has pretty elevated permissions it can't access the statistics in order to find the best execution plan (apparently fixed in 2012 SP1). See Tom LaRock's post about Linked Server performance[/url].

Scratch that one....

sql-lover SSCoach Points: 18530 More actions · Answer 4

Lynn Pettis (5/1/2015)
Give the following a try:
INSERT INTO #t
SELECT DOB, col2, col3 FROM OPENQUERY([Server1],'
SELECT top 1 @DOB, col2, col3
FROM MyDatabase.dbo.MyTable
WHERE col4 = @ZipCode
AND col1 = @DOB
ORDER BY MetricDate DESC')dt(DOB, col2, col3);

Well... that's changing the code, which I am trying to avoid. But I can check on my lab... what about the variables? should I declare those inside or outside?

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 5

sql-lover (5/1/2015)
Lynn Pettis (5/1/2015)
Give the following a try:
INSERT INTO #t
SELECT DOB, col2, col3 FROM OPENQUERY([Server1],'
SELECT top 1 @DOB, col2, col3
FROM MyDatabase.dbo.MyTable
WHERE col4 = @ZipCode
AND col1 = @DOB
ORDER BY MetricDate DESC')dt(DOB, col2, col3);
Well... that's changing the code, which I am trying to avoid. But I can check on my lab... what about the variables? should I declare those inside or outside?

Actually, as is it won't work. I forgot about the variable. It has to be made dynamic and I can't remember how to make this work off the top of my head. I'll need a little time to work it, but it can be done.

I'll work it when I come back from the lab as I need to test some code I am working on.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 6

I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

sql-lover SSCoach Points: 18530 More actions · Answer 7

TheSQLGuru (5/1/2015)
I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.

Thanks for reply.

Maybe I am not following you, but the query is already inside an sproc. The sproc is called on Server2, then is executed, connect to Server1 (where the table resides) and give the result back. Are you saying that the select statement should reside on Server1? If that's the case, that's another no, no. This is actually a major, major code redesign and simply is not possible.

sql-lover SSCoach Points: 18530 More actions · Answer 8

Lynn Pettis (5/1/2015)
sql-lover (5/1/2015)
Lynn Pettis (5/1/2015)
Give the following a try:
INSERT INTO #t
SELECT DOB, col2, col3 FROM OPENQUERY([Server1],'
SELECT top 1 @DOB, col2, col3
FROM MyDatabase.dbo.MyTable
WHERE col4 = @ZipCode
AND col1 = @DOB
ORDER BY MetricDate DESC')dt(DOB, col2, col3);
Well... that's changing the code, which I am trying to avoid. But I can check on my lab... what about the variables? should I declare those inside or outside?
Actually, as is it won't work. I forgot about the variable. It has to be made dynamic and I can't remember how to make this work off the top of my head. I'll need a little time to work it, but it can be done.
I'll work it when I come back from the lab as I need to test some code I am working on.

I am rusty on that too, maybe this can help me a bit: http://stackoverflow.com/questions/3378496/including-parameters-in-openquery but is another TSQL code change... but I can test it ...

sql-lover SSCoach Points: 18530 More actions · Answer 9

I've started to think that is because MyTable is a "hot spot". Hundreds of clients hitting the same table, to get a different set and return that back to Server2, Server3, etc, from Server1, is resource intensive.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 10

sql-lover (5/1/2015)
TheSQLGuru (5/1/2015)
I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.
Thanks for reply.
Maybe I am not following you, but the query is already inside an sproc. The sproc is called on Server2, then is executed, connect to Server1 (where the table resides) and give the result back. Are you saying that the select statement should reside on Server1? If that's the case, that's another no, no. This is actually a major, major code redesign and simply is not possible.

The code should reside, and execute, on the server where the table is. Can't see why it is a "major, major code redesign" to simply call a sproc that inside itself calls a sproc on a different server. That isn't a change to the calling mechanism at all. But if you can't do it for some reason then you can't.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

mister.magoo SSC-Forever Points: 47174 More actions · Answer 11

Have you checked the plan that is being used on Server1 when the query is run remotely like that from Server2?

If it is taking 20 seconds, you should be able to view the execution plan using sp_whoisactive (simplest way I know - other than if you have your own script) on Server1.

So many times I have seen a simple query turn into a nightmare just because it involves a linked server, I had one the other day where the "local" plan would have been a quick index seek, but for some reason, when remoted it turned into a full table scan and it wasn't permissions. Fixing it was easy, but did require code changes.

I think usually the problems come from collation mismatches and/or one or more predicates being off-index, and I can't think of a single one I've had that could be fixed without code or schema changes.

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

sql-lover SSCoach Points: 18530 More actions · Answer 12

TheSQLGuru (5/1/2015)
sql-lover (5/1/2015)
TheSQLGuru (5/1/2015)
I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.
Thanks for reply.
Maybe I am not following you, but the query is already inside an sproc. The sproc is called on Server2, then is executed, connect to Server1 (where the table resides) and give the result back. Are you saying that the select statement should reside on Server1? If that's the case, that's another no, no. This is actually a major, major code redesign and simply is not possible.
The code should reside, and execute, on the server where the table is. Can't see why it is a "major, major code redesign" to simply call a sproc that inside itself calls a sproc on a different server. That isn't a change to the calling mechanism at all. But if you can't do it for some reason then you can't.

Thousands of store procedures to change, lot of dependencies. It is a major change. I know the app, you don't.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 13

sql-lover (5/3/2015)
TheSQLGuru (5/1/2015)
sql-lover (5/1/2015)
TheSQLGuru (5/1/2015)
I didn't read deeply (TL;DR) :), but the first thing I would do is put the query into a sproc on the remote side and call that from local box.
Thanks for reply.
Maybe I am not following you, but the query is already inside an sproc. The sproc is called on Server2, then is executed, connect to Server1 (where the table resides) and give the result back. Are you saying that the select statement should reside on Server1? If that's the case, that's another no, no. This is actually a major, major code redesign and simply is not possible.
The code should reside, and execute, on the server where the table is. Can't see why it is a "major, major code redesign" to simply call a sproc that inside itself calls a sproc on a different server. That isn't a change to the calling mechanism at all. But if you can't do it for some reason then you can't.
Thousands of store procedures to change, lot of dependencies. It is a major change. I know the app, you don't.

I don't care about your app or it's thousands of sprocs, nor do I need to know a thing about it other than it calls a sproc that does a remote server SELECT. I was addressing your singular request to address performance in that single sproc. And my solution CAN be done without a single change to anything other than the inside of the called sproc and making a simple sproc on the remote server that is called by that initial sproc. My solution carries a high likelyhood of success - I have seen it over and over in almost 20 years of working with SQL Server. No redesign of any form required.

Now, if you have tons of other cases where have this same execution mechanism those can also be addressed on an as-needed basis, again without any change to the underlying calling app.

Signing off before we get too contentious. I truly wish you the best of luck in solving your performance problem. 🙂

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

sqlExperts.org Valued Member Points: 60 More actions · Answer 14

you probably need to rewrite the query and remove "TOP" clause - it executed on local server, not remote, so you pulling entire resultset between boxes.

simplest way

Declare @MetricDate smalldatetime

Select @MetricDate = MAX(MetricDate) from Server1.MyDatabase.dbo.MyTable where col4 = @ZipCol AND col1 = @DOB

SELECT @DOB, col2, col3

FROM Server1.MyDatabase.dbo.MyTable

WHERE col4 = @ZipCode

AND col1 = @DOB

AND col5 = @MetricDate

best way - OPENQUERY per suggestion above

index should be by col4,col1,MetricDate - your modification misses col1