need help with query speed.

Question

Post reply

need help with query speed.

diat150

SSC Journeyman

Points: 96
More actions
April 30, 2012 at 6:24 am

#260000

does anyone have any suggestions to improve the query below?
SELECT
dh.deviceid, dh.archived, lastpoll, esttemperature, esttemperature_alarm, staticpressure, staticpressure_alarm, differentialpressure, differentialpressure_alarm, backflowtime,
energy, tubingpressure, tubingpressure_alarm, casingpressure, casingpressure_alarm, gasspotflowrate, gasspotflowrate_alarm, battery, battery_alarm,
orificesize, CASE WHEN DATEADD(hour, -6, getDate()) > lastpoll THEN 1 ELSE 0 END as lastpolltime_alm, contracthr,
CASE WHEN flags > 0 THEN 1 ELSE 0 END, week_mcf_avg, month_mcf_avg, (gasspotflowrate - week_mcf_avg) as week_mcf_avg_diff
FROM devicehistory dh
LEFT JOIN (SELECT deviceid, archived, AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg, AVG(d2.estgasvolmcf) AS month_mcf_avg FROM devicehistory d2 WHERE recorddate > DATEADD(day, -30, getDate()) GROUP BY deviceid, archived) d30 ON d30.deviceid = dh.deviceid AND d30.archived = dh.archived
LEFT JOIN (SELECT deviceid, COUNT(id) AS flags FROM memos WHERE DATEADD(day, 15, time) > getDate() GROUP BY deviceid) n ON n.deviceid = dh.deviceid
WHERE dh.deviceid IN (1485, 1486, 1487, 1488) AND CONVERT(DATE, recorddate) = '2012-04-25'
This list of deviceid's can have up to 1600 unique id's in it, so this is alot of data that has to be processed
WHERE dh.deviceid IN (1485, 1486, 1487, 1488)
this is the part that is slowing the query down:
LEFT JOIN (SELECT deviceid, archived, AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg, AVG(d2.estgasvolmcf) AS month_mcf_avg FROM devicehistory d2 WHERE recorddate > DATEADD(day, -30, getDate()) GROUP BY deviceid, archived) d30 ON d30.deviceid = dh.deviceid AND d30.archived = dh.archived
indexes on the devicehistory table are as follows:
primary key = id
recorddate ascending = non unique, non clustered
datesubmitted ascending = non unique, non clustered
archived ascending = non unique, non clustered
deviceid ascending = non unique, non clustered
the devicehistory table has 2 rows per day per device. 1 is archived and 1 is non archived.
I can provide more info if needed. I am pretty new to sql and would appreciate any help!

Viewing 15 posts - 1 through 15 (of 33 total)

You must be logged in to reply to this topic. Login to reply

diat150 SSC Journeyman Points: 96 More actions · Answer 1

diat150

SSC Journeyman

Points: 96

April 30, 2012 at 7:11 pm

#1481563

bump. can anyone point me in the right direction?

vicdileo SSC Veteran Points: 293 More actions · Answer 2

Hi diat150,

Looks complicated... hopefully if you can gather some clues for me I can help out.

Please run the SET SHOWPLAN_ALL ON command followed by your query. For example:

SET SHOWPLAN_ALL ON;

SELECT dh.deviceid, dh.archived, lastpoll,...

This will generate a query execution plan.

1) Copy the plan and post it here.

Then, run SET SHOWPLAN_ALL OFF; so you won't have to stare at query execution plans all day.

In addition to item 1, please answer as many of the following questions as you can...

How is this query being run:

2) As a stored procedure?

3) AS embedded SQL in a C# program or Java, etc.?

4) Is this dynamic SQL?

5) When was the last time you ran update statistics on your tables & indexes?

6) Please provide a row count for each of the tables involved in the query.

Answer the questions as best you can and post the query plan and let's figure out what's going on.

Let me know if you need clarification on any of the above items.

- victor di leo

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 3

The query looks ok, so I'd focus on the data and your index density. Here's a blog post I wrote about it. I found that determining if you're looking at primarily inclusive or exclusive filtering changed the useful density settings of the index.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 4

Looking at the execution plan, you're returning 20k rows from the spatial index and then filtering that down to three values. Performance is going to stink. You need a better way to filter more rows. The TOP operation is taking 200k to only return 3. That's where I'd focus first, then worry about the spatial data and whether or not it's doing what you need.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Lowell SSC Guru Points: 323755 More actions · Answer 5

two issues i see:

two different WHERE statements have a function on a datetime column...

WHERE DATEADD(day, 15, time) > GETDATE()

--and later:

AND CONVERT(DATE, recorddate) = '2012-04-25'

this requires SQL to do a table scan to get those datetime values, convert ALL the values in the table, and then do the compare.

you can change it so it can use an index (if an index exists on these two columns, and get better performance as well:

WHERE [time] > DATEADD(day, -15, GETDATE())

--and later:

AND recorddate >= '2012-04-25 00:00:00.000'

AND recorddate < '2012-04-26 00:00:00.000

here's my reformatted version i think will work

SELECT dh.deviceid,

dh.archived,

lastpoll,

esttemperature,

esttemperature_alarm,

staticpressure,

staticpressure_alarm,

differentialpressure,

differentialpressure_alarm,

backflowtime,

energy,

tubingpressure,

tubingpressure_alarm,

casingpressure,

casingpressure_alarm,

gasspotflowrate,

gasspotflowrate_alarm,

battery,

battery_alarm,

orificesize,

CASE

WHEN DATEADD(hour, -6, GETDATE()) > lastpoll

THEN 1

ELSE 0

END AS lastpolltime_alm,

contracthr,

CASE

WHEN flags > 0

THEN 1

ELSE 0

END,

week_mcf_avg,

month_mcf_avg,

( gasspotflowrate - week_mcf_avg ) AS week_mcf_avg_diff

FROM devicehistory dh

LEFT JOIN (SELECT deviceid,

archived,

AVG(CASE

WHEN recorddate > DATEADD(day, -7, GETDATE())

THEN d2.estgasvolmcf

ELSE NULL

END) AS week_mcf_avg,

AVG(d2.estgasvolmcf) AS month_mcf_avg

FROM devicehistory d2

WHERE recorddate > DATEADD(day, -30, GETDATE())

GROUP BY

deviceid,

archived) d30 ON d30.deviceid = dh.deviceid

AND d30.archived = dh.archived

LEFT JOIN (SELECT deviceid,

COUNT(id) AS flags

FROM memos

--no functions on the table column in WHERE statments:

--WHERE DATEADD(day, 15, time) > GETDATE()

WHERE [time] > DATEADD(day, -15, GETDATE())

GROUP BY deviceid) n ON n.deviceid = dh.deviceid

WHERE dh.deviceid IN ( 1485, 1486, 1487, 1488 )

--no functions on the table column in WHERE statments:

--AND CONVERT(DATE, recorddate) = '2012-04-25'

AND recorddate >= '2012-04-25 00:00:00.000'

AND recorddate < '2012-04-26 00:00:00.000

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

diat150 SSC Journeyman Points: 96 More actions · Answer 6

1) Copy the plan and post it here.

attached

How is this query being run:

2) As a stored procedure?

no

3) AS embedded SQL in a C# program or Java, etc.?

yes, java

4) Is this dynamic SQL?

never heard of this, but I dont think so from a quick google search

5) When was the last time you ran update statistics on your tables & indexes?

I havent done this. Like I said, i am pretty new to sql outside of running basic queries

6) Please provide a row count for each of the tables involved in the query.

devicehistory = 758137

devices 1625

memos 7969

Thanks for the help

diat150 SSC Journeyman Points: 96 More actions · Answer 7

Lowell (5/1/2012)
two issues i see:
two different WHERE statements have a function on a datetime column...
WHERE DATEADD(day, 15, time) > GETDATE()
--and later:
AND CONVERT(DATE, recorddate) = '2012-04-25'
this requires SQL to do a table scan to get those datetime values, convert ALL the values in the table, and then do the compare.
you can change it so it can use an index (if an index exists on these two columns, and get better performance as well:
WHERE [time] > DATEADD(day, -15, GETDATE())
--and later:
AND recorddate >= '2012-04-25 00:00:00.000'
AND recorddate < '2012-04-26 00:00:00.000
here's my reformatted version i think will work
SELECT dh.deviceid,
dh.archived,
lastpoll,
esttemperature,
esttemperature_alarm,
staticpressure,
staticpressure_alarm,
differentialpressure,
differentialpressure_alarm,
backflowtime,
energy,
tubingpressure,
tubingpressure_alarm,
casingpressure,
casingpressure_alarm,
gasspotflowrate,
gasspotflowrate_alarm,
battery,
battery_alarm,
orificesize,
CASE
WHEN DATEADD(hour, -6, GETDATE()) > lastpoll
THEN 1
ELSE 0
END AS lastpolltime_alm,
contracthr,
CASE
WHEN flags > 0
THEN 1
ELSE 0
END,
week_mcf_avg,
month_mcf_avg,
( gasspotflowrate - week_mcf_avg ) AS week_mcf_avg_diff
FROM devicehistory dh
LEFT JOIN (SELECT deviceid,
archived,
AVG(CASE
WHEN recorddate > DATEADD(day, -7, GETDATE())
THEN d2.estgasvolmcf
ELSE NULL
END) AS week_mcf_avg,
AVG(d2.estgasvolmcf) AS month_mcf_avg
FROM devicehistory d2
WHERE recorddate > DATEADD(day, -30, GETDATE())
GROUP BY
deviceid,
archived) d30 ON d30.deviceid = dh.deviceid
AND d30.archived = dh.archived
LEFT JOIN (SELECT deviceid,
COUNT(id) AS flags
FROM memos
--no functions on the table column in WHERE statments:
--WHERE DATEADD(day, 15, time) > GETDATE()
WHERE [time] > DATEADD(day, -15, GETDATE())
GROUP BY deviceid) n ON n.deviceid = dh.deviceid
WHERE dh.deviceid IN ( 1485, 1486, 1487, 1488 )
--no functions on the table column in WHERE statments:
--AND CONVERT(DATE, recorddate) = '2012-04-25'
AND recorddate >= '2012-04-25 00:00:00.000'
AND recorddate < '2012-04-26 00:00:00.000

I tried this but it didnt return any values for the avg or diff column

vicdileo SSC Veteran Points: 293 More actions · Answer 8

Thanks for providing the query plan and other info.

Regarding the query plan:

I don't see any table scans, so that's good.

It looks like the query plan is probably not being cached, and the query is not being parameterized (no "@" in the query plan output).

This means the query optimizer is always processing a new query plan for each execution. This takes up some time for each execution, but this is not always bad, because (as I understand it) your WHERE IN can sometimes have only a few deviceid's or it might have as many as 16,000 of them. This difference could require very different query plans for efficient execution, but I'll need your help to make that determination:

7) does the query plan you provided come from the same query that appears in your post?

8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?

9) does the query with several deviceid's run faster than the one with 16,000?

10) what's the difference in runtimes between the two?

Also...

11) When were your indexes last updated? Try the queries in this link (I can't test those queries at this moment, but stackoverflow.com is very reliable):

http://stackoverflow.com/questions/970632/when-were-index-statistics-last-updated

12) Has the data in the table changed much since the stats were last updated? Lots up updates, inserts and deletes?

diat150 SSC Journeyman Points: 96 More actions · Answer 9

vicdileo (5/1/2012)
Thanks for providing the query plan and other info.
Regarding the query plan:
I don't see any table scans, so that's good.
It looks like the query plan is probably not being cached, and the query is not being parameterized (no "@" in the query plan output).
This means the query optimizer is always processing a new query plan for each execution. This takes up some time for each execution, but this is not always bad, because (as I understand it) your WHERE IN can sometimes have only a few deviceid's or it might have as many as 16,000 of them. This difference could require very different query plans for efficient execution, but I'll need your help to make that determination:
7) does the query plan you provided come from the same query that appears in your post?
8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?
9) does the query with several deviceid's run faster than the one with 16,000?
10) what's the difference in runtimes between the two?
Also...
11) When were your indexes last updated? Try the queries in this link (I can't test those queries at this moment, but stackoverflow.com is very reliable):
http://stackoverflow.com/questions/970632/when-were-index-statistics-last-updated
12) Has the data in the table changed much since the stats were last updated? Lots up updates, inserts and deletes?

here is the index update results:

Sat Apr 28 06:23:20 CDT 2012,devicehistory,IX_devicehistory_2

Mon Mar 26 13:00:43 CDT 2012,devicehistory,IX_devicehistory

Fri Mar 23 09:53:17 CDT 2012,devicehistory,IX_Devicehistory_DeviceID

Mon Mar 19 09:28:33 CDT 2012,devicehistory,PK_devicehistory

Tue Mar 13 14:33:44 CDT 2012,devicehistory,IX_devicehistory_1

Tue Dec 06 22:14:46 CST 2011,devices,IX_devices

Tue Oct 25 13:48:30 CDT 2011,devices,PK_devices

I am going to answer the rest shortly.

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 10

First, how quickly does the following run:

SELECT

deviceid,

archived,

AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg,

AVG(d2.estgasvolmcf) AS month_mcf_avg

FROM

devicehistory d2

WHERE

recorddate > DATEADD(day, -30, getDate())

GROUP BY

deviceid,

archived

Second, is this part of a stored procedure or a stand alone query?

diat150 SSC Journeyman Points: 96 More actions · Answer 11

7) does the query plan you provided come from the same query that appears in your post?

yes

8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?

its actually 1600+, Ive attached one for 720 devices

9) does the query with several deviceid's run faster than the one with 16,000? yes

10) what's the difference in runtimes between the two?

yes, for 12 devices, it takes about 6 seconds, and for 720 devices it takes about 2 minutes

diat150 SSC Journeyman Points: 96 More actions · Answer 12

Lynn Pettis (5/1/2012)
First, how quickly does the following run:
SELECT
deviceid,
archived,
AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg,
AVG(d2.estgasvolmcf) AS month_mcf_avg
FROM
devicehistory d2
WHERE
recorddate > DATEADD(day, -30, getDate())
GROUP BY
deviceid,
archived
Second, is this part of a stored procedure or a stand alone query?

that query runs in less than a second, and this is a standalone query, not stored procedure.

vicdileo SSC Veteran Points: 293 More actions · Answer 13

RE:

8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?

its actually 1600+, Ive attached one for 720 devices

I don't see the attachment for 720 devices.

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 14

When providing the actual execution, instead of the XML version, save the graphical version as a .sqlplan file and attach that to your post.