April 30, 2012 at 6:24 am
does anyone have any suggestions to improve the query below?
SELECT
dh.deviceid, dh.archived, lastpoll, esttemperature, esttemperature_alarm, staticpressure, staticpressure_alarm, differentialpressure, differentialpressure_alarm, backflowtime,
energy, tubingpressure, tubingpressure_alarm, casingpressure, casingpressure_alarm, gasspotflowrate, gasspotflowrate_alarm, battery, battery_alarm,
orificesize, CASE WHEN DATEADD(hour, -6, getDate()) > lastpoll THEN 1 ELSE 0 END as lastpolltime_alm, contracthr,
CASE WHEN flags > 0 THEN 1 ELSE 0 END, week_mcf_avg, month_mcf_avg, (gasspotflowrate - week_mcf_avg) as week_mcf_avg_diff
FROM devicehistory dh
LEFT JOIN (SELECT deviceid, archived, AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg, AVG(d2.estgasvolmcf) AS month_mcf_avg FROM devicehistory d2 WHERE recorddate > DATEADD(day, -30, getDate()) GROUP BY deviceid, archived) d30 ON d30.deviceid = dh.deviceid AND d30.archived = dh.archived
LEFT JOIN (SELECT deviceid, COUNT(id) AS flags FROM memos WHERE DATEADD(day, 15, time) > getDate() GROUP BY deviceid) n ON n.deviceid = dh.deviceid
WHERE dh.deviceid IN (1485, 1486, 1487, 1488) AND CONVERT(DATE, recorddate) = '2012-04-25'
This list of deviceid's can have up to 1600 unique id's in it, so this is alot of data that has to be processed
WHERE dh.deviceid IN (1485, 1486, 1487, 1488)
this is the part that is slowing the query down:
LEFT JOIN (SELECT deviceid, archived, AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg, AVG(d2.estgasvolmcf) AS month_mcf_avg FROM devicehistory d2 WHERE recorddate > DATEADD(day, -30, getDate()) GROUP BY deviceid, archived) d30 ON d30.deviceid = dh.deviceid AND d30.archived = dh.archived
indexes on the devicehistory table are as follows:
primary key = id
recorddate ascending = non unique, non clustered
datesubmitted ascending = non unique, non clustered
archived ascending = non unique, non clustered
deviceid ascending = non unique, non clustered
the devicehistory table has 2 rows per day per device. 1 is archived and 1 is non archived.
I can provide more info if needed. I am pretty new to sql and would appreciate any help!
April 30, 2012 at 7:11 pm
bump. can anyone point me in the right direction?
April 30, 2012 at 9:54 pm
Hi diat150,
Looks complicated... hopefully if you can gather some clues for me I can help out.
Please run the SET SHOWPLAN_ALL ON command followed by your query. For example:
SET SHOWPLAN_ALL ON;
SELECT dh.deviceid, dh.archived, lastpoll,...
This will generate a query execution plan.
1) Copy the plan and post it here.
Then, run SET SHOWPLAN_ALL OFF; so you won't have to stare at query execution plans all day.
In addition to item 1, please answer as many of the following questions as you can...
How is this query being run:
2) As a stored procedure?
3) AS embedded SQL in a C# program or Java, etc.?
4) Is this dynamic SQL?
5) When was the last time you ran update statistics on your tables & indexes?
6) Please provide a row count for each of the tables involved in the query.
Answer the questions as best you can and post the query plan and let's figure out what's going on.
Let me know if you need clarification on any of the above items.
- victor di leo
May 1, 2012 at 6:14 am
The query looks ok, so I'd focus on the data and your index density. Here's a blog post I wrote about it. I found that determining if you're looking at primarily inclusive or exclusive filtering changed the useful density settings of the index.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 1, 2012 at 6:19 am
Looking at the execution plan, you're returning 20k rows from the spatial index and then filtering that down to three values. Performance is going to stink. You need a better way to filter more rows. The TOP operation is taking 200k to only return 3. That's where I'd focus first, then worry about the spatial data and whether or not it's doing what you need.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 1, 2012 at 6:23 am
two issues i see:
two different WHERE statements have a function on a datetime column...
WHERE DATEADD(day, 15, time) > GETDATE()
--and later:
AND CONVERT(DATE, recorddate) = '2012-04-25'
this requires SQL to do a table scan to get those datetime values, convert ALL the values in the table, and then do the compare.
you can change it so it can use an index (if an index exists on these two columns, and get better performance as well:
WHERE [time] > DATEADD(day, -15, GETDATE())
--and later:
AND recorddate >= '2012-04-25 00:00:00.000'
AND recorddate < '2012-04-26 00:00:00.000
here's my reformatted version i think will work
SELECT dh.deviceid,
dh.archived,
lastpoll,
esttemperature,
esttemperature_alarm,
staticpressure,
staticpressure_alarm,
differentialpressure,
differentialpressure_alarm,
backflowtime,
energy,
tubingpressure,
tubingpressure_alarm,
casingpressure,
casingpressure_alarm,
gasspotflowrate,
gasspotflowrate_alarm,
battery,
battery_alarm,
orificesize,
CASE
WHEN DATEADD(hour, -6, GETDATE()) > lastpoll
THEN 1
ELSE 0
END AS lastpolltime_alm,
contracthr,
CASE
WHEN flags > 0
THEN 1
ELSE 0
END,
week_mcf_avg,
month_mcf_avg,
( gasspotflowrate - week_mcf_avg ) AS week_mcf_avg_diff
FROM devicehistory dh
LEFT JOIN (SELECT deviceid,
archived,
AVG(CASE
WHEN recorddate > DATEADD(day, -7, GETDATE())
THEN d2.estgasvolmcf
ELSE NULL
END) AS week_mcf_avg,
AVG(d2.estgasvolmcf) AS month_mcf_avg
FROM devicehistory d2
WHERE recorddate > DATEADD(day, -30, GETDATE())
GROUP BY
deviceid,
archived) d30 ON d30.deviceid = dh.deviceid
AND d30.archived = dh.archived
LEFT JOIN (SELECT deviceid,
COUNT(id) AS flags
FROM memos
--no functions on the table column in WHERE statments:
--WHERE DATEADD(day, 15, time) > GETDATE()
WHERE [time] > DATEADD(day, -15, GETDATE())
GROUP BY deviceid) n ON n.deviceid = dh.deviceid
WHERE dh.deviceid IN ( 1485, 1486, 1487, 1488 )
--no functions on the table column in WHERE statments:
--AND CONVERT(DATE, recorddate) = '2012-04-25'
AND recorddate >= '2012-04-25 00:00:00.000'
AND recorddate < '2012-04-26 00:00:00.000
Lowell
May 1, 2012 at 7:10 am
1) Copy the plan and post it here.
attached
How is this query being run:
2) As a stored procedure?
no
3) AS embedded SQL in a C# program or Java, etc.?
yes, java
4) Is this dynamic SQL?
never heard of this, but I dont think so from a quick google search
5) When was the last time you ran update statistics on your tables & indexes?
I havent done this. Like I said, i am pretty new to sql outside of running basic queries
6) Please provide a row count for each of the tables involved in the query.
devicehistory = 758137
devices 1625
memos 7969
Thanks for the help
May 1, 2012 at 8:20 am
Lowell (5/1/2012)
two issues i see:two different WHERE statements have a function on a datetime column...
WHERE DATEADD(day, 15, time) > GETDATE()
--and later:
AND CONVERT(DATE, recorddate) = '2012-04-25'
this requires SQL to do a table scan to get those datetime values, convert ALL the values in the table, and then do the compare.
you can change it so it can use an index (if an index exists on these two columns, and get better performance as well:
WHERE [time] > DATEADD(day, -15, GETDATE())
--and later:
AND recorddate >= '2012-04-25 00:00:00.000'
AND recorddate < '2012-04-26 00:00:00.000
here's my reformatted version i think will work
SELECT dh.deviceid,
dh.archived,
lastpoll,
esttemperature,
esttemperature_alarm,
staticpressure,
staticpressure_alarm,
differentialpressure,
differentialpressure_alarm,
backflowtime,
energy,
tubingpressure,
tubingpressure_alarm,
casingpressure,
casingpressure_alarm,
gasspotflowrate,
gasspotflowrate_alarm,
battery,
battery_alarm,
orificesize,
CASE
WHEN DATEADD(hour, -6, GETDATE()) > lastpoll
THEN 1
ELSE 0
END AS lastpolltime_alm,
contracthr,
CASE
WHEN flags > 0
THEN 1
ELSE 0
END,
week_mcf_avg,
month_mcf_avg,
( gasspotflowrate - week_mcf_avg ) AS week_mcf_avg_diff
FROM devicehistory dh
LEFT JOIN (SELECT deviceid,
archived,
AVG(CASE
WHEN recorddate > DATEADD(day, -7, GETDATE())
THEN d2.estgasvolmcf
ELSE NULL
END) AS week_mcf_avg,
AVG(d2.estgasvolmcf) AS month_mcf_avg
FROM devicehistory d2
WHERE recorddate > DATEADD(day, -30, GETDATE())
GROUP BY
deviceid,
archived) d30 ON d30.deviceid = dh.deviceid
AND d30.archived = dh.archived
LEFT JOIN (SELECT deviceid,
COUNT(id) AS flags
FROM memos
--no functions on the table column in WHERE statments:
--WHERE DATEADD(day, 15, time) > GETDATE()
WHERE [time] > DATEADD(day, -15, GETDATE())
GROUP BY deviceid) n ON n.deviceid = dh.deviceid
WHERE dh.deviceid IN ( 1485, 1486, 1487, 1488 )
--no functions on the table column in WHERE statments:
--AND CONVERT(DATE, recorddate) = '2012-04-25'
AND recorddate >= '2012-04-25 00:00:00.000'
AND recorddate < '2012-04-26 00:00:00.000
I tried this but it didnt return any values for the avg or diff column
May 1, 2012 at 10:03 am
Thanks for providing the query plan and other info.
Regarding the query plan:
I don't see any table scans, so that's good.
It looks like the query plan is probably not being cached, and the query is not being parameterized (no "@" in the query plan output).
This means the query optimizer is always processing a new query plan for each execution. This takes up some time for each execution, but this is not always bad, because (as I understand it) your WHERE IN can sometimes have only a few deviceid's or it might have as many as 16,000 of them. This difference could require very different query plans for efficient execution, but I'll need your help to make that determination:
7) does the query plan you provided come from the same query that appears in your post?
8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?
9) does the query with several deviceid's run faster than the one with 16,000?
10) what's the difference in runtimes between the two?
Also...
11) When were your indexes last updated? Try the queries in this link (I can't test those queries at this moment, but stackoverflow.com is very reliable):
http://stackoverflow.com/questions/970632/when-were-index-statistics-last-updated
12) Has the data in the table changed much since the stats were last updated? Lots up updates, inserts and deletes?
May 1, 2012 at 11:53 am
vicdileo (5/1/2012)
Thanks for providing the query plan and other info.Regarding the query plan:
I don't see any table scans, so that's good.
It looks like the query plan is probably not being cached, and the query is not being parameterized (no "@" in the query plan output).
This means the query optimizer is always processing a new query plan for each execution. This takes up some time for each execution, but this is not always bad, because (as I understand it) your WHERE IN can sometimes have only a few deviceid's or it might have as many as 16,000 of them. This difference could require very different query plans for efficient execution, but I'll need your help to make that determination:
7) does the query plan you provided come from the same query that appears in your post?
8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?
9) does the query with several deviceid's run faster than the one with 16,000?
10) what's the difference in runtimes between the two?
Also...
11) When were your indexes last updated? Try the queries in this link (I can't test those queries at this moment, but stackoverflow.com is very reliable):
http://stackoverflow.com/questions/970632/when-were-index-statistics-last-updated
12) Has the data in the table changed much since the stats were last updated? Lots up updates, inserts and deletes?
here is the index update results:
Sat Apr 28 06:23:20 CDT 2012,devicehistory,IX_devicehistory_2
Mon Mar 26 13:00:43 CDT 2012,devicehistory,IX_devicehistory
Fri Mar 23 09:53:17 CDT 2012,devicehistory,IX_Devicehistory_DeviceID
Mon Mar 19 09:28:33 CDT 2012,devicehistory,PK_devicehistory
Tue Mar 13 14:33:44 CDT 2012,devicehistory,IX_devicehistory_1
Tue Dec 06 22:14:46 CST 2011,devices,IX_devices
Tue Oct 25 13:48:30 CDT 2011,devices,PK_devices
I am going to answer the rest shortly.
May 1, 2012 at 12:34 pm
First, how quickly does the following run:
SELECT
deviceid,
archived,
AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg,
AVG(d2.estgasvolmcf) AS month_mcf_avg
FROM
devicehistory d2
WHERE
recorddate > DATEADD(day, -30, getDate())
GROUP BY
deviceid,
archived
Second, is this part of a stored procedure or a stand alone query?
May 1, 2012 at 12:52 pm
7) does the query plan you provided come from the same query that appears in your post?
yes
8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?
its actually 1600+, Ive attached one for 720 devices
9) does the query with several deviceid's run faster than the one with 16,000? yes
10) what's the difference in runtimes between the two?
yes, for 12 devices, it takes about 6 seconds, and for 720 devices it takes about 2 minutes
May 1, 2012 at 12:55 pm
Lynn Pettis (5/1/2012)
First, how quickly does the following run:
SELECT
deviceid,
archived,
AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg,
AVG(d2.estgasvolmcf) AS month_mcf_avg
FROM
devicehistory d2
WHERE
recorddate > DATEADD(day, -30, getDate())
GROUP BY
deviceid,
archived
Second, is this part of a stored procedure or a stand alone query?
that query runs in less than a second, and this is a standalone query, not stored procedure.
May 1, 2012 at 1:49 pm
RE:
8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?
its actually 1600+, Ive attached one for 720 devices
I don't see the attachment for 720 devices.
May 1, 2012 at 2:14 pm
When providing the actual execution, instead of the XML version, save the graphical version as a .sqlplan file and attach that to your post.
Viewing 15 posts - 1 through 15 (of 32 total)
You must be logged in to reply to this topic. Login to reply