need help with query speed.

  • does anyone have any suggestions to improve the query below?

    SELECT

    dh.deviceid, dh.archived, lastpoll, esttemperature, esttemperature_alarm, staticpressure, staticpressure_alarm, differentialpressure, differentialpressure_alarm, backflowtime,

    energy, tubingpressure, tubingpressure_alarm, casingpressure, casingpressure_alarm, gasspotflowrate, gasspotflowrate_alarm, battery, battery_alarm,

    orificesize, CASE WHEN DATEADD(hour, -6, getDate()) > lastpoll THEN 1 ELSE 0 END as lastpolltime_alm, contracthr,

    CASE WHEN flags > 0 THEN 1 ELSE 0 END, week_mcf_avg, month_mcf_avg, (gasspotflowrate - week_mcf_avg) as week_mcf_avg_diff

    FROM devicehistory dh

    LEFT JOIN (SELECT deviceid, archived, AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg, AVG(d2.estgasvolmcf) AS month_mcf_avg FROM devicehistory d2 WHERE recorddate > DATEADD(day, -30, getDate()) GROUP BY deviceid, archived) d30 ON d30.deviceid = dh.deviceid AND d30.archived = dh.archived

    LEFT JOIN (SELECT deviceid, COUNT(id) AS flags FROM memos WHERE DATEADD(day, 15, time) > getDate() GROUP BY deviceid) n ON n.deviceid = dh.deviceid

    WHERE dh.deviceid IN (1485, 1486, 1487, 1488) AND CONVERT(DATE, recorddate) = '2012-04-25'

    This list of deviceid's can have up to 1600 unique id's in it, so this is alot of data that has to be processed

    WHERE dh.deviceid IN (1485, 1486, 1487, 1488)

    this is the part that is slowing the query down:

    LEFT JOIN (SELECT deviceid, archived, AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg, AVG(d2.estgasvolmcf) AS month_mcf_avg FROM devicehistory d2 WHERE recorddate > DATEADD(day, -30, getDate()) GROUP BY deviceid, archived) d30 ON d30.deviceid = dh.deviceid AND d30.archived = dh.archived

    indexes on the devicehistory table are as follows:

    primary key = id

    recorddate ascending = non unique, non clustered

    datesubmitted ascending = non unique, non clustered

    archived ascending = non unique, non clustered

    deviceid ascending = non unique, non clustered

    the devicehistory table has 2 rows per day per device. 1 is archived and 1 is non archived.

    I can provide more info if needed. I am pretty new to sql and would appreciate any help!

  • bump. can anyone point me in the right direction?

  • Hi diat150,

    Looks complicated... hopefully if you can gather some clues for me I can help out.

    Please run the SET SHOWPLAN_ALL ON command followed by your query. For example:

    SET SHOWPLAN_ALL ON;

    SELECT dh.deviceid, dh.archived, lastpoll,...

    This will generate a query execution plan.

    1) Copy the plan and post it here.

    Then, run SET SHOWPLAN_ALL OFF; so you won't have to stare at query execution plans all day.

    In addition to item 1, please answer as many of the following questions as you can...

    How is this query being run:

    2) As a stored procedure?

    3) AS embedded SQL in a C# program or Java, etc.?

    4) Is this dynamic SQL?

    5) When was the last time you ran update statistics on your tables & indexes?

    6) Please provide a row count for each of the tables involved in the query.

    Answer the questions as best you can and post the query plan and let's figure out what's going on.

    Let me know if you need clarification on any of the above items.

    - victor di leo

  • The query looks ok, so I'd focus on the data and your index density. Here's a blog post I wrote about it. I found that determining if you're looking at primarily inclusive or exclusive filtering changed the useful density settings of the index.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Looking at the execution plan, you're returning 20k rows from the spatial index and then filtering that down to three values. Performance is going to stink. You need a better way to filter more rows. The TOP operation is taking 200k to only return 3. That's where I'd focus first, then worry about the spatial data and whether or not it's doing what you need.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • two issues i see:

    two different WHERE statements have a function on a datetime column...

    WHERE DATEADD(day, 15, time) > GETDATE()

    --and later:

    AND CONVERT(DATE, recorddate) = '2012-04-25'

    this requires SQL to do a table scan to get those datetime values, convert ALL the values in the table, and then do the compare.

    you can change it so it can use an index (if an index exists on these two columns, and get better performance as well:

    WHERE [time] > DATEADD(day, -15, GETDATE())

    --and later:

    AND recorddate >= '2012-04-25 00:00:00.000'

    AND recorddate < '2012-04-26 00:00:00.000

    here's my reformatted version i think will work

    SELECT dh.deviceid,

    dh.archived,

    lastpoll,

    esttemperature,

    esttemperature_alarm,

    staticpressure,

    staticpressure_alarm,

    differentialpressure,

    differentialpressure_alarm,

    backflowtime,

    energy,

    tubingpressure,

    tubingpressure_alarm,

    casingpressure,

    casingpressure_alarm,

    gasspotflowrate,

    gasspotflowrate_alarm,

    battery,

    battery_alarm,

    orificesize,

    CASE

    WHEN DATEADD(hour, -6, GETDATE()) > lastpoll

    THEN 1

    ELSE 0

    END AS lastpolltime_alm,

    contracthr,

    CASE

    WHEN flags > 0

    THEN 1

    ELSE 0

    END,

    week_mcf_avg,

    month_mcf_avg,

    ( gasspotflowrate - week_mcf_avg ) AS week_mcf_avg_diff

    FROM devicehistory dh

    LEFT JOIN (SELECT deviceid,

    archived,

    AVG(CASE

    WHEN recorddate > DATEADD(day, -7, GETDATE())

    THEN d2.estgasvolmcf

    ELSE NULL

    END) AS week_mcf_avg,

    AVG(d2.estgasvolmcf) AS month_mcf_avg

    FROM devicehistory d2

    WHERE recorddate > DATEADD(day, -30, GETDATE())

    GROUP BY

    deviceid,

    archived) d30 ON d30.deviceid = dh.deviceid

    AND d30.archived = dh.archived

    LEFT JOIN (SELECT deviceid,

    COUNT(id) AS flags

    FROM memos

    --no functions on the table column in WHERE statments:

    --WHERE DATEADD(day, 15, time) > GETDATE()

    WHERE [time] > DATEADD(day, -15, GETDATE())

    GROUP BY deviceid) n ON n.deviceid = dh.deviceid

    WHERE dh.deviceid IN ( 1485, 1486, 1487, 1488 )

    --no functions on the table column in WHERE statments:

    --AND CONVERT(DATE, recorddate) = '2012-04-25'

    AND recorddate >= '2012-04-25 00:00:00.000'

    AND recorddate < '2012-04-26 00:00:00.000

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • 1) Copy the plan and post it here.

    attached

    How is this query being run:

    2) As a stored procedure?

    no

    3) AS embedded SQL in a C# program or Java, etc.?

    yes, java

    4) Is this dynamic SQL?

    never heard of this, but I dont think so from a quick google search

    5) When was the last time you ran update statistics on your tables & indexes?

    I havent done this. Like I said, i am pretty new to sql outside of running basic queries

    6) Please provide a row count for each of the tables involved in the query.

    devicehistory = 758137

    devices 1625

    memos 7969

    Thanks for the help

  • Lowell (5/1/2012)


    two issues i see:

    two different WHERE statements have a function on a datetime column...

    WHERE DATEADD(day, 15, time) > GETDATE()

    --and later:

    AND CONVERT(DATE, recorddate) = '2012-04-25'

    this requires SQL to do a table scan to get those datetime values, convert ALL the values in the table, and then do the compare.

    you can change it so it can use an index (if an index exists on these two columns, and get better performance as well:

    WHERE [time] > DATEADD(day, -15, GETDATE())

    --and later:

    AND recorddate >= '2012-04-25 00:00:00.000'

    AND recorddate < '2012-04-26 00:00:00.000

    here's my reformatted version i think will work

    SELECT dh.deviceid,

    dh.archived,

    lastpoll,

    esttemperature,

    esttemperature_alarm,

    staticpressure,

    staticpressure_alarm,

    differentialpressure,

    differentialpressure_alarm,

    backflowtime,

    energy,

    tubingpressure,

    tubingpressure_alarm,

    casingpressure,

    casingpressure_alarm,

    gasspotflowrate,

    gasspotflowrate_alarm,

    battery,

    battery_alarm,

    orificesize,

    CASE

    WHEN DATEADD(hour, -6, GETDATE()) > lastpoll

    THEN 1

    ELSE 0

    END AS lastpolltime_alm,

    contracthr,

    CASE

    WHEN flags > 0

    THEN 1

    ELSE 0

    END,

    week_mcf_avg,

    month_mcf_avg,

    ( gasspotflowrate - week_mcf_avg ) AS week_mcf_avg_diff

    FROM devicehistory dh

    LEFT JOIN (SELECT deviceid,

    archived,

    AVG(CASE

    WHEN recorddate > DATEADD(day, -7, GETDATE())

    THEN d2.estgasvolmcf

    ELSE NULL

    END) AS week_mcf_avg,

    AVG(d2.estgasvolmcf) AS month_mcf_avg

    FROM devicehistory d2

    WHERE recorddate > DATEADD(day, -30, GETDATE())

    GROUP BY

    deviceid,

    archived) d30 ON d30.deviceid = dh.deviceid

    AND d30.archived = dh.archived

    LEFT JOIN (SELECT deviceid,

    COUNT(id) AS flags

    FROM memos

    --no functions on the table column in WHERE statments:

    --WHERE DATEADD(day, 15, time) > GETDATE()

    WHERE [time] > DATEADD(day, -15, GETDATE())

    GROUP BY deviceid) n ON n.deviceid = dh.deviceid

    WHERE dh.deviceid IN ( 1485, 1486, 1487, 1488 )

    --no functions on the table column in WHERE statments:

    --AND CONVERT(DATE, recorddate) = '2012-04-25'

    AND recorddate >= '2012-04-25 00:00:00.000'

    AND recorddate < '2012-04-26 00:00:00.000

    I tried this but it didnt return any values for the avg or diff column

  • Thanks for providing the query plan and other info.

    Regarding the query plan:

    I don't see any table scans, so that's good.

    It looks like the query plan is probably not being cached, and the query is not being parameterized (no "@" in the query plan output).

    This means the query optimizer is always processing a new query plan for each execution. This takes up some time for each execution, but this is not always bad, because (as I understand it) your WHERE IN can sometimes have only a few deviceid's or it might have as many as 16,000 of them. This difference could require very different query plans for efficient execution, but I'll need your help to make that determination:

    7) does the query plan you provided come from the same query that appears in your post?

    8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?

    9) does the query with several deviceid's run faster than the one with 16,000?

    10) what's the difference in runtimes between the two?

    Also...

    11) When were your indexes last updated? Try the queries in this link (I can't test those queries at this moment, but stackoverflow.com is very reliable):

    http://stackoverflow.com/questions/970632/when-were-index-statistics-last-updated

    12) Has the data in the table changed much since the stats were last updated? Lots up updates, inserts and deletes?

  • vicdileo (5/1/2012)


    Thanks for providing the query plan and other info.

    Regarding the query plan:

    I don't see any table scans, so that's good.

    It looks like the query plan is probably not being cached, and the query is not being parameterized (no "@" in the query plan output).

    This means the query optimizer is always processing a new query plan for each execution. This takes up some time for each execution, but this is not always bad, because (as I understand it) your WHERE IN can sometimes have only a few deviceid's or it might have as many as 16,000 of them. This difference could require very different query plans for efficient execution, but I'll need your help to make that determination:

    7) does the query plan you provided come from the same query that appears in your post?

    8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?

    9) does the query with several deviceid's run faster than the one with 16,000?

    10) what's the difference in runtimes between the two?

    Also...

    11) When were your indexes last updated? Try the queries in this link (I can't test those queries at this moment, but stackoverflow.com is very reliable):

    http://stackoverflow.com/questions/970632/when-were-index-statistics-last-updated

    12) Has the data in the table changed much since the stats were last updated? Lots up updates, inserts and deletes?

    here is the index update results:

    Sat Apr 28 06:23:20 CDT 2012,devicehistory,IX_devicehistory_2

    Mon Mar 26 13:00:43 CDT 2012,devicehistory,IX_devicehistory

    Fri Mar 23 09:53:17 CDT 2012,devicehistory,IX_Devicehistory_DeviceID

    Mon Mar 19 09:28:33 CDT 2012,devicehistory,PK_devicehistory

    Tue Mar 13 14:33:44 CDT 2012,devicehistory,IX_devicehistory_1

    Tue Dec 06 22:14:46 CST 2011,devices,IX_devices

    Tue Oct 25 13:48:30 CDT 2011,devices,PK_devices

    I am going to answer the rest shortly.

  • First, how quickly does the following run:

    SELECT

    deviceid,

    archived,

    AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg,

    AVG(d2.estgasvolmcf) AS month_mcf_avg

    FROM

    devicehistory d2

    WHERE

    recorddate > DATEADD(day, -30, getDate())

    GROUP BY

    deviceid,

    archived

    Second, is this part of a stored procedure or a stand alone query?

  • 7) does the query plan you provided come from the same query that appears in your post?

    yes

    8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?

    its actually 1600+, Ive attached one for 720 devices

    9) does the query with several deviceid's run faster than the one with 16,000? yes

    10) what's the difference in runtimes between the two?

    yes, for 12 devices, it takes about 6 seconds, and for 720 devices it takes about 2 minutes

  • Lynn Pettis (5/1/2012)


    First, how quickly does the following run:

    SELECT

    deviceid,

    archived,

    AVG(CASE WHEN recorddate > DATEADD(day, -7, getDate()) THEN d2.estgasvolmcf ELSE NULL END) AS week_mcf_avg,

    AVG(d2.estgasvolmcf) AS month_mcf_avg

    FROM

    devicehistory d2

    WHERE

    recorddate > DATEADD(day, -30, getDate())

    GROUP BY

    deviceid,

    archived

    Second, is this part of a stored procedure or a stand alone query?

    that query runs in less than a second, and this is a standalone query, not stored procedure.

  • RE:

    8) if so, then what does the query plan look like when your WHERE IN has 16,000 deviceid's?

    its actually 1600+, Ive attached one for 720 devices

    I don't see the attachment for 720 devices.

  • When providing the actual execution, instead of the XML version, save the graphical version as a .sqlplan file and attach that to your post.

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply