How to improve performance of a dumb query?

  • I have been asked to help with performance issue of a SQL server installation. I am not a SQL Server expert, but I decided to take a look. We are using a closed source application that appears to work OK. However after a SQL Server upgrade from 2000 to 2005, application performance has reportedly suffered considerably. I ran SQL profiler and caught the following query (field names changed to protect the innocent) taking about 30 seconds to run. My first thought was that I should optimize the query. But that is not possible, given that the application is closed source and the vendor is not helpful. So I am left, trying to figure out how to make this query run fast without changing it. It is also not clear to me how this query ran faster on the older SQL server 2000 product. Perhaps there was some sort of performance tuning applied to on that instance that did not carry over or does not work on the new SQL server. DBCC PINTABLE comes to mind.

    Anyway, here is the offending query:

    select min(row_id) from Table1 where calendar_id = 'Test1'

    and exists

    (select id from Table1 where calendar_id = 'Test1' and

    DATEDIFF(day, '12/30/2010 09:21', start_datetime) = 0

    )

    and exists

    (select id from Table1 where calendar_id = 'Test1' and

    DATEDIFF(day, end_datetime, '01/17/2011 09:03') = 0

    );

    Table1 has about 6200 entries and looks like this. I have tried creating various indices to no effect.

    id calendar_id start_datetime end_datetime

    int, primary key varchar(10) datetime datetime

    1 Test1 2005-01-01... 2005-01-01...

    2 Test1 2005-01-02... 2005-01-02...

    3 Test1 2005-01-03... 2005-01-03...

    ...

    I would be very grateful if somebody could help resolve this mystery.

    Thanks in advance.

  • The first and most likely cause of a slowdown after a version upgrade in SQL Server is table statistics.

    Take all the tables in the offending query and rebuild the statistics WITH FULLSCAN.

    Always step one after an upgrade.

    You'll probably find the entire system runs better after this.

    Past that, we'll probably need to look at the execution plan to help you nail down the specific problems. See the tuning link in my sig if you need assistance with that.

    EDIT: Other note I forgot to include on my first pass... PINTABLE was more important when server memory was VERY small, like desktop small. These days most small tables you would have pinned are perma-pinned due to continual access anyway, and thus superfluous except to help in the toe removal category of shooting oneself. 🙂


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I had already tried UPDATE STATISTICS with no benefit. I tried the following again:

    UPDATE STATISTICS Table1

    WITH FULLSCAN, NORECOMPUTE

    And verified that it has no impact on performance of the bad query.

    I will soon post some information about the query plan.

  • Portlander (2/9/2011)


    I had already tried UPDATE STATISTICS with no benefit. I tried the following again:

    UPDATE STATISTICS Table1

    WITH FULLSCAN, NORECOMPUTE

    And verified that it has no impact on performance of the bad query.

    I will soon post some information about the query plan.

    From BOL:

    NORECOMPUTE

    Disable the automatic statistics update option, AUTO_UPDATE_STATISTICS, for the specified statistics. If this option is specified, the query optimizer completes this statistics update and disables future updates.

    To re-enable the AUTO_UPDATE_STATISTICS option behavior, run UPDATE STATISTICS again without the NORECOMPUTE option or run sp_autostats.

    Don't do the NORECOMPUTE, it'll eventually make your life miserable.

    Auto-shrink is bad. Auto-statistics does no harm.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I am trying this out on a development machine with no other users besides me and no inserts or updates going to the single table database. I figured NORECOMPUTE was safe to use here.

  • I just took a second look at your code. Okay, that is a significantly STRANGE clause.

    and exists

    (select id from Table1 where calendar_id = 'Test1' and

    DATEDIFF(day, '12/30/2010 09:21', start_datetime) = 0

    )

    Alright, so, what this really says is SELECT 1 FROM Table1 WHERE calendar_id = 'Test1' AND start_datetime >= '12/30/2010' AND start_datetime < DATEADD( dd, 1, '12/30/2010')

    Rewrite those two where clauses that way so they're SARGable, and you'll most likely see much better performance, as long as you have an index on start_datetime and end_datetime.

    However, exists is a funny thing. Any row satisfying that condition lets you use the ENTIRE table. It's not an id IN, it's simply an exists... and doesn't do anything with the primary table.

    I'd really like to see the full proc and ddl here, this thing is a tragedy.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I made a small typo in the original post. The very first line of the SQL statement should start with:

    select min(id)... instead of

    select min(row_id)

    Anyway, here is the query plan. As we can see we suffer from nested loops. Unfortunately nothing I can do about the query. I have to attack this problem from the wrong side.

    |--Stream Aggregate(DEFINE:([Expr1009]=MIN([testdb].[Table1].[id])))

    |--Nested Loops(Left Semi Join)

    |--Nested Loops(Left Semi Join)

    | |--Index Seek(OBJECT:([testdb].[Table1].[IX_CAL_WORK_DAYS_2]), SEEK:([testdb].[Table1].[CAL_calendar_id]='Test1') ORDERED FORWARD)

    | |--Clustered Index Scan(OBJECT:([testdb].[Table1].[PK_CAL_WORK_DAYS]), WHERE:([testdb].[Table1].[CAL_calendar_id]='Test1' AND datediff(day,'2010-12-30 09:21:00.000',[testdb].[Table1].[start_datetime])=(0)))

    |--Clustered Index Scan(OBJECT:([testdb].[Table1].[PK_CAL_WORK_DAYS]), WHERE:([testdb].[Table1].[CAL_calendar_id]='Test1' AND datediff(day,[testdb].[Table1].[end_datetime],'2011-01-17 09:03:00.000')=(0)))

  • |--Stream Aggregate(DEFINE[Expr1009]=MIN([testdb].[Table1].[id])))

    |--Nested Loops(Left Semi Join)

    |--Nested Loops(Left Semi Join)

    | |--Index Seek(OBJECT[testdb].[Table1].[IX_CAL_WORK_DAYS_2]), SEEK[testdb].[Table1].[CAL_calendar_id]='Test1') ORDERED FORWARD)

    | |--Clustered Index Scan(OBJECT[testdb].[Table1].[PK_CAL_WORK_DAYS]), WHERE[testdb].[Table1].[CAL_calendar_id]='Test1' AND datediff(day,'2010-12-30 09:21:00.000',[testdb].[Table1].[start_datetime])=(0)))

    |--Clustered Index Scan(OBJECT[testdb].[Table1].[PK_CAL_WORK_DAYS]), WHERE[testdb].[Table1].[CAL_calendar_id]='Test1' AND datediff(day,[testdb].[Table1].[end_datetime],'2011-01-17 09:03:00.000')=(0)))

    Can you describe the indexing on Table1? Specifically, what is the definition for PK_CAL_WORK_Days and are there any NC indexes available? You mentioned earlier you modified some indexes, is this allowable as a final solution?

    In particular, I would create two new indexes, one leading on start_datetime, another on end_datetime, and add in cal_calendar_id as the second column. Add to this changing the where clauses to be SARGable and you would clean up 90% of your pain.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Craig Farrell (2/9/2011)


    I'd really like to see the full proc and ddl here, this thing is a tragedy.

    No arguments there. Hence the subject (How to improve performance of a dumb query).

    Unfortunately I cannot edit the query. Its buried in compiled code from a vendor who is not very helpful. I am trying to buy us some time by trying to attack the problem from the database side. It is also true that this tragedy of a query used to run reasonably fast with SQL Server 2000. Perhaps there were some optimizations applied to that instance that helped it. I am not sure.

    I have attempted to isolate the performance issues by copying the single table to a brand new installation of SQL server 2008 and removing all the columns that I don't need. Now there is nothing in this db, but this simple table and the select query and me trying to figure out if I can speed things up without editing the query.

    CREATE TABLE [Table1](

    [id] [int] NOT NULL,

    [calendar_id] [varchar](10) NULL,

    [start_datetime] [datetime] NULL,

    [end_datetime] [datetime] NULL,

    CONSTRAINT [PK_CAL_WORK_DAYS] PRIMARY KEY CLUSTERED

    (

    [id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = ON, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

  • Portlander (2/9/2011)


    Craig Farrell (2/9/2011)


    I'd really like to see the full proc and ddl here, this thing is a tragedy.

    No arguments there. Hence the subject (How to improve performance of a dumb query).

    Unfortunately I cannot edit the query. Its buried in compiled code from a vendor who is not very helpful. I am trying to buy us some time by trying to attack the problem from the database side. It is also true that this tragedy of a query used to run reasonably fast with SQL Server 2000. Perhaps there were some optimizations applied to that instance that helped it. I am not sure.

    Please bear with my slightly sleep deprived brain at the moment.

    This tends to be a crappy situation to be in. Only allowed a few methods of optimization yet expected to keep the system at full capacity. As mentioned and you agreed... a tragedy. :angry:

    There are a few ways we can help you without direct editing of the procedure, however, it requires knowing what other options are available to you.

    If you'll notice my post above, it describes a few indexes that might get you out of this situation, but I'm not sure if permanent additional indexes are allowable to you. If not, we might be able to recommend some Plan Query or additional alternatives that will allow it.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Bummer ,

    You may be able to ease your system usage by creating a plan guide and forced paramaterization.

    http://msdn.microsoft.com/en-us/library/ms191275.aspx

    Also post the full plan as per

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    It may help to see what the estimated versus actual row counts are, then again it may not



    Clear Sky SQL
    My Blog[/url]

  • Can you post the entire execution plan as a .sqlplan file please? http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • 1) get rid of functions around columns in exists clauses

    2) refactor exists into single exists

    3) do not use explicit field for exists (unless NULLable issues are afoot) - use * instead.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply