Using Parameters in a Query is Killing Performance

  • ScottPletcher (2/19/2016)


    Log tables are almost always best clustered on a datetime, and virtually never on an identity value.

When it's an insert datetime, I think that this guess would be right more often than it'd be wrong.

In this case, since the column name has "upsert", it sounds as if the rows in the table are updated and this column is the timestamp of the insert or last update. Depending on the update frequency, this might actually be a terrible candidate for a clustering key.

For this thread it's a moot point anyway: the OP cannot change the indexes, and given the set of indexes, that is sad news.

@Loner, I have seen your post with the actual plan; I will take a look.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (2/19/2016)


    ScottPletcher (2/19/2016)


    Log tables are almost always best clustered on a datetime, and virtually never on an identity value.

When it's an insert datetime, I think that this guess would be right more often than it'd be wrong.

In this case, since the column name has "upsert", it sounds as if the rows in the table are updated and this column is the timestamp of the insert or last update. Depending on the update frequency, this might actually be a terrible candidate for a clustering key.

For this thread it's a moot point anyway: the OP cannot change the indexes, and given the set of indexes, that is sad news.

@Loner, I have seen your post with the actual plan; I will take a look.

Yes, I'm far more familiar with an INSERT dttm. But even an upsrt dttm might not be a terrible clus key, even with the delete and re-insert. Keep in mind that SELECTs typically outnumber INSERTs/UPDATEs by multiple orders of magnitude. Sometimes too much emphasis is placed on avoiding some mild fragmentation at the cost of large numbers of added covering indexes, which actually cost much more in overall overhead.

Also, typically reorg/rebuild is available when needed to reduce the fragmentation issue. Now, for this particular table, given the row count, it would probably need to be partitioned in order to allow more frequent reorgs/rebuilds of new data, and partitioning would also allow full compression of old data; a sketch of what that could look like follows below.
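For illustration only, partition-level maintenance on this table might look like the following. This is a minimal sketch: the index name and partition numbers are hypothetical, and it assumes the table and its indexes are already partitioned (aligned) on the date column.

-- Rebuild only the newest partition (hypothetical partition number 42)
ALTER INDEX CIX_cnsmr_accnt_ar_log   -- hypothetical index name
    ON dbo.cnsmr_accnt_ar_log
    REBUILD PARTITION = 42;

-- Page-compress a cold, older partition (hypothetical partition number 1)
ALTER INDEX CIX_cnsmr_accnt_ar_log
    ON dbo.cnsmr_accnt_ar_log
    REBUILD PARTITION = 1
    WITH (DATA_COMPRESSION = PAGE);

Rebuilding a single partition touches only that slice of the data, which is what makes frequent maintenance on just the new rows feasible.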

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Loner (2/19/2016)


    Here is the actual execution plan with option recompile.

    Thanks

    Hi Loner,

    I do see that the recompile hint worked. The estimated rowcount for the cnsmr_accnt_ar_log table has gone down to 56,827 and the shape of the plan has changed to match the plan you got when using hardcoded values.

    Unfortunately, this does not explain why the query still runs slow.

    Additionally, the upper input to the join in the plan (produced by joining #TempUser to dbo.usr) is empty. There are 21 rows in #TempUser, but none of them finds a match in dbo.usr, so no rows result.

Because the hash match join between this combined set and the cnsmr_accnt_ar_log table reads the top input first, it "knows" that there are no rows in the top input, so there is no reason to execute the lower input: all rows would be rejected anyway. In the actual plan, I can indeed see that the read operations on the cnsmr_accnt_ar_log table have not been executed. The actual row counts are all zero. This does not mean that your table has no rows that match the filter; it means that the operator never even started to work, as can be seen from the zero in the actual executions counter.

    This makes it even harder to understand why the query still runs for such a long time.

    I wish I could help you more, but at this point I am baffled myself. I hope someone else sees past my blind spot here.
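As an aside, the short-circuit behaviour described above is easy to reproduce. A minimal sketch with throwaway temp tables (not the OP's schema):

-- When the build (top) input of a hash join is empty, the probe
-- (bottom) input is never executed.
CREATE TABLE #build (id int);
CREATE TABLE #probe (id int);
INSERT INTO #probe (id) VALUES (1), (2), (3);
-- #build is intentionally left empty.

SELECT p.id
FROM #build AS b
INNER HASH JOIN #probe AS p ON p.id = b.id;

In the actual execution plan, the scan of #probe shows zero actual rows and zero actual executions, just like the reads of cnsmr_accnt_ar_log in Loner's plan.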


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • ScottPletcher (2/19/2016)


    Sometimes too much emphasis is placed on avoiding some mild fragmentation at the cost of large numbers of added covering indexes which actually cost much more overall overhead.

    In this case, fragmentation is not my primary concern. I am more worried about the cost of updating all nonclustered indexes whenever the clustered key changes.

    But let's no longer hijack Loner's thread.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Loner (2/17/2016)


    Ok, I give up, someone please help! We have a query that runs just fine with hard coded values but once we swap them out for parameters the query runs with no end in sight.

    This version consistently returns 8,429 rows in less than 2 seconds:

SELECT username, firstname, lastname
FROM Users
WHERE UpdateDt BETWEEN '2/11/2016' AND '2/12/2016';

This version will run for well over an hour without returning data:

DECLARE @Dt1 DATETIME, @dt2 DATETIME;
SET @Dt1 = '2/11/2016';
SET @dt2 = '2/12/2016';

SELECT username, firstname, lastname
FROM Users
WHERE UpdateDt BETWEEN @Dt1 AND @dt2;

    The UpdateDt column in the Users table is DATETIME so I wouldn't expect any datatype conversion issues.

    Any theories on what could be causing this?

    I am making an assumption that this code may actually be used in a stored procedure where the start and end dates will be passed into the procedure.

    Here is a hack that may work.

DECLARE @SQLCmd nvarchar(max),
        @Dt1 datetime,
        @dt2 datetime;

-- In the stored procedure these would be the input parameters;
-- they are set here so the example runs on its own.
SET @Dt1 = '2/11/2016';
SET @dt2 = '2/12/2016';

SET @SQLCmd = N'
SELECT username, firstname, lastname
FROM Users
WHERE UpdateDt BETWEEN ''' + convert(varchar(8), @Dt1, 112) + N''' AND ''' + convert(varchar(8), @dt2, 112) + N''';
';

EXEC sys.sp_executesql @SQLCmd;
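A variant of the same hack that avoids concatenating the dates into the string is to pass them to sp_executesql as real parameters; unlike local variables, such parameters can be sniffed when the inner batch compiles. A minimal sketch:

DECLARE @SQLCmd nvarchar(max) = N'
SELECT username, firstname, lastname
FROM Users
WHERE UpdateDt BETWEEN @Dt1 AND @dt2;';

EXEC sys.sp_executesql @SQLCmd,
     N'@Dt1 datetime, @dt2 datetime',
     @Dt1 = '2/11/2016',
     @dt2 = '2/12/2016';

This also sidesteps any quoting issues, since no user values ever become part of the SQL text.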

  • Hugo Kornelis (2/19/2016)


    ScottPletcher (2/19/2016)


    You should cluster this table on upsrt_dttm instead.

    How can you possibly determine that this should be a good clustering key based on seeing only one single query that uses the table and not knowing anything about updates?

    Because it's a temporal audit table/log.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Yes, the query is part of the stored procedure.

  • Loner (2/19/2016)


    Here is the actual execution plan with option recompile.

    Thanks

    Something is a little crazy with the second query. It's returning 0 rows. Is that intentional?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

No, it is just a data problem.

  • It's really difficult to help by looking at the execution plan when the expected result set is zero rows. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Worse, all this query tuning is likely wasted anyway, since the clus index is not being corrected first. As soon as another column is added to this query, this "tuning" cycle starts all over again: re-writing the query and/or adjusting the covering index(es).

    The only way to genuinely solve this problem is to change the clustering index. After that's been done, it's likely the non-clus indexes could be dropped, assuming that a date/time range is (almost) always supplied, even when querying by the "consmr_id" (iirc, or whatever the most common nonclus index first key column was named). Thus, the correct clus index improves overall performance that much more.

    Now, I realize that on a table with as many rows as this one has, that's a huge change. So definitely verify first that the new approach produces better query results.

If they won't change the clus index, tell them to expect sub-par performance on queries against this data, and warn them that even after queries get tuned, such performance problems will occur again and again.

    Edit: Reworded for clarity.
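For concreteness, the change being argued for might look something like this. It is a sketch only: the constraint and index names are hypothetical, and it assumes the current clustered index is the primary key on the identity column.

-- Drop the existing clustered primary key (hypothetical name).
ALTER TABLE dbo.cnsmr_accnt_ar_log
    DROP CONSTRAINT PK_cnsmr_accnt_ar_log;

-- Re-cluster on the datetime, with the identity as a tiebreaker.
CREATE UNIQUE CLUSTERED INDEX CIX_cnsmr_accnt_ar_log
    ON dbo.cnsmr_accnt_ar_log (upsrt_dttm, cnsmr_accnt_ar_log_id);

-- Keep the identity enforced as a nonclustered primary key.
ALTER TABLE dbo.cnsmr_accnt_ar_log
    ADD CONSTRAINT PK_cnsmr_accnt_ar_log
    PRIMARY KEY NONCLUSTERED (cnsmr_accnt_ar_log_id);

On a table this size, such a rebuild would need a maintenance window and plenty of log space.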

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Loner (2/19/2016)


    As I said before, the database is created from the software company, I cannot create any index.

When a faulty part or a design flaw is discovered in a certain model of Toyota car, they recall the whole range of that model and fix the fault at their expense.

I wonder: why isn't the same approach applied to faulty designs in software?

Why doesn't that "software company" carry all the usual obligations that any supplier of any other kind of product would?

    I guess because the IT industry is still in its nappies.

    _____________
    Code for TallyGenerator

If you cannot change the indexes, then you can maximise the effect of the existing ones.

I may assume that the IDs are generally in line with upsrt_dttm.

There might be some deviations, but the two mostly follow each other.

This allows us to replace the Key Lookup (which takes all the time) with a Clustered Index Seek:

DECLARE @MinID INT, @MaxID INT;

SELECT @MinID = MIN(cnsmr_accnt_ar_log_id), @MaxID = MAX(cnsmr_accnt_ar_log_id)
FROM [cnsmr_accnt_ar_log]
WHERE upsrt_dttm >= @Date1 AND upsrt_dttm < @Date2;

SELECT ....
FROM [cnsmr_accnt_ar_log] ar
INNER JOIN ....
WHERE upsrt_dttm >= @Date1 AND upsrt_dttm < @Date2 -- this way is more robust than BETWEEN with dateadd(ss, -1, ...)
  AND cnsmr_accnt_ar_log_id BETWEEN @MinID AND @MaxID;

The range selected by IDs might be slightly wider than the range selected by dates; therefore you need to keep the datetime selection condition in the WHERE clause. But it will be applied only to the small subset of rows within the ID range.

If you want to keep everything in a single query, you may use a derived table or a CTE:

SELECT ....
FROM dbo.[cnsmr_accnt_ar_log] ar
INNER JOIN (SELECT MinID = MIN(cnsmr_accnt_ar_log_id), MaxID = MAX(cnsmr_accnt_ar_log_id)
            FROM dbo.[cnsmr_accnt_ar_log]
            WHERE upsrt_dttm >= @Date1 AND upsrt_dttm < @Date2
           ) R ON ar.cnsmr_accnt_ar_log_id BETWEEN R.MinID AND R.MaxID
INNER JOIN ....
WHERE ar.upsrt_dttm >= @Date1 AND ar.upsrt_dttm < @Date2 -- this way is more robust than BETWEEN with dateadd(ss, -1, ...)
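A quick way to test the assumption that the IDs track upsrt_dttm (a hypothetical check, reusing the names from the scripts above) is to count how many rows fall inside the ID range but outside the date range:

SELECT COUNT(*) AS out_of_range_rows
FROM dbo.[cnsmr_accnt_ar_log]
WHERE cnsmr_accnt_ar_log_id BETWEEN @MinID AND @MaxID
  AND (upsrt_dttm < @Date1 OR upsrt_dttm >= @Date2);

If that count is small relative to the date-range row count, the trick should pay off.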

    _____________
    Code for TallyGenerator

  • Sergiy (2/23/2016)


    Loner (2/19/2016)


    As I said before, the database is created from the software company, I cannot create any index.

When a faulty part or a design flaw is discovered in a certain model of Toyota car, they recall the whole range of that model and fix the fault at their expense.

I wonder: why isn't the same approach applied to faulty designs in software?

Why doesn't that "software company" carry all the usual obligations that any supplier of any other kind of product would?

    I guess because the IT industry is still in its nappies.

    This is where the second meaning for "It Depends" comes from... every time I use third party software, I wish I were wearing nappies. 😀 Probably the source of the term "Crap Code", as well. :hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
