Query Performance - MIN vs Top 1

  • Hi guys

    Similar to a previous thread I started last week about query performance I need to figure out why 2 queries, designed to do the same thing, are both returning in or around the same time as the other. One is using an Index Scan, the other is using an Index seek so I was expecting the 2nd to be quick.

    The first takes in or around 51 seconds and returns 3.5m rows

    The second takes around 47 seconds.

    The queries and execution plans are attached along with indexes I've created on the table. If anyone can weigh in with any suggestions as to what else I can do to improve the performance please let me know.

    Note: I've already reorganized the indexes on the tables

  • mitzyturbo (10/18/2016)


    Hi guys

    Similar to a previous thread I started last week about query performance I need to figure out why 2 queries, designed to do the same thing, are both returning in or around the same time as the other. One is using an Index Scan, the other is using an Index seek so I was expecting the 2nd to be quick.

    The first takes in or around 51 seconds and returns 3.5m rows

    The second takes around 47 seconds.

    The queries and execution plans are attached along with indexes I've created on the table. If anyone can weigh in with any suggestions as to what else I can do to improve the performance please let me know.

    Note: I've already reorganized the indexes on the tables

    1) I do not see query plans. Just 3 text files and the second two don't show in my browser.

    2) If you are really returning 3.5M rows to the client you cannot expect quick performance. Declare variables of the proper type for your 3 fields and then SELECT the fields into those variables. A neat trick to just time server performance on large row counts since the data never leaves the server.

    3) There is nothing that says a very large iterative seek is going to be faster than a scan. In fact I can easily construct seek plans that will perform WAY WORSE than scan plans.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I attached the two exec plans (Query1 and 2) as txt files as the UI wouldn't allow me to attach XML files

    If I've done that wrong let me know.

    This is a sub-query that I need my main query to select off, I won't be returning all 3.5 million rows. Potentially they could select ALL records requiring the query to search against the full 3.5 million rows

  • Sorry Kevin, I've deleted those txt's and added the sqlplan files now

  • The plan that uses seeks is actually quite a bit less efficient than the one that uses a scan, for a couple reasons.

    First, the seek operator that is the outer input to the nested loop join (the one on IX_EntityLogUser) gets to seek to the start of the range of rows meeting the predicate, but then scans the range.

    That's cheaper than scanning the entire index, but in this case nearly half the rows in the table are being returned anyway, so it's not so dramatically cheaper as you would expect if you were thinking singleton seek vs full scan.

    See http://sqlblog.com/blogs/paul_white/archive/2011/02/17/so-is-it-a-seek-or-a-scan.aspx for a brief investigation of how the seek operator can be either a singleton seek or a seek+range scan.

    Second, for each of the 3.6 million rows returned by that operator, a seek is performed against the IX_LogDate index. Spinning up 3.6 million seeks is quite a bit more expensive than just scanning the table in this case.

    The only reason you're seeing roughly similar durations is that the inefficient plan described above is using parallelism, while the more efficient plan with the scan is running serially.

    If the inefficient plan stayed the same except for running serially, you'd see it take substantially longer than the plan using a single scan of the table.

    Cheers!

  • That makes sense Jacob

    I had the idea that the index on the logdate field would improve performance but since it's running a select against the whole table it makes little difference compared to a table scan

    Realistically it's unlikely that users running the report that this query will be used in will use such parameters I just need to be able to gauge just how long they could be looking at if they go with the full select.

  • I can't see your indexes, can you confirm that this is one of them? If not, create it and test with both queries.

    (entity, LogTxt, entityRowID) INCLUDE (LogDate, userName)

    There's a row_number version you might like to try too...

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Sorry Chris, forgot to attach the indexes

    Happy enough to go with the TOP 1 in a CROSS APPLY

    Thanks anyway

  • mitzyturbo (10/18/2016)


    Sorry Chris, forgot to attach the indexes

    Happy enough to go with the TOP 1 in a CROSS APPLY

    Thanks anyway

    Sure no probs - but do try that index 😉

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • mitzyturbo (10/18/2016)


    Sorry Chris, forgot to attach the indexes

    Happy enough to go with the TOP 1 in a CROSS APPLY

    Thanks anyway

    Here's the ROW_NUMBER version. I think the index I posted earlier would be good for any of these queries.

    Edit: Lost the query somewhere...

    SELECT entityRowID, userName, logDate

    FROM (

    SELECT entityRowID, userName, logDate,

    rn = ROW_NUMBER() OVER(PARTITION BY entityRowID ORDER BY logDate)

    FROM dbo.ActivityLog

    WHERE entity = 'INBOX'

    AND LogTxt = 'C'

    )b

    WHERE rn = 1

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (10/18/2016)


    mitzyturbo (10/18/2016)


    Sorry Chris, forgot to attach the indexes

    Happy enough to go with the TOP 1 in a CROSS APPLY

    Thanks anyway

    Here's the ROW_NUMBER version. I think the index I posted earlier would be good for any of these queries.

    Where?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (10/18/2016)


    ChrisM@Work (10/18/2016)


    mitzyturbo (10/18/2016)


    Sorry Chris, forgot to attach the indexes

    Happy enough to go with the TOP 1 in a CROSS APPLY

    Thanks anyway

    Here's the ROW_NUMBER version. I think the index I posted earlier would be good for any of these queries.

    Where?

    Thanks Phil, got interrupted twice whilst posting that!

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply