VERY STRANGE QUERY

  • Hi all,

    I have a query, and before I start posting schemas and samples etc., I was wondering if anyone has come across this situation?

    I have a query.

    When it's joined to a view, the performance is the same.

    When I return one column from the view in my main query, it slows down like a dog.

    However, if I put SELECT TOP 1000 (or any number greater than what my results return) on it, the query returns to normal speed !?!?
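
    To give the shape of it, roughly (the table, view, and column names below are placeholders, not my real schema):

        -- Fast: joined to the view, but returning no view columns
        SELECT m.ClaimID, m.ClaimDate
        FROM dbo.Claims AS m
        JOIN dbo.vClaimDetails AS v ON v.ClaimID = m.ClaimID;

        -- Slow: the same join, returning one column from the view
        SELECT m.ClaimID, m.ClaimDate, v.DetailValue
        FROM dbo.Claims AS m
        JOIN dbo.vClaimDetails AS v ON v.ClaimID = m.ClaimID;

        -- Fast again: TOP with any number larger than the full result set
        SELECT TOP 1000 m.ClaimID, m.ClaimDate, v.DetailValue
        FROM dbo.Claims AS m
        JOIN dbo.vClaimDetails AS v ON v.ClaimID = m.ClaimID;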

    Does this mean something obvious is wrong with my query?

    What should I be looking for?

    Thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    SQL-4-Life
  • There have to be some major differences in the query plan when returning data from the view. I'm surprised that the TOP operator speeds things up, as according to Inside SQL Server 2005: T-SQL Querying it is an extra step in the processing AFTER pulling all the data; TOP is the last operator applied.

  • Thanks Jack,

    That's exactly what I thought.

    Which is why I thought I would ask, but just so you know, it's 10 times faster with the TOP clause in place.

    I'll keep looking into the view and post if I find anything useful 🙂

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    SQL-4-Life
  • Post the execution plans; I'd be interested in seeing how SQL Server is pulling this off!

  • What's the best way to post that in here?

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    SQL-4-Life
  • Hi Jack,

    Here is a spreadsheet; let me know if you see anything...

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    SQL-4-Life
  • Sorry, I did not see the earlier post where you asked the best way to post the execution plans. My preference is to save the graphical execution plans in SSMS, zip them, and attach them, mainly because I find the graphical plans easier to follow. What you have posted works, though. I am re-attaching it with the areas that differ highlighted. For some reason (fragmentation would be my best guess), you are getting a lot fewer reads on [sorcerer].[dbo].[_69Claim] by processing all the rows. I'm really not an expert at dissecting query plans, but all those extra reads are killing you. I'd look at the indexing on that table to see if I could create a non-clustered index that would eliminate the clustered index scan.
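
    If it's easier than the spreadsheet, you can also capture the plan straight from a query window; this is standard T-SQL, nothing specific to your setup. The index below is only a sketch, since I'm guessing at the key and included columns:

        -- Returns the actual XML execution plan along with the results
        SET STATISTICS XML ON;
        -- ... run the slow query here ...
        SET STATISTICS XML OFF;

        -- A covering non-clustered index along these lines might remove the
        -- clustered index scan (the column names are guesses on my part)
        CREATE NONCLUSTERED INDEX IX_69Claim_Covering
            ON [sorcerer].[dbo].[_69Claim] (ClaimID)
            INCLUDE (ClaimDate);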

  • Thanks Jack,

    The strange thing is this.

    In the fast sheet, the operators you have highlighted in yellow are all scans.

    In the slow sheet, the ones you have highlighted in yellow are all seeks.

    That's even stranger, because seeks should be faster than scans; I'm pretty sure of that.

    I'll keep at the indexes and see what I find.

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    SQL-4-Life
  • That's why I'm wondering about fragmentation. If the index is fragmented, then seeking makes you jump all over the place in the chain, whereas scanning reads as many pages in a row as possible before jumping around.
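
    If you want to check, the physical stats DMV (SQL Server 2005 and up) will show the fragmentation level; swap in the real table name and run it in that database:

        -- Average fragmentation per index on the suspect table
        SELECT i.name, ips.avg_fragmentation_in_percent, ips.page_count
        FROM sys.dm_db_index_physical_stats(
                 DB_ID(), OBJECT_ID('dbo._69Claim'), NULL, NULL, 'LIMITED') AS ips
        JOIN sys.indexes AS i
            ON i.[object_id] = ips.[object_id] AND i.index_id = ips.index_id;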

  • Sounds strangely familiar. You probably have a clustered index on an ascending value (datetime, identity, or whatever). On a large table, stats on these indexes are flagged for update only when the formula threshold is reached (500 changes to the column plus 20% of the table's total row count). Unfortunately this formula can't be modified.

    20% of a large table can be quite a lot, which means new values can fall outside of the statistics histogram while the stat still isn't flagged for update (it's not considered out of date). When you use operators like TOP, the optimizer specifically uses these histograms to determine the query plan.

    Chances are, whatever is included in your TOP clause just so happens to be included in the histogram.

    Update all the statistics on the table and try your query again using the default sample rate. If that doesn't work, use a 50 percent sample and try again.
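
    Something along these lines, using the table name from your plan (adjust to suit):

        -- Default sample rate first
        UPDATE STATISTICS dbo._69Claim;

        -- If the plan is still bad, sample half the table, or scan all of it
        UPDATE STATISTICS dbo._69Claim WITH SAMPLE 50 PERCENT;
        -- UPDATE STATISTICS dbo._69Claim WITH FULLSCAN;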

  • You guys are my heroes 🙂

    OK, I updated the stats on my tables, and this little baby is humming like a beautiful tropical bird 🙂

    OK, so I guess this means I am forced to ask for resources to read about statistics,

    as I need a few questions answered:

    1- Why did auto stats updates not take care of this? How much of a problem will this be in the future?

    2- Is it bad to manually update stats after major bulk inserts/updates?

    3- In my company we do loads of bulk inserts and updates. Should we in fact leave SQL Server to look after the stats, or should we update them after these inserts/updates?

    Thanks again for all your help 🙂

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    SQL-4-Life
  • I think Jeremy's post really answers your first question: the formula threshold for the auto-update was probably not being reached. As far as your next two questions go, I don't have a lot of experience with bulk loads, but what I have read is that, especially when you are emptying and then re-loading a table, it may work best to drop the indexes, load the data, and then add the indexes back. If you aren't doing a purge/load cycle, but are adding a lot of records and doing it off hours, I don't think it would hurt to update the stats and maybe even rebuild the indexes (to reduce fragmentation). An index rebuild will update the stats as well.
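
    For an off-hours load, something like this afterwards would cover both; the table name is just the one from your plan:

        -- Rebuild all indexes on the table; this also refreshes the
        -- statistics on those indexes
        ALTER INDEX ALL ON dbo._69Claim REBUILD;

        -- Or, if a full rebuild is too heavy, just refresh the statistics
        UPDATE STATISTICS dbo._69Claim WITH FULLSCAN;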

  • Thanks Jack, that is a great help.

    I'm going to start reading up on index fragmentation now 😉

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    SQL-4-Life
  • Yes, basically the deal with auto-update stats is that as your table gets bigger, the number of column changes required to trigger the auto-update gets larger as well. Without getting into specifics, the general formula SQL Server uses to trigger an automatic update of the stats is 500 changes to the column plus 20% of the total row count.

    This means that you probably will have to issue an UPDATE STATISTICS command manually after a large insert on a table with many records, even with auto-update turned on, so that the records you just inserted are included in the statistics. 500 + 20% of the total row count on a very large table can be a really high number, probably a lot higher than the number of records you just inserted or modified.

    When you update stats, SQL Server samples your table data in the order it is stored on disk. So if you have an ascending clustered index on something like a date field, and you insert a bunch of "new" records with "new" date values, SQL Server puts those records at the "end" of the table on disk, and the stats give the optimizer no clue whether your data can be located by that clustered index, because there are no samples of it within the histogram (high range).
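
    You can see this for yourself; note that the index name in the DBCC call below is a placeholder, so use whatever your clustered index is actually called:

        -- When were the statistics on the table last updated?
        SELECT s.name, STATS_DATE(s.[object_id], s.stats_id) AS last_updated
        FROM sys.stats AS s
        WHERE s.[object_id] = OBJECT_ID('dbo._69Claim');

        -- The histogram itself: the highest RANGE_HI_KEY is where the sampled
        -- range ends, and rows inserted after that fall outside it
        DBCC SHOW_STATISTICS ('dbo._69Claim', IX_YourClusteredIndex);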
