Particular tables cause query performance problems

  • I'm part of a small startup developing a web application that includes a reporting component where we can build reports graphically, which get translated into SQL select statements. In true entrepreneurial fashion, i'm in over my head in terms of DBA responsibilities.

    On two separate occasions, we have had reports that, when executed, caused the SQL Server process to peg the CPU and the query to take a loong time to complete (20 or 30 minutes). For both of these situations, removing references to 2 particular tables (which are related with a foreign key) caused both queries to speed up to almost instantaneous execution.

    I'm wondering what about these 2 tables could be causing this sort of slow-down, and if there's something in the design of the tables and database that would be an obvious suspect and candidate for change. When looking at the query plan for one of the queries, 92% of the Cost is on a Clustered Index Scan of 1 of these 2 tables - i'm not exactly sure what this means, or recommends.

    Any enlightenment, guidance, or direction would be very much appreciated - thanks!

    The server we're working with here looks like:

    Windows Server 2003, Standard Edition, Service Pack 2

    AMD Athlon 64 X2 Dual Core Processor 4600+ 2.41GHz

    2GB RAM

    SQL Server 2005, Service Pack 2

  • Can you post the queries, the table schemas, the index definitions and the execution plans with that table as part of the query? (save as a .sqlplan file, zip and attach)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It's a good thing i treasure humility, because i have a feeling this post will provide aplenty for me 😀

    Ok, so i'm attaching a zip file, containing:

    - ExecutionPlan.sqlplan

    - PurchaseTransactions.sql (table definition for primary key participant of the problematic duo)

    - PurchaseTransactionItems.sql (table definition for the foreign key participant of the problematic duo)

    - Query.sql (the query itself)

    - ExecutingSuspended.txt (the results of a query i did against sys.dm_exec_requests while the query was suspended)

    There are NO indexes on these two tables beyond the primary keys - i'm also planning on a post regarding strategic index planning.. also, not sure if these will come into play, but the "ASC_"-prefixed columns are leftover from a data migration that is now complete, and the plan is to get rid of them.

    Thanks so much!

  • I'll check tomorrow, if someone hasn't got there first.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I'm sorry I don't have a lot of time to look into this for you, but at first blush:

    This was obviously built by a some kind of query builder, and the queries built by those things ... are often not very good in terms of efficiency, esp. the way they 'pop out', i.e. if you try to use them 'verbatim'

    Here's a few quick tips:

    Left Outer Joins are slower than inner joins, so don't use them if you don't have to.

    The OR predicate is used so liberally in your query, and oftentimes all it takes is one really poorly placed (usually it's inappropriate to begin with) OR statement to bring a query to a crawl. You have LOTS of places that this could be the case.

    Anytime you have tables with suffixes like _1, _2, _3, etc, 99% of the time its indicative of a poor data structure. Usually, you want to have one table, with a 'flag' type of field that makes the differentiation between rows that are now in each of the tables (ON EDIT: looks like those are aliases ... may not be a problem)

    I'm not positive, but there's probably a lot of places in your where clause where you should be using a function called COALESCE. Check it out in the BOL.

    You have what appears to be a number of criteria that basically look like this:

    (([xModules_eCommerce_Inventory].[AccountID] = 1) OR ([xModules_eCommerce_Inventory].[AccountID] IS NULL))

    HOWEVER, any field you have in your db that is an ID field most likely cannot be NULL, correct? If that's the case, there's a good chance you should get rid of every check like this one in this query.

    Basically, every field that is part of a join (apart from those which are PK's or have established FK's, cause then indexes get built automatically) should have an index on it, especially if the values in those fields vary greatly. Same is true of fields that are used as criteria in your where clauses.

    Lastly, although this is not likely any big deal, the following:

    (NOT (UPPER([ParentCategoryID_xModules_eCommerce_ItemCategories_4].[Code])='99') AND NOT (UPPER([ParentCategoryID_xModules_eCommerce_ItemCategories_4].[Code])='96') AND NOT (UPPER([ParentCategoryID_xModules_eCommerce_ItemCategories_4].[Code])='60'))

    Should be something like (where) UPPER([ParentCategoryID_xModules_eCommerce_ItemCategories_4].[Code] NOT IN ('99', '96', '60')

    Oh, one other thing I'd do: Get rid of your WHERE clause altogether, then wrap the remainder in a count statement, and see how long it takes to run. Then add the where clause back in, and see how long that takes to run. This should give you a decent place to start ... is the problem your joins, or is it your where clause?

    To wrap it in a count, do this

    Select Count(a.*) from (YOUR QUERY HERE) a

  • Alright, I decided to invest a bit more time, and took a closer look at the execution plan for you. I'm about 99% sure the problem lies somewhere in one of these two 'pieces':

    I)

    LEFT OUTER JOIN [xModules_eCommerce_PurchaseTransactionItems] AS [ItemID_xModules_eCommerce_PurchaseTransactionItems_3]

    ON

    [ItemID_xModules_eCommerce_PurchaseTransactionItems_3].[ItemID]=[ItemID_xModules_eCommerce_Items_Descriptive_2].[ItemID]

    II)

    WHERE

    NOT

    [ItemID_xModules_eCommerce_PurchaseTransactionItems_3].[PurchaseTransactionItemID] IS NULL

    AND

    (

    [xModules_eCommerce_Inventory].[InventoryID] IS NULL

    OR[InventoryID_xModules_eCommerce_InventoryItems_1].[ItemID] IS NULL

    OR[ItemID_xModules_eCommerce_Items_Descriptive_2].[ItemID] IS NULL

    OR[ItemID_xModules_eCommerce_PurchaseTransactionItems_3].[AccountID] = 1

    OR[ItemID_xModules_eCommerce_PurchaseTransactionItems_3].[AccountID] IS NULL

    )

    That particular where clause ... is a nightmare, to begin with.

    Also, it looks to me like what you're looking for is like a 1 row 'answer', correct? Well, according to (how I'm reading) the execution plan, that join in I) above is ballooning your result set by like 700,000+ X that 1 row you want back.

    So:

    1) You don't even use that descriptive_2 table in your Select statement. Thus, you probably don't need that whole join.

    2) If you REALLY do need that join, you probably actually want an inner join.

    3) Once you do the inner join, you may not even need that where clause at all.

    4) A heck of a lot of the excess complexity of this query is likely being caused by the outer joins. All those IS NULL and OR's ... the probably would not have to be there if you did inner joins.

    If you're looking for one (or even 'a few') row(s) based on specific criteria, 99% of the time, there should not be any outer joins in your query.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply