Query Performance

  • Hello All,

    Let me know if anyone thinks this is to be expected; the performance doesn't feel right to me.

    Here is my actual SQL. (BTW, I'm testing, so don't yell at me about the hard-coded dates or inputs!)

    I hope I didn't make it difficult to read; I inserted **'s at the problem areas.

    When I run this without the sums, the query runs in 4-10 seconds.

    With the sums it goes up to 4 1/2 to 5 minutes. Both queries return the same number of rows and would be summing from the same number of records.

    Does this seem right to all the smart people out there? (I haven't made it to that list yet!)

    Thanks for looking - Peter

    ======================================================

    declare @FromDt as datetime
    declare @ToDt as datetime
    set @FromDt = '12/19/09'
    set @ToDt = '12/19/09';

    WITH ctTrans (TRANS_ID) AS
    (
        SELECT DISTINCT TRANS_ID
        FROM TRN.TRNDTL_SPR09 AS TD
        INNER JOIN MSDM.MERCH.PDT_USRTL AS P ON TD.SKU_NUM = P.SKU_NUMBER
        WHERE P.PDT_CLASS_NUMBER = 965
          AND TD.WK_END_DT BETWEEN @FromDt AND @ToDt
    ),
    ctClass (TRANS_ID, PDT_ID, PDT_DSCR, PDT_UNITS) AS
    (
        SELECT TD.TRANS_ID, P.PDT_CLASS_NUMBER AS PDT_ID, P.PDT_CLASS_DSCR AS PDT_DSCR,
               ****SUM(SLS_UNITS) AS PDT_UNITS****
        FROM TRN.TRNDTL_SPR09 AS TD
        INNER JOIN MSDM.MERCH.PDT_USRTL AS P ON TD.SKU_NUM = P.SKU_NUMBER
        WHERE TD.WK_END_DT BETWEEN @FromDt AND @ToDt
          AND PDT_DEPT_NUMBER NOT IN (46, 49)
        GROUP BY TD.TRANS_ID, P.PDT_CLASS_NUMBER, P.PDT_CLASS_DSCR
    )
    SELECT ctClass.PDT_ID, COUNT(ctClass.TRANS_ID) AS ClassCount,
           ****SUM(ctClass.PDT_UNITS) AS PDT_UNITS****
    FROM ctClass
    INNER JOIN ctTrans ON ctClass.Trans_ID = ctTrans.Trans_Id
    GROUP BY ctClass.PDT_ID
    ORDER BY COUNT(ctClass.TRANS_ID) DESC

  • Looks like you are using a linked server. Are the summed columns on the linked server table? If so, I'd recommend doing this a bit differently so that as much of the processing as possible is done remotely. If you can make one linked server call that returns the data already summed (as close as you can get to the final results) and inserts it into a temp table, you can then join that temp table to your local table. I'd be willing to bet you see a vast improvement.
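
    A rough sketch of that pattern (the linked server name, remote table and filter below are placeholders, not your actual objects):

    -- one remote call that returns pre-aggregated rows into a temp table
    SELECT TRANS_ID, PDT_UNITS
    INTO #RemoteSums
    FROM OPENQUERY(RemoteSrv,
         'SELECT TRANS_ID, SUM(SLS_UNITS) AS PDT_UNITS
          FROM dbo.SomeRemoteTable
          WHERE WK_END_DT = ''20091219''
          GROUP BY TRANS_ID');

    -- then join #RemoteSums to the local tables instead of summing across the link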

  • Another comment from me:

    I am just starting to use CTEs. I find they are easier to manage and usually perform better. This query started as a table joined to a subquery, with another subquery in my WHERE clause.

    When I run the query in that format, it runs closer to 2 minutes vs. the 4-5.

    I guess my real question is: from experience, CTEs, or subqueries and joined queries?

    And I'm sure the answer is: it depends!

    Thanks again.

  • Hi Jack

    No, not a linked server. Tables live directly in a database on the SQL Server.

    Check out my comment. Looks like we posted about the same time.

    I'm migrating to common table expressions and away from temp tables and subqueries. This just might be one of those instances where the CTE is less efficient.
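
    For comparison, the temp-table version I could fall back to would look roughly like this (an untested sketch using the same tables as the query above):

    -- materialize the transaction list once instead of re-evaluating the CTE
    SELECT DISTINCT TD.TRANS_ID
    INTO #ctTrans
    FROM TRN.TRNDTL_SPR09 AS TD
    INNER JOIN MSDM.MERCH.PDT_USRTL AS P ON TD.SKU_NUM = P.SKU_NUMBER
    WHERE P.PDT_CLASS_NUMBER = 965
      AND TD.WK_END_DT BETWEEN @FromDt AND @ToDt;

    -- then join #ctTrans in place of the ctTrans CTE in the main query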

  • Oops, my bad. I interpreted the underscore in the second table name as a period and thought I saw 4-part naming.

    Can you post execution plans (graphical)? That will help.

  • Please post table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I have not posted a plan before. What's the best way to do that?

  • Just saw Gila's post.

    I'll follow the instructions and post.

  • table: TRN.TRNDTL_SPR09

    columns: trans_id, loc_num, sku_num, sls_units, sls_amt, mgn_amt, cld_dt, wk_end_dt

    PK - trans_id, sku_num

    IDX - one each on: trans_id; cld_dt; loc_num; sku_num; wk_end_dt

    Execution plan attached.

  • Thanks, but you have still left us with incomplete information, as you only included the schema for one of the tables involved in the query. It would also be helpful to have the query plan from the fast query as well. Here are a few comments on what you have provided so far (a sketch of items 2 and 3 follows this list):

    1. Does this

    IDX - one each on: trans_id; cld_dt; loc_num; sku_num; wk_end_dt

    mean that you have 5 single-column indexes? Those are typically not the most useful indexes.

    2. From the query plan I see that the actual rows vs. estimated rows for the first Index Seek on TRNDTL_SPR09 is off by tens of millions, which means you probably should update the statistics on that table.

    3. From the query plan I see a lot of parallelism, so you may want to try OPTION (MAXDOP 1) to keep the query from being parallelized. Fixing item 2 may help with this as well.

    4. The optimizer is suggesting an index on WK_END_DT with included columns TRANS_ID, SKU_NUM, SLS_UNITS to make an index that will cover this query. I don't know that you should definitely create it, but I do think you need to re-examine your indexes.
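
    A minimal sketch of items 2 and 3 (statement forms only; adjust the scan option and placement to your environment):

    -- item 2: refresh statistics on the big transaction table
    UPDATE STATISTICS TRN.TRNDTL_SPR09 WITH FULLSCAN;

    -- item 3: re-run the posted query with a MAXDOP hint added as its final clause:
    --     ORDER BY COUNT(ctClass.TRANS_ID) DESC
    --     OPTION (MAXDOP 1);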

  • I second what Jack said on updating the statistics; that would certainly be your first step.

    I would also look at how the join is made on the table TRNDTL_SPR09. It seems most of the execution plan cost is spent there, and if you could aim for a covering index, or change your join to use a clustered index seek, that would certainly be your best bet.

    Most of the tables you are playing with are pretty big, but this one is simply huge, so it would help not having to scan it completely.

    Cheers,

    J-F

  • Hi

    #1 - Thanks for the input so far, guys.

    #2 - As I'm sure you figured out, I am far from a DBA, and I don't even play a good one on TV. I have just enough knowledge to be dangerous.

    Now for some answers. Yes, 5 single-column indexes. I did that because that table is frequently searched on any of those 5 fields. Which brings a question from me: if there is a frequent search that includes 2 of the fields, say sku_num & wk_end_dt, should I create an index on that pair?

    The table is HUGE. It's a sales transaction table. I append to it once a week on a Sunday night and then rebuild all the indexes (which takes about 5 hours overnight). Another question: does rebuilding an index rebuild the statistics? I use ALTER INDEX etc etc etc.

    I didn't include the details on the product table because it's pretty simple: about 200k rows, sku_num (PK), then some descriptive item data. No other indexes.

    I don't have the execution plan for the faster-running query available at this moment, as I'm checking my email from home.

    I'm not familiar with the term covering index, but I will certainly read up on it.

    Thanks again.

  • You can start with this series on indexes; it may help:

    http://www.sqlservercentral.com/articles/Indexing/68439/

    If you have queries often filtering on two columns, then you really should add an index on those two columns. It doesn't have to be a new one; you could widen an existing index by adding another column to it.

    Rebuilding an index updates the stats on that index; reorganising does not. It's only the stats on that index, though. If there are column stats, they may need manual updates (or they may not; it's hard to tell sometimes).
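
    For illustration, a rough sketch using the table from earlier in the thread (the exact options are up to you):

    -- rebuilding refreshes the statistics on the rebuilt indexes
    ALTER INDEX ALL ON TRN.TRNDTL_SPR09 REBUILD;

    -- reorganising does not touch statistics
    ALTER INDEX ALL ON TRN.TRNDTL_SPR09 REORGANIZE;

    -- column (non-index) statistics can be refreshed separately if needed
    UPDATE STATISTICS TRN.TRNDTL_SPR09 WITH COLUMNS;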

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • There are a couple of nasty table scans in there. Can you add this index and then post the revised execution plan?

    CREATE NONCLUSTERED INDEX idx_TRNDTLSPR09_WKENDDT
    ON [TRN].[TRNDTL_SPR09] ([WK_END_DT])
    INCLUDE ([TRANS_ID], [SKU_NUM], [SLS_UNITS])

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Now that I understand the communication better, let me lay out the scenario the way I should have from the start.

    TABLE: DETAIL SALES TRANSACTIONS

    SKU_NUM - PK & single-column index
    TRANS_ID - PK & single-column index
    LOC_NUM - single-column index
    CLD_DT - single-column index
    WK_END_DT - single-column index
    UNITS
    SALES
    MARGIN

    I will attach a text file containing one of the most frequently run queries against this table.

    My thinking (right or wrong) on the individual indexes was that those are the most-searched columns. Now I'm thinking I should have something different.

    You should see from the query that SKU, WK_END & TRANS_ID are in the WHERE clauses & JOINs most of the time. So if I search for transactions that contain a certain SKU, should one of my indexes be SKU with an INCLUDE on WK_END & TRANS_ID, or should the index key be all 3 columns, or should I have both? (The two shapes I mean are sketched below.)

    I know an execution plan would probably give you the answer, but from experience, what would you think first?
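
    To make the question concrete, the two index shapes I'm asking about would look something like this (the index names are made up for the example):

    -- option A: single key column, other columns included at the leaf level
    CREATE NONCLUSTERED INDEX idx_SalesDtl_Sku_Incl
    ON TRN.TRNDTL_SPR09 (SKU_NUM)
    INCLUDE (WK_END_DT, TRANS_ID);

    -- option B: all three columns in the index key
    CREATE NONCLUSTERED INDEX idx_SalesDtl_Sku_WkEnd_Trans
    ON TRN.TRNDTL_SPR09 (SKU_NUM, WK_END_DT, TRANS_ID);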

    Also, what started all of this for me was that I began using CTEs over subqueries because I was getting better performance. This particular query runs faster in the attached format.

    THANK YOU ALL FOR THE FEEDBACK
