Query Performance

Question

Query Performance

BAGs

Right there with Babe

Points: 793
More actions
December 21, 2009 at 12:43 pm

#143695

Hello All..
Let me know if anyone feels this should be expected. The performance doesn't feel right to me.
Here is my actual SQL. (BTW, i'm testing so don't yell at me for the hard dates or inputs!!!!)
I hope i didn't make it difficult to read......i inserted **'s at the problem area.
When i run this without the sums, the query runs in 4-10 seconds.
With the sums it goes to 4 1/2 to 5 minutes. Both queries return the same number of rows and would be summing FROM the same number of records.
Does this seem right to all the smart people out there?? (I haven't made it to that list yet!)
Thanks for looking - Peter
======================================================
declare @FromDt as datetime
declare @ToDt as datetime
set @FromDt = '12/19/09'
set @ToDt = '12/19/09';
WITH ctTrans (TRANS_ID) AS
(
SELECT DISTINCT TRANS_ID FROM TRN.TRNDTL_SPR09 AS TD INNER JOIN
MSDM.MERCH.PDT_USRTL AS P ON TD.SKU_NUM = P.SKU_NUMBER WHERE P.PDT_CLASS_NUMBER = 965
AND TD.WK_END_DT BETWEEN @FromDt AND @ToDt
),
ctClass (TRANS_ID, PDT_ID, PDT_DSCR, PDT_UNITS) AS
(
SELECT
TD.TRANS_ID, P.PDT_CLASS_NUMBER AS PDT_ID, P.PDT_CLASS_DSCR AS PDT_DSCR, ****SUM(SLS_UNITS) AS PDT_UNITS****
FROM
TRN.TRNDTL_SPR09 AS TD INNER JOIN
MSDM.MERCH.PDT_USRTL AS P ON
TD.SKU_NUM = P.SKU_NUMBER
WHERE
TD.WK_END_DT BETWEEN @FromDt AND @ToDt AND PDT_DEPT_NUMBER NOT IN (46, 49)
GROUP BY
TD.TRANS_ID, P.PDT_CLASS_NUMBER, P.PDT_CLASS_DSCR
)
SELECT ctClass.PDT_ID, COUNT(ctClass.TRANS_ID) as ClassCount, ****SUM(ctClass.PDT_UNITS) as PDT_UNITS****
FROM ctClass INNER JOIN ctTrans ON
ctClass.Trans_ID = ctTrans.Trans_Id
GROUP BY ctClass.PDT_ID
ORDER BY COUNT(ctClass.TRANS_ID) DESC

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply

Jack Corbett SSC Guru Points: 184397 More actions · Answer 1

Looks like you are using a linked server. Are the summed columns on the linked server table? If so, I;d recommend doing this a bit differently so that as much processing is down remotely as possible. If you can make 1 linked server call that returns the data already summed (as close as you can get to the final results) and inserts into a temp table and then use the temp table to join to your local table. I'd be willing to bet you see a vast improvement.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

BAGs Right there with Babe Points: 793 More actions · Answer 2

Another comment from me:

I am just starting to use CTE's. I find they are easier to manage and perform better. This query started as a table joinded to a query with another query in my where clause.

When i run the query in that format, it runs closer to 2 minutes vs. the 4-5.

I guess my real question is - From experience, CTE's -OR- subqueries and joined queries??

And I'm sure the answer - It depends!!!!

Thanks again.

BAGs Right there with Babe Points: 793 More actions · Answer 3

Hi Jack

No, not a linked server. Tables live directly in a database on the SQL Server.

Check out my comment. Looks like we posted about the same time.

I'm migrating to common table expressions and away from temp tables and subqueries. This just might be one of those instances where the CTE is less effecient.

Jack Corbett SSC Guru Points: 184397 More actions · Answer 4

Oops my bad, I saw interpreted the underscore in the second table name as a period and thought I saw 4-part naming.

Can you post execution plans (graphical)? That will help.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 5

Please post table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

BAGs Right there with Babe Points: 793 More actions · Answer 6

i have not posted a plan before.....what's the best way to do that??

BAGs Right there with Babe Points: 793 More actions · Answer 7

Just saw Gila's

I'll follow the instructions and post.

BAGs Right there with Babe Points: 793 More actions · Answer 8

table: TRN.TRNDTL_SPR09

trans_id

loc_num

sku_num

sls_units

sls_amt

mgn_amt

cld_dt

wk_end_dt

PK- trans_id, sku_num

IDX - one each on: trans_id; cld_dt; loc_num; sku_num; wk_end_dt;

execution plan attached.

Jack Corbett SSC Guru Points: 184397 More actions · Answer 9

Thanks, but you have still left us with incomplete information as you only included the schema for one of the tables involved in the query. It would also be helpful to have the query plan form the fast query as well. Here are a few comments on what you provided thus far:

1. Does this

IDX - one each on: trans_id; cld_dt; loc_num; sku_num; wk_end_dt;

mean that you 5 single column indexes? Those are typically not the most useful indexes.

2. From the query plan I see that the actual rows vs. estimated rows for the first Index Seek on TRDTL_SPR09 is off by tens of millions which means you probably should update statistics on that table.

3. From the query plan I see a lot of parallelism so you may want to try OPTION (MaxDop 1) to force it to not be parallelized. Comment 2 may help with this as well.

4. The optimizer is suggesting an index on WK_END_DT with included columns TRANS_ID, SKU_NUM, SLS_UNITS to make an index that will cover that query. I don't know that you should definitely create it, but I do think you need to re-examine your indexes.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

J-F Bergeron SSCrazy Eights Points: 8593 More actions · Answer 10

I second what Jack said on the updating of the statistics, that would certainly be your first step.

I would also look how the join is made on the table : TRDTL_SPR09, it seems most of the execution plan is wasted there, and maybe if you could aim for a covering index, or change your join to do a clustered seek, that would certainly be your best bet.

Most of the tables you are playing with are pretty big, but this 1 is simply huge, so it would help not having to completely scan it.

Cheers,

J-F

BAGs Right there with Babe Points: 793 More actions · Answer 11

Hi

#1 - thanks for the input so far guys.

#2 - as i'm sure you figured out, i am far from a dba, and i don't even play a good one on tv. I have enough knowledge to be dangerous.

Now for some answers. Yes, 5 single field indexes. I did that because that table is typically searched on any 5 of those fields frequently. Which brings a question from me - If there is a more frequent search including 2 of the fields say sku_num & wk_end_dt, should i create that index??

The table is HUGE. Sales transaction table. I append once a week to it on a sunday night. I then rebuild all the indexes (which takes about 5 hours over night). ANother question - does rebuilding the index rebuild statistics?? I use ALTER INDEX etc etc etc.

I didnt include the data on the product table for the reason its pretty simple.....about 200k rows, sku_num (PK), then some item descriptive data. No other indexes.

I don't have the execution plan available to me at this moment for the faster running query as i'm checking my emails from home now.

I'm not familiar with the term covering index, but i will certainly read up on it.

thanks again.

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 12

You can start with this series on indexes, it may help

http://www.sqlservercentral.com/articles/Indexing/68439/

If you have queries often filtering on two columns then you really should add an index on those two columns. It doesn't have to be a new one, you could widen an existing index by adding another column to it.

Rebuilding an index updates the stats on that index. Reorganising does not. It's only the stats on that index, if there are column stats they may need manual updates (or they may not, hard to tell sometimes)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 13

There are a couple nasty table scans in there. Can you add this index and then post the revised execution plan?

CREATE NONCLUSTERED INDEX idx_TRNDTLSPR09_WKENDDT

ON [TRN].[TRNDTL_SPR09] ([WK_END_DT])

INCLUDE ([TRANS_ID],[SKU_NUM],[SLS_UNITS])

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

BAGs Right there with Babe Points: 793 More actions · Answer 14

Now that i understand the communication better, let me give the scenario the way i should have started.

TABLE: DETAIL SALES TRANSACTIONS

SKU_NUM - PK & single column index

TRANS_ID - PK & single column index

LOC_NUM - single idx

CLD_DT - single idx

WK_END_DT - single idx

UNITS

SALES

MARGIN

I will attach a text file that is one of the most run queries against this table.

My thinking (right or wrong) on the individual indexes was that those are the most searched columns. Now i'm thinking i should have something different.

You should see from the query that SKU, WK_END & TRANS_ID are in the WHERE & JOINS most of the time. So if i search for transactions that contain a certain SKU should one of my indexes be SKU with an INCLUDE on WK_END & TRANS_ID or should the index be all 3 or both.

I know an execution plan would probably give you the answer, but i guess from experience first what would you think.

Also, what started all of this for me was that i started using CTE's over subqueries because i started getting better performance. This particular query runs faster in the attached format.

THANK YOU ALL FOR THE FEEDBACK