Tuning a query that takes 60 minutes to run,

Question

Tuning a query that takes 60 minutes to run,

Viewing 4 posts - 31 through 33 (of 33 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

rajsin7786 (8/4/2014)
sure, just finished running the exec sp_updatestat, all stats are updated on this DB now. Running the SQL now, will keep you posted.

No, not sp_updatestats (which does a sampled stats update of every single table), an UPDATE STATISTICS, with the FULLSCAN option on the sales fact table (and probably worth doing on all other tables in the query too, to fix any damage that sp_updatestats could have done)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 2

-- The plan doesn't seem to work at all well with the actual rowcounts, probably due to out-of-date stats as others have suggested.

-- You can probably eliminate the 500million row range scan by including the columns it's picking up in an adjusted version of index idx_sales_prod_2

-- and also eliminate the key lookup along the way - which is almost certainly a lot more expensive than it looks.

-- You MAY get a performance lift from making the date arithmetic a little more obvious too.

CREATE INDEX idx_sales_prod_2 ON sales_fact (product_id, date_id) INCLUDE (trans_ref, LoyaltyNo, qty, totalPriceExVat, company_margin, wastageval, multibuy_id)

SELECT

sales_fact.trans_ref,

product_dim.pd_key,

product_dim.pd_desc,

multibuy_dim.mb_deal_number,

sales_fact.LoyaltyNo,

sum(sales_fact.qty),

sum(sales_fact.totalPriceExVat),

( sum(sales_fact.company_margin) ) - isnull((( sum(sales_fact.wastageval) )),0)

FROM multibuy_dim

INNER JOIN sales_fact ON (sales_fact.multibuy_id=multibuy_dim.mb_id)

INNER JOIN product_dim ON (sales_fact.product_id=product_dim.pd_id)

CROSS APPLY (

SELECT

StartDate = DATEADD(WEEK, -3, MIN(date)),

EndDate = MAX(date)

FROM date_dim

WHERE last_week_prior_flag = 'Y'

) x

INNER JOIN date_dim ON date_dim.date_id=sales_fact.date_id

AND date_dim.date BETWEEN x.StartDate AND x.EndDate

WHERE

(

product_dim.pd_key IN ('481005', '850972', '860091', '860101', '860110', '860130', '860159', '860161', '860211', '860224', '860225', '860230', '860265', '860319', '860320', '860344', '860360',

'860407', '860414', '860415', '860418', '860469', '860478', '871005', '874115', '874206', '880100', '881153', '890061', '890299', '890360', '890648', '890650', '890651', '891903')

AND

(

multibuy_dim.mb_deal_number = '0000'

OR

multibuy_dim.mb_deal_number Is Null

)

GROUP BY

sales_fact.trans_ref,

product_dim.pd_key,

product_dim.pd_desc,

multibuy_dim.mb_deal_number,

sales_fact.LoyaltyNo

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

rajsin7786 Right there with Babe Points: 718 More actions · Answer 3

Hi guys,

thanks for all your help, i have updated statistics on all the columns, I ran the query suggested by ChrisM and also the original query , both now run in 25 minutes. There is definitely a performance improvement. i have attached the actual execution plan

Hi chrism,

could you please tell me how to go about the below suggestions,

-- You can probably eliminate the 500million row range scan by including the columns it's picking up in an adjusted version of index idx_sales_prod_2

-- and also eliminate the key lookup along the way - which is almost certainly a lot more expensive than it looks.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

rajsin7786 (8/11/2014)
Hi guys,
thanks for all your help, i have updated statistics on all the columns, I ran the query suggested by ChrisM and also the original query , both now run in 25 minutes. There is definitely a performance improvement. i have attached the actual execution plan
Hi chrism,
could you please tell me how to go about the below suggestions,
-- You can probably eliminate the 500million row range scan by including the columns it's picking up in an adjusted version of index idx_sales_prod_2
-- and also eliminate the key lookup along the way - which is almost certainly a lot more expensive than it looks.

The definition of the index I suggest you create is in my last post. The huge difference between actual and estimated rows suggests that statistics were not updated. Note also that SQL Server timed out before picking a good plan, and it does look horrible since it reads the sales table three times - including 500,000,000 index seeks. Try creating the index, updating statistics for all tables involved except sales (the new index will do) and running the two queries again.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden