July 25, 2014 at 8:37 am
Hello,
I have a SQL query that takes 60 minutes to complete, the query is selecting data from 4 tables, out of which one is a fact table having 1.5 billion records. I am pasting the query below, one thing i noticed was, if i remove the product_dim table from the SQL, then it runs in 8 minutes . Would be great if someone could advise on this, thanks in advance.
SQL taking 50 minutes,
SELECT
Live.dbo.sales_fact.trans_ref,
Live.dbo.product_dim.pd_key,
Live.dbo.product_dim.pd_desc,
Live.dbo.multibuy_dim.mb_deal_number,
Live.dbo.sales_fact.LoyaltyNo,
sum(Live.dbo.sales_fact.qty),
sum(Live.dbo.sales_fact.totalPriceExVat),
( sum(Live.dbo.sales_fact.company_margin) ) - isnull((( sum(Live.dbo.sales_fact.wastageval) )),0)
FROM
Live.dbo.multibuy_dim INNER JOIN Live.dbo.sales_fact ON (Live.dbo.sales_fact.multibuy_id=Live.dbo.multibuy_dim.mb_id)
INNER JOIN Live.dbo.product_dim ON (Live.dbo.sales_fact.product_id=Live.dbo.product_dim.pd_id)
INNER JOIN Live.dbo.date_dim ON (Live.dbo.date_dim.date_id=Live.dbo.sales_fact.date_id)
WHERE
(
Live.dbo.product_dim.pd_key IN ('481005', '850972', '860091', '860101', '860110', '860130', '860159', '860161', '860211', '860224', '860225', '860230', '860265', '860319', '860320', '860344', '860360', '860407', '860414', '860415', '860418', '860469', '860478', '871005', '874115', '874206', '880100', '881153', '890061', '890299', '890360', '890648', '890650', '890651', '891903')
AND
(
Live.dbo.multibuy_dim.mb_deal_number = '0000'
OR
Live.dbo.multibuy_dim.mb_deal_number Is Null
)
AND
( ( Live.dbo.date_dim.date ) between DATEADD(WEEK, -3, (select min(date) from date_dim where last_week_prior_flag = 'Y')) AND (select max(date) from Live.dbo.date_dim where Live.dbo.date_dim.last_week_prior_flag= 'Y' ) )
)
GROUP BY
Live.dbo.sales_fact.trans_ref,
Live.dbo.product_dim.pd_key,
Live.dbo.product_dim.pd_desc,
Live.dbo.multibuy_dim.mb_deal_number,
Live.dbo.sales_fact.LoyaltyNo
SQL taking 8 minutes, ( i removed product_dim from the query)
SELECT
Live.dbo.sales_fact.trans_ref,
Live.dbo.multibuy_dim.mb_deal_number,
Live.dbo.sales_fact.LoyaltyNo,
sum(Live.dbo.sales_fact.qty),
sum(Live.dbo.sales_fact.totalPriceExVat),
( sum(Live.dbo.sales_fact.company_margin) ) - isnull((( sum(Live.dbo.sales_fact.wastageval) )),0)
FROM
Live.dbo.multibuy_dim INNER JOIN Live.dbo.sales_fact ON (Live.dbo.sales_fact.multibuy_id=Live.dbo.multibuy_dim.mb_id)
INNER JOIN Live.dbo.date_dim ON (Live.dbo.date_dim.date_id=Live.dbo.sales_fact.date_id)
WHERE
(
(
Live.dbo.multibuy_dim.mb_deal_number = '0000'
OR
Live.dbo.multibuy_dim.mb_deal_number Is Null
)
AND
( ( Live.dbo.date_dim.date ) between DATEADD(WEEK, -3, (select min(date) from date_dim where last_week_prior_flag = 'Y')) AND (select max(date) from Live.dbo.date_dim where Live.dbo.date_dim.last_week_prior_flag= 'Y' ) )
)
GROUP BY
Live.dbo.sales_fact.trans_ref,
Live.dbo.multibuy_dim.mb_deal_number,
Live.dbo.sales_fact.LoyaltyNo
July 25, 2014 at 8:41 am
Quick question, can you post the execution plans?
😎
July 25, 2014 at 8:57 am
Hi Eirikur Eiriksson,
Thanks for your response
I've attached two SQL plans here, one is with product_dim and one without it. do let me know if you are unable to find them.
Thanks again,
July 25, 2014 at 9:24 am
rajsin7786 (7/25/2014)
Hi Eirikur Eiriksson,Thanks for your response
I've attached two SQL plans here, one is with product_dim and one without it. do let me know if you are unable to find them.
Thanks again,
These are the estimated execution plans. We need the actual execution plans if you can get them. Take a look at Gail's article about posting performance problems. http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
July 25, 2014 at 9:59 am
In the meantime, I'd suggest you to change the coding style as 3-part and 4-part names in the column list are deprecated and will be removed on a future version of SQL Server. 😉
July 25, 2014 at 10:10 am
Also, it would help to know what indexes are defined on the tables.
July 29, 2014 at 2:57 am
Hi Gurus,
Sorry for the late reply, attaching the actual execution plan for your reference. The query was generated from a BI tool that's the reason why its got 3part 4part names in the column, however, ill accept your suggestion and will work with the BI team on this.
All the columns in the where conditions of the SQL statements have indexes defined on them. if u have any more questions please let me know.
Thanks
July 29, 2014 at 7:28 am
rajsin7786 (7/29/2014)
Hi Gurus,Sorry for the late reply, attaching the actual execution plan for your reference. The query was generated from a BI tool that's the reason why its got 3part 4part names in the column, however, ill accept your suggestion and will work with the BI team on this.
All the columns in the where conditions of the SQL statements have indexes defined on them. if u have any more questions please let me know.
Thanks
The table definitions including indexes would help.
When is the last time you updated your statistics? They seem to be pretty stale.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
July 30, 2014 at 9:40 am
Hi, the stats were run 3 weeks back. i have attached the create table scripts for 4 tables. will paste the index soon.
July 30, 2014 at 10:03 am
rajsin7786 (7/30/2014)
Hi, the stats were run 3 weeks back. i have attached the create table scripts for 4 tables. will paste the index soon.
How much data movement is in these tables?
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
July 30, 2014 at 10:05 am
hi sean,
sales_fact has 1 billion records, other 3 tables have less than 100K records
July 30, 2014 at 10:13 am
rajsin7786 (7/30/2014)
hi sean,sales_fact has 1 billion records, other 3 tables have less than 100K records
That is how many rows, not the amount of movement. The reason I ask is looking at your execution plan you have differences in some cases between estimated and actual rows in the millions. That is a good indication that your stats are stale.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
July 30, 2014 at 10:32 am
hi sean, sorry i am novice to DB so couldn't really understand what data movement means. You are correct the stats are stale. but i have a test environment, i update the stats for these tables in the test and ran the query but it took exactly the same time 50 mins to run.
did you mean how much data is added to each table per day? if so then the there are about a million records added to the sales_fact everyday
July 30, 2014 at 2:36 pm
Hello,
We just had a meeting a few months ago given by a DBA to us developers...
Big advice to us was to take the Items in a WHERE Clause (if you can), apply the logic to JOINS.
Something like this:
SELECT
Live.dbo.sales_fact.trans_ref,
Live.dbo.product_dim.pd_key,
Live.dbo.product_dim.pd_desc,
Live.dbo.multibuy_dim.mb_deal_number,
Live.dbo.sales_fact.LoyaltyNo,
sum(Live.dbo.sales_fact.qty),
sum(Live.dbo.sales_fact.totalPriceExVat),
( sum(Live.dbo.sales_fact.company_margin) ) - isnull((( sum(Live.dbo.sales_fact.wastageval) )),0)
FROM
Live.dbo.multibuy_dim INNER JOIN Live.dbo.sales_fact ON (Live.dbo.sales_fact.multibuy_id=Live.dbo.multibuy_dim.mb_id)
INNER JOIN Live.dbo.product_dim ON (Live.dbo.sales_fact.product_id=Live.dbo.product_dim.pd_id)
INNER JOIN Live.dbo.date_dim ON (Live.dbo.date_dim.date_id=Live.dbo.sales_fact.date_id)
AND ( Live.dbo.date_dim.date ) between DATEADD(WEEK, -3, (select min(date) from date_dim where last_week_prior_flag = 'Y'))
AND (select max(date) from Live.dbo.date_dim where Live.dbo.date_dim.last_week_prior_flag= 'Y' )
WHERE
(
Live.dbo.product_dim.pd_key IN ('481005', '850972', '860091', '860101', '860110', '860130', '860159', '860161', '860211', '860224', '860225', '860230', '860265', '860319', '860320', '860344', '860360', '860407', '860414', '860415', '860418', '860469', '860478', '871005', '874115', '874206', '880100', '881153', '890061', '890299', '890360', '890648', '890650', '890651', '891903')
AND
(
Live.dbo.multibuy_dim.mb_deal_number = '0000'
OR
Live.dbo.multibuy_dim.mb_deal_number Is Null
)
)
GROUP BY
Live.dbo.sales_fact.trans_ref,
Live.dbo.product_dim.pd_key,
Live.dbo.product_dim.pd_desc,
Live.dbo.multibuy_dim.mb_deal_number,
Live.dbo.sales_fact.LoyaltyNo
July 30, 2014 at 2:59 pm
churlbut (7/30/2014)
Hello,We just had a meeting a few months ago given by a DBA to us developers...
Big advice to us was to take the Items in a WHERE Clause (if you can), apply the logic to JOINS.
Whut? Why would he recommend that? WHERE or ON clause basically only matters as to the application of OUTER JOIN logic. Predicates are predicates. What reasoning and examples did he use to support this statement?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 15 posts - 1 through 15 (of 33 total)
You must be logged in to reply to this topic. Login to reply