June 25, 2016 at 6:07 am
Hello! I am running a query which is taking 23 mins to run in my dev server. I need your expert advice on how to improve the performance. I can change the architecture a little bit as it is still in dev environment.
declare
@market varchar(5),
@EIA_plant_id int = 999999,
@EIA_unit_id varchar(16) = '0000',
@data_type int,
@startdate datetime,
@enddate datetime,
@starthour int,
@endhour int
set @market = N'DA',
set @data_type = 0
set @startdate = N'2015-12-01'
set @enddate = N'2015-12-10'
set @starthour = 0
set @endhour = 23
-- DA Raw
IF @market = 'DA' and @data_type = 0
BEGIN
SELECT
--CASE WHEN max(gb.bidding_group_id) is not null or @EIA_unit_id != '0000' THEN min(gu.unit_id) ELSE max(gu.unit_id) END as unit_id,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN max(gu.unit_name) ELSE max(gb.bidding_group_name) END as unit_name
,max(gu.[EIA_plant_code]) as EIA_plant_ID
,CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN max(gu.eia_unit_id) ELSE NULL END as EIA_unit_ID,
max(br.Region) as region,
--max(br.[Owner Code]) as owner_code,
--br.[Unit Code] as bid_code_id,
max(br.[Unit Type]) as unit_type_code,
br.[Date Time Beginning (est)] AS [date_beginning_EST],
max(br.price1) AS bid1,
max(br.price2) AS bid2,
max(br.price3) AS bid3,
max(br.price4) AS bid4,
max(br.price5) AS bid5,
max(br.price6) AS bid6,
max(br.price7) AS bid7,
max(br.price8) AS bid8,
max(br.price9) AS bid9,
max(br.price10) AS bid10,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw1)) ELSE max(br.mw1) END AS MW1,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw2)) ELSE max(br.mw2) END AS MW2,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw3)) ELSE max(br.mw3) END AS MW3,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw4)) ELSE max(br.mw4) END AS MW4,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw5)) ELSE max(br.mw5) END AS MW5,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw6)) ELSE max(br.mw6) END AS MW6,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw7)) ELSE max(br.mw7) END AS MW7,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw8)) ELSE max(br.mw8) END AS MW8,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw9)) ELSE max(br.mw9) END AS MW9,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.mw10)) ELSE max(br.mw10) END AS MW10,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.[economic min])) ELSE max(br.mw1) END AS economic_min,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.[economic max])) ELSE max(br.mw1) END AS economic_max,
max(convert(decimal(4,2),br.[Slope])) AS slope,
max(convert(decimal(4,2),br.[Must Run Flag])) AS must_run_flag,
max(convert(decimal(4,2),br.[Unit Available Flag])) AS unit_available_flag,
max(br.[LMP]) AS LMP_hourly,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.[MW])) ELSE max(br.[MW]) END AS MW_hourly,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.[Self Scheduled MW])) ELSE max(br.[Self Scheduled MW]) END AS MW_self_schedule,
max(convert(decimal(4,2),br.[Economic Flag])) AS economic_flag,
max(convert(decimal(4,2),br.[Emergency Flag])) AS emergency_flag,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.[Emergency Min])) ELSE max(br.[Emergency Min]) END AS emergency_min,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.[Emergency Max])) ELSE max(br.[Emergency Max]) END AS emergency_max,
CASE WHEN max(gb.bidding_group_id) is null or @EIA_unit_id != '0000' THEN convert(decimal(6,2),max(uGen.[weight] * br.[Target MW Reduction])) ELSE max(br.[Target MW Reduction]) END AS target_MW_reduction,
max(br.[Curtailment Offer Price]) AS curtailment_offer_price
--max(gu.bidding_group_id) as bid_group_id,
--max(uGen.[weight]) as UC_weight
--max(CONVERT(int,br.bitDataChecked)) AS bit_data_checked,
FROM dbo.global_unit gu
INNER JOINdbo.unit_to_gennode AS uGen ON uGen.unit_id=gu.unit_id
INNER JOIN dbo.gennode_to_unit_code AS guCode ON uGen.gennode = guCode.gennode
INNER JOIN dbo.bids_raw_da_ AS br ON guCode.unit_code = br.[Unit Code] --THIS TABLE HAS 30 million woes
LEFT JOIN dbo.global_bidding_group gb on gb.bidding_group_id = gu.bidding_group_id
WHERE
@EIA_plant_id = CASE WHEN @EIA_plant_id != 999999 THEN gu.EIA_plant_code ELSE 999999 END
AND @EIA_unit_id = CASE WHEN @EIA_unit_id != '0000' THEN gu.EIA_unit_id ELSE '0000' END
--AND (bitDataChecked = 1 or bitDataChecked = 0)
AND [Date Time Beginning (est)] >= @startdate
AND [Date Time Beginning (est)] <= @enddate
AND datepart(hour, br.[Date Time Beginning (EST)]) >= @starthour
and datepart(hour, br.[Date Time Beginning (EST)]) <= @endhour
--AND starthour >= @starthour -- i tried to add persisted computed columns on Date Time Beginning (est). no luck.
--AND starthour<= @endhour
GROUP BY CASE WHEN gb.bidding_group_id is not null THEN gb.bidding_group_id ELSE gu.unit_id END, [Date Time Beginning (EST)], [Unit Code]
ORDER BY max(gu.unit_id), date_beginning_EST, [Unit Code]
--option(hash join,merge join)
END
What are the things i should change to make it run it better. please help.
June 25, 2016 at 6:08 am
query execution plan is attached.
June 25, 2016 at 7:03 am
You'll need to post the DDL for the tables you are querying for a better answer. Im not at a SQL Server so I can't look at the execution plan but here's what is jumping out at me: the ORDER BY max() clause at the end. That is certainly one thing that's wrecking your performance and isn't necessary to get the correct results. Losing that is your "low-hanging fruit" performance fix. Let the application sort your result set.
-- Itzik Ben-Gan 2001
June 25, 2016 at 7:49 am
1) Your WHERE clause is VERY complex for the optimizer, leading to bad estimates and both a suboptimal plan and multiple spills to tempdb for sorting.
2) You have functions around columns in the WHERE clause, which contributes to the above and also can prevent the use of index seeks (if/when they are appropriate).
3) Maybe your dev is underpowered π
4) You have date range filtering, which is almost an automatic OPTION (RECOMPILE) for the statement. I promise you don't want the same query plan for 1 day range as you do for a 10 year range!
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
June 27, 2016 at 2:52 am
CASE statement in the GROUP BY clause is also going to lead to some pretty poor choices by the optimizer.
If you have two different sets of logic, rather than try to cram them into a single procedure, create two procedures and call them from a wrapper procedure based on the needs of the separated logic. That's going to give the optimizer a better chance at arriving at a good plan.
Plus everything already pointed out.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
June 27, 2016 at 2:22 pm
Thank you so much. I will try to make the changes and will get back with more questions.
I really appreciate you all for taking out your time and helping me. π
June 28, 2016 at 6:39 pm
I've got a question about this piece of code:
AND [Date Time Beginning (est)] >= @startdate
AND [Date Time Beginning (est)] <= @enddate
AND datepart(hour, br.[Date Time Beginning (EST)]) >= @starthour
and datepart(hour, br.[Date Time Beginning (EST)]) <= @endhour
If you have, say,
@starthour = 8 and
@endhour = 10
the query will return events beginning only within 3 hours of each day between @startdate and @enddate.
For example, if @startdate = 27 June and @enddate = 29 June the query will select events which started :
27 June between 8:00am and 10:59:59.997 am,
28 June between 8:00am and 10:59:59.997 am,
29 June between 8:00am and 10:59:59.997 am.
Is it what is expected?
_____________
Code for TallyGenerator
July 5, 2016 at 1:20 pm
yes this is expected. I need data in that particular range of hours in order to keep the result set small.
July 5, 2016 at 6:02 pm
From the plan I can see the column [Date Time Beginning (EST)] belongs to table [bids_raw_da_miso]
But I don't see this table anywhere in the query.
And if you select by range then you better have a clustered index on [bids_raw_da_miso].[Date Time Beginning (EST)], or a covering index with [Date Time Beginning (EST)] as a 1st column and including other columns from that table which are used in the query.
_____________
Code for TallyGenerator
July 6, 2016 at 2:19 am
ekant_alone (7/5/2016)
yes this is expected. I need data in that particular range of hours in order to keep the result set small.
So your range might be several days long, capturing say 9am to 11am for each day?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply