March 22, 2016 at 10:06 am
Hi all. I've done my homework on this one and I'm perplexed. I'm an old hand at optimizing queries and stored procedures but I've been solidly in the Ops world for a couple years and may be a bit rusty. I'm hoping someone has some insight.
I had a stored procedure that behaved very poorly.
SELECT TOP 500 STRef2.KeyValue, STRef2.ExtraData
FROM (SELECT SourceTable.KeyValue FROM SourceTable
      EXCEPT
      SELECT KeyValue FROM TargetTable) a
INNER JOIN SourceTable STRef2 ON a.KeyValue = STRef2.KeyValue
The problem here is that the tables are very large - hundreds of millions of rows. So my typical reaction is "reduce the working set", which is readily done here with a date filter.
That led me to this:
DECLARE @hourstoprocess int
SET @hourstoprocess = -24
;WITH SourceCTE AS (
    SELECT KeyValue,
           ExtraData
    FROM SourceTable
    WHERE startdatetimeutc >= DATEADD(hour, @hourstoprocess, getdate())
),
DestinationCTE AS (
    SELECT DISTINCT KeyValue
    FROM DestinationTable
)
SELECT TOP 500
    a.KeyValue,
    a.ExtraData
FROM SourceCTE a
LEFT OUTER JOIN DestinationCTE b ON a.KeyValue = b.KeyValue
WHERE b.KeyValue IS NULL
This is a rough approximation of the code, with names changed to protect the innocent. Originally the variable was a parameter, but I've removed it and created it as a variable inside the procedure. The trouble is that if I run the code from inside the procedure as a standalone batch, I get results rather quickly; if I run it via the procedure, it takes 10-50 times longer. Additionally, if I replace the variable in the DATEADD function with a literal value, it runs as if it weren't in the stored procedure.
Near as I can tell this might be considered something akin to parameter sniffing. I've seen that the solution to that can be to force a recompile for each run of the procedure. So I've tried that with no luck.
I thought maybe if I didn't do the date math in the CTE it would help, so I created a date variable to use for the right side of the WHERE clause in the SourceCTE. No luck.
I've tried to optimize for the value -24 and for UNKNOWN. Also no luck.
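For reference, the recompile and OPTIMIZE FOR attempts looked roughly like this (a sketch only, using the anonymized names from above; the procedure name is made up):
ALTER PROCEDURE dbo.GetUnparsedRows   -- placeholder name, not the real proc
WITH RECOMPILE                        -- attempt 1: recompile the whole procedure on every execution
AS
BEGIN
    DECLARE @hourstoprocess int = -24;

    WITH SourceCTE AS (
        SELECT KeyValue, ExtraData
        FROM SourceTable
        WHERE startdatetimeutc >= DATEADD(hour, @hourstoprocess, GETDATE())
    ),
    DestinationCTE AS (
        SELECT DISTINCT KeyValue
        FROM DestinationTable
    )
    SELECT TOP (500)
           a.KeyValue,
           a.ExtraData
    FROM SourceCTE a
    LEFT OUTER JOIN DestinationCTE b ON a.KeyValue = b.KeyValue
    WHERE b.KeyValue IS NULL
    -- attempt 2: statement-level recompile instead of WITH RECOMPILE
    -- OPTION (RECOMPILE);
    -- attempt 3: optimize for the usual value, or for UNKNOWN
    OPTION (OPTIMIZE FOR (@hourstoprocess = -24));
    -- OPTION (OPTIMIZE FOR UNKNOWN);
END;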
I've tried doing this with temp tables and table variables instead of CTEs. Neither of those made any difference.
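The temp table version was along these lines (again just a sketch; the table variable version had the same shape):
DECLARE @hourstoprocess int = -24;

-- materialize the date-filtered working set first
SELECT KeyValue, ExtraData
INTO   #Source
FROM   SourceTable
WHERE  startdatetimeutc >= DATEADD(hour, @hourstoprocess, GETDATE());

-- then do the anti-join against the destination
SELECT TOP (500) s.KeyValue, s.ExtraData
FROM   #Source s
LEFT OUTER JOIN DestinationTable d ON s.KeyValue = d.KeyValue
WHERE  d.KeyValue IS NULL;

DROP TABLE #Source;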
Looking at the execution plans for the difference between variable and no-variable in the SourceCTE, the only difference is the relative cost of the steps; the steps themselves are otherwise identical.
So I'm turning to the friendly folks here to see if anyone has an idea.
March 22, 2016 at 10:27 am
Can you post the actual execution plans? (One for a poorly performing version and one for the faster version.)
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 22, 2016 at 10:46 am
And it looks like I missed the whole attachments thing. Here we go
March 22, 2016 at 11:01 am
LerxtDBA (3/22/2016)
And it looks like I missed the whole attachments thing. Here we go
Please post the actual execution plans; these images of the graphical plan representation are just like the menu, we need to taste the food ;-)
Right click on the execution plan and select Save As...
March 22, 2016 at 11:07 am
Unfortunately that I cannot do. It would reveal more about my customer than I'm allowed to show.
March 22, 2016 at 11:25 am
Actually, here we go. Find and replace to remove names.
March 22, 2016 at 12:08 pm
LerxtDBA (3/22/2016)
Actually, here we go. Find and replace to remove names.
That's better. BTW, you can use SQL Sentry Plan Explorer to anonymize the execution plans.
One more thing: could you post the DDL (CREATE TABLE) for the tables? By the looks of it there is an implicit conversion from datetime to datetime2(3), which is not going to help.
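If the column really is datetime2(3), computing the cutoff into a variable of the matching type avoids that conversion; something like this (assuming the column type, and reusing the anonymized names from your first post):
DECLARE @hourstoprocess int = -24;
DECLARE @cutoff datetime2(3) = DATEADD(hour, @hourstoprocess, SYSDATETIME());

SELECT TOP (500) a.KeyValue, a.ExtraData
FROM   SourceTable a
LEFT OUTER JOIN DestinationTable b ON a.KeyValue = b.KeyValue
WHERE  a.startdatetimeutc >= @cutoff   -- column compared to a value of the matching type
  AND  b.KeyValue IS NULL;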
First thought is to get rid of the local variable; the optimizer cannot know the value beforehand and will instead use a rule of thumb (roughly 30% for this kind of inequality) to guesstimate the cardinality. The good plan has a fixed predicate value of -1800 for the [InteractionSegmentDetail] table and scans 7,443,500 rows; the bad plan scans 13,194,800 rows, which is close to 30% of the table cardinality (43,972,300).
A few things: you are using trace flag 4136 for the bad query and not the good one, any reason? Also, the good query has TOP 10000 while the other has TOP 12000???
When doing performance comparison it is absolutely essential to compare like for like!
March 22, 2016 at 12:13 pm
All those are incidental at this point. The "good" plan has a fixed value because it's a production stored proc. The bad plan has those differences because I was testing things I had found elsewhere online.
Hmmm...I wonder...
I've been running under the assumption that I need the variable because I have different criteria at different times of day. But if I replace the stored proc with three stored procs, one of them a "hub" that calls the other two depending on the time of day, that may fix the issue and allow me to remove the variable entirely.
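Something along these lines, with made-up procedure names and an invented hour split just to show the shape:
CREATE PROCEDURE dbo.GetNewRows_Hub
AS
BEGIN
    -- pick the version with the appropriate hard-coded window for the time of day
    IF DATEPART(hour, GETDATE()) BETWEEN 8 AND 17
        EXEC dbo.GetNewRows_Daytime;   -- literal value baked into the query
    ELSE
        EXEC dbo.GetNewRows_OffHours;  -- a different literal baked in
END;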
March 22, 2016 at 12:34 pm
LerxtDBA (3/22/2016)
All those are incidental at this point. The "good" plan has a fixed value because it's a production stored proc. The bad plan has those differences because I was testing things I had found elsewhere online. Hmmm...I wonder...
I've been running under the assumption that I need the variable because I have different criteria at different times of day. But if I replace the stored proc with three stored procs, one of them a "hub" that calls the other two depending on the time of day, that may fix the issue and allow me to remove the variable entirely.
Two obvious (kind of) solutions: the first is to change the variable to a parameter for the procedure; the second is to use parameterized dynamic SQL inside the procedure with a recompile hint.
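The dynamic SQL route would look roughly like this (a sketch only, reusing the anonymized names from the first post):
DECLARE @sql nvarchar(max) = N'
    SELECT TOP (500) a.KeyValue, a.ExtraData
    FROM   SourceTable a
    LEFT OUTER JOIN DestinationTable b ON a.KeyValue = b.KeyValue
    WHERE  a.startdatetimeutc >= DATEADD(hour, @hours, GETDATE())
      AND  b.KeyValue IS NULL
    OPTION (RECOMPILE);';

EXEC sys.sp_executesql
     @sql,
     N'@hours int',
     @hours = -24;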
March 22, 2016 at 12:54 pm
Originally it was a parameter and it was misbehaving even then.
I didn't think about the dynamic SQL route, but that's because I'm mostly opposed to it on principle.
Anyway, the trio of procedures is working better than I expected. I went from several minutes on the original version (not mine), to 30-ish seconds with a hard-coded value instead of the variable/parameter, to 10+ minutes with the variable, to under 1 second using a pair of hard-coded values in different stored procs.
Oddly, the execution plan is wildly different even from the hard coded value one. 98% of the cost is in the non-clustered index seek. That's pretty much what I expect when I have a SELECT DISTINCT in a CTE that is being used to limit my working set.
March 22, 2016 at 1:21 pm
LerxtDBA (3/22/2016)
Originally it was a parameter and it was misbehaving even then. I didn't think about the dynamic SQL route, but that's because I'm mostly opposed to it on principle.
Anyway, the trio of procedures is working better than I expected. I went from several minutes on the original version (not mine), to 30-ish seconds with a hard-coded value instead of the variable/parameter, to 10+ minutes with the variable, to under 1 second using a pair of hard-coded values in different stored procs.
Oddly, the execution plan is wildly different even from the hard coded value one. 98% of the cost is in the non-clustered index seek. That's pretty much what I expect when I have a SELECT DISTINCT in a CTE that is being used to limit my working set.
There is nothing wrong with using dynamic SQL if it's done properly and parameterized.
Have you tried adding OPTION (RECOMPILE) to the original one?
March 23, 2016 at 6:44 am
I tried OPTION (RECOMPILE) with no success.
Dynamic SQL... I use it frequently, but I don't like to open it up in things I don't have absolute control over. I'm just modifying a stored proc, and I don't trust the devs not to leave it wide open to injection.
March 23, 2016 at 9:39 pm
Not sure if it helps, but I had a similar experience and thought I'd share my ideas. You may have already tried them, but just in case:
Clustered indexes on SourceTable and DestinationTable on KeyValue.
A nonclustered index on startdatetimeutc in SourceTable (rough DDL sketch below).
(PS: Make sure the fill factor is 80, 90, or 100.)
Store the DATEADD value in a variable, then use that in the WHERE condition.
(Here also, it looks like you are just after the most recent 24 hours, so I would pull the year and month values from GETDATE() and filter like year(startdatetimeutc) = 2016 AND month(startdatetimeutc) = 3, store that subset of data in a temp table, create a nonclustered index on the temp table, again on startdatetimeutc, and use that in the join logic.)
Lastly, since you are not really retrieving any DestinationTable columns, instead of the "left join and WHERE ... IS NULL", you could try "WHERE KeyValue NOT IN (SELECT KeyValue FROM DestinationCTE)".
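For the index suggestion above, the DDL would be roughly this (using the anonymized names from the first post; check what indexes already exist first):
CREATE NONCLUSTERED INDEX IX_SourceTable_startdatetimeutc
    ON SourceTable (startdatetimeutc)
    INCLUDE (KeyValue, ExtraData)   -- INCLUDE is my addition so the index can cover the query
    WITH (FILLFACTOR = 90);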
All the best!
Thanks.
March 24, 2016 at 2:05 am
I guess the query part in the segmentsparsed CTE is redundant, and in the final part of your query you are checking for non-existence of the record in Custom_SegmentsParsed with a left join. This can easily be replaced with a left anti semi join using NOT EXISTS in the segmentsparsed CTE itself. Be aware that NOT EXISTS performance can be horrendous if there are too many NULL values.
;WITH segment
AS (SELECT interactionidkey,
           segmentlog
    FROM   interactionsegmentdetail WITH (NOLOCK)
    WHERE  startdatetimeutc >= DATEADD(hour, @ProcessHours, GETDATE())),
segmentsparsed
AS (SELECT DISTINCT TOP 12000 a.interactionidkey,
                              a.segmentlog
    FROM   segment a   -- NOLOCK removed here; apply table hints to the base table inside the CTE, not to the CTE reference
    WHERE  NOT EXISTS (SELECT 1
                       FROM   Custom_SegmentsParsed b
                       WHERE  a.interactionidkey = b.interactionidkey))
SELECT *
FROM   segmentsparsed
InteractionSegmentDetail is doing a CI seek on the table for the predicate >= DATEADD(hour, @ProcessHours, getdate()).
What's the record count in the InteractionSegmentDetail table? If there are too many qualifying rows for a predicate, I think the query will benefit more from a scan than from a seek.
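If you want to test that, a table hint can force the scan for comparison (a sketch only; FORCESCAN needs SQL Server 2008 R2 SP1 or later, and @ProcessHours is declared here just to make the snippet standalone):
DECLARE @ProcessHours int = -1800;   -- the fixed value seen in the good plan

SELECT TOP (12000) a.InteractionIdKey, a.SegmentLog
FROM   InteractionSegmentDetail a WITH (FORCESCAN)   -- force a scan instead of the CI seek
WHERE  a.startdatetimeutc >= DATEADD(hour, @ProcessHours, GETDATE())
  AND  NOT EXISTS (SELECT 1
                   FROM   Custom_SegmentsParsed b
                   WHERE  b.InteractionIdKey = a.InteractionIdKey);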
--------------------------------------------------------------------------------------------------
I am just another naive wannabe DBA trying to learn SQL Server
March 24, 2016 at 3:44 am
-- These are estimated plans. Row-goal queries often display big differences between estimated and actual row counts - can you post the actual plans please?
-- the "good" plan shows an alarmingly high memory grant for the hash tables. Try this:
SELECT TOP 10000
a.InteractionIdKey,
a.SegmentLog
FROM InteractionSegmentDetail a
LEFT OUTER LOOP JOIN Custom_SegmentsParsed b
ON a.InteractionIDKey = b.InteractionIDKey
WHERE a.startdatetimeutc >= DATEADD(hour, -1800, getdate())
AND b.InteractionIDKey IS NULL
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden