May 12, 2020 at 7:54 pm
Hey all,
We've made some optimization to this view and the indexing. Actually had it running quite well as of the last several weeks. As of these past couple days, the query is taking far too long to return a full data set.
I can do a select * from my local 2017 instance, in about 6 minutes. In Azure SQL (P4), it ... never moves. After an hour, I had to cancel it, it just hangs and the DTU's ceiling at 100%. Note the attached queryplan_azure. That RID lookup is nasty ... yet, my local (queryplan_actual) doesn't go that same path? I've updated stats, no change.
DistributorSkuSourceTransactions is the bottle neck. I've tried multiple variations of indexing, even a covering, all to no avail. There is no clustered index on this table as of now. The PK is a uniqueidentifier.
Any suggestions?
Query:
with cteRowNumber as (
select dsst.TransactionHash, a.CrmId AS AccountId, ps.StartDate, ps.EndDate,
row_number() over(partition by dsst.TransactionHash, a.crmId order by ps.created desc) as RowNum
from AffiliateMatches am WITH (READPAST)
JOIN Accounts a WITH (READPAST)
ON am.AccountId = a.Id AND AM.IsDeleted <> 1
JOIN PrimarySources ps WITH (READPAST)
ON a.id = ps.AccountId
JOIN DistributorSkuSourcePrimarySources dssps WITH (READPAST)
ON ps.Id = dssps.PrimarySourceId
JOIN Unicorns u WITH (READPAST)
ON am.UnicornId = u.Id
JOIN DistributorSkuSourceTransactions dsst WITH (READPAST)
ON u.UnicornHash = dsst.UnicornHash and dssps.DistributorSkuSourceId = dsst.DistributorSkuSourceId and dsst.TransactionDate >= ps.StartDate AND (dsst.TransactionDate <= ps.EndDate OR ps.EndDate IS NULL)
)
select TransactionHash, AccountId, StartDate, EndDate
from cteRowNumber
where RowNum = 1
GO
Plans are attached.
May 13, 2020 at 1:34 pm
So found something odd. It's the join on the date that is killing this. I converted to a where, tried a between, no change:
and dsst.TransactionDate >= ps.StartDate AND (dsst.TransactionDate <= ps.EndDate OR ps.EndDate IS NULL))
When looking at the stats on this column, the major difference I saw between my local (old) copy of the database and in production is that there are 17 records on future dates. Would that be problematic?
May 13, 2020 at 4:24 pm
From my personal observations, "OR" in criteria is more often a problem than not and, with that, I tell people to make their EndDate column NOT NULL and have it default to '9999', which resolves to 9999-01-01 for a date. I don't use 9999-12-31 because it leaves no room for formulas that do a "+1" of any kind.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 13, 2020 at 6:00 pm
Hey Jeff, I remember your name ... way back when in my DBA days. Thanks for the response. My DBA skills are relatively rusty these days, went and got stuck in management for many years!
Unfortunately for now, can't change that data ... any short term suggestions? You're right though, it's absolutely EndDate. The vast majority of the records are NULL when looking at the stats. Doesn't matter using the OR or not, but using the dsst.TransactionDate >= ps.EndDate is where it breaks.
May 13, 2020 at 8:13 pm
Understood on not changing the data in the existing column. It could really screw up a lot of code that's depending on the bloody NULLs.
If you can add a persisted computed column to change the NULLs to 9999-01-01 and modify the code you're currently working with to look at that as an end date, you wouldn't endanger other code. It would also allow you to change future found code with relative impunity is that you could still return the NULL in the original column while doing a SARGable search on the computed column.
Other than that, most of the other methods will leave some problems. I also find that temporary solutions are frequently adopted as permanent solutions because they fall into the horrible area of "good enough" when they actually aren't.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply