SQL view performance problem

Question

SQL view performance problem

farazkhan1981

Old Hand

Points: 398
More actions
September 7, 2016 at 12:52 pm

#313201

I have a view that has 6 datasets created within it. Each subsequent dataset joins on previous datasets to filter out the desired results. The very first dataset gets dates from an xref table that get passed down all the way to final select statement in the view. If I hard code a date in the first data set and then run select * from the_view, it runs under 30 seconds. If I run the view as select * from the_view where date='xx-xx-xxxx' then it takes over an hour. I cannot change the view to be a stored proc or function with date passed as parameter because the end users are only granted permissions on views. See the pseudo-code of the view below:
WITH Date_Dataset ( Effective_Dt )
AS ( SELECT DISTINCT
Effective_Dt
FROM Date_Xref
),
TcklrTypeDataset
AS ( SELECT A,B,C
FROM Tickler_Table
INNER JOIN Date_Dataset ON Tickler_Table.Date = Load_ID_Dataset.Date
INNER JOIN dbo.tbl_Dim_Tcklr_Type tcklrtype ON tcklrtype.Tcklr_Type_Key = tcklr.Tcklr_Type_Key
WHERE tcklrtype.Tcklr_Type_Cd IN ( '062', '063', '064', '065' )
),
Dataset2 AS (select x,y,z from Main_Table inner join TcklrTypeDataset)
select * from Dataset2
So when user queries the view as select * from the_View where Date = 'xx-xx-xxxx', it runs for more than an hour. I need to be able to streamline this view so I don't have to hard code a date in first dataset as users query for different dates.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Luis Cazares SSC Guru Points: 183706 More actions · Answer 1

There's no way anyone can help you from what you posted. You'll need to follow the indications in this article to get actual help: http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

It's not that we don't want to help, it's just that we can't see the problem.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 2

There's no way to pass parameters to a view. Either the users apply the date filter when they query the view, or you'll need to change to a procedure or in-line table valued function

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

farazkhan1981 (9/7/2016)
I have a view that has 6 datasets created within it. ...

Those aren't datasets, they are CTE's, which are somewhat like views in that they are more or less substituted into the body of the query at run time rather than creating intermediate result sets. Chaining CTE's in this manner masks complexity - often the reason for choosing CTE's in the first place - and most devs know from experience that this may result in extremely poor performance as the optimiser is unable to guarantee that a good plan will be picked from the endless possibilities, and times out. Can you rewrite the code without chaining so many CTE's?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 4

There is a potential problem right up at the front. The codes starts with a "Date_Dataset" cte that does a SELECT DISTINCT to produce a list of values. The optimizer doesn't know what values might come out of that select distinct, and so has no statistics to help it find a more efficient execution plan. This forces the optimizer to take a most conservative approach. Try putting those results in a #distinct table indexed on the date column. Then join to that table rather than to the cte.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

drew.allen SSC Guru Points: 76998 More actions · Answer 5

The Dixie Flatline (9/8/2016)
There is a potential problem right up at the front. The codes starts with a "Date_Dataset" cte that does a SELECT DISTINCT to produce a list of values. The optimizer doesn't know what values might come out of that select distinct, and so has no statistics to help it find a more efficient execution plan. This forces the optimizer to take a most conservative approach. Try putting those results in a #distinct table indexed on the date column. Then join to that table rather than to the cte.

Actually, I think that section should be rewritten as a WHERE EXISTS. He's only pulling one field from that table, and he's using it for an inner join, so he can use the other side of the join in subsequent steps.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 6

You're right, Drew. If he's just using it to filter by, there is no need for the join.

In fact, if the original Date_Xref is indexed on date, unique or not, an EXISTS should produce the right result without having to do the select distinct at all.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Gazareth One Orange Chip Points: 27737 More actions · Answer 7

As others have said, CTE's are just for easy formatting, they're all run together at execution. What you're effectively running is this...

select *

from (select x,y,z

from Main_Table

inner join ( SELECT A,B,C

FROM Tickler_Table

INNER JOIN ( SELECT DISTINCT Effective_Dt FROM Date_Xref) ON Tickler_Table.Date = Load_ID_Dataset.Date

INNER JOIN dbo.tbl_Dim_Tcklr_Type tcklrtype ON tcklrtype.Tcklr_Type_Key = tcklr.Tcklr_Type_Key

WHERE tcklrtype.Tcklr_Type_Cd IN ( '062', '063', '064', '065' )

) -- missing a join condition here

) as alias

Those subselects are making it diffcult for SQL to pick a good plan. I'd write it as a single select, nothing too complex here.

The difference in performance will be because with the hardcoded date, the where is in the ( SELECT DISTINCT Effective_Dt FROM Date_Xref) part.

When run selecting from the view, it's at the end of the query.