SQL view performance problem

  • I have a view that has 6 datasets created within it. Each subsequent dataset joins on previous datasets to filter out the desired results. The very first dataset gets dates from an xref table that get passed down all the way to final select statement in the view. If I hard code a date in the first data set and then run select * from the_view, it runs under 30 seconds. If I run the view as select * from the_view where date='xx-xx-xxxx' then it takes over an hour. I cannot change the view to be a stored proc or function with date passed as parameter because the end users are only granted permissions on views. See the pseudo-code of the view below:

    WITH Date_Dataset ( Effective_Dt )

    AS ( SELECT DISTINCT

    Effective_Dt

    FROM Date_Xref

    ),

    TcklrTypeDataset

    AS ( SELECT A,B,C

    FROM Tickler_Table

    INNER JOIN Date_Dataset ON Tickler_Table.Date = Load_ID_Dataset.Date

    INNER JOIN dbo.tbl_Dim_Tcklr_Type tcklrtype ON tcklrtype.Tcklr_Type_Key = tcklr.Tcklr_Type_Key

    WHERE tcklrtype.Tcklr_Type_Cd IN ( '062', '063', '064', '065' )

    ),

    Dataset2 AS (select x,y,z from Main_Table inner join TcklrTypeDataset)

    select * from Dataset2

    So when user queries the view as select * from the_View where Date = 'xx-xx-xxxx', it runs for more than an hour. I need to be able to streamline this view so I don't have to hard code a date in first dataset as users query for different dates.

  • There's no way anyone can help you from what you posted. You'll need to follow the indications in this article to get actual help: http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    It's not that we don't want to help, it's just that we can't see the problem.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • There's no way to pass parameters to a view. Either the users apply the date filter when they query the view, or you'll need to change to a procedure or in-line table valued function

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • farazkhan1981 (9/7/2016)


    I have a view that has 6 datasets created within it. ...

    Those aren't datasets, they are CTE's, which are somewhat like views in that they are more or less substituted into the body of the query at run time rather than creating intermediate result sets. Chaining CTE's in this manner masks complexity - often the reason for choosing CTE's in the first place - and most devs know from experience that this may result in extremely poor performance as the optimiser is unable to guarantee that a good plan will be picked from the endless possibilities, and times out. Can you rewrite the code without chaining so many CTE's?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • There is a potential problem right up at the front. The codes starts with a "Date_Dataset" cte that does a SELECT DISTINCT to produce a list of values. The optimizer doesn't know what values might come out of that select distinct, and so has no statistics to help it find a more efficient execution plan. This forces the optimizer to take a most conservative approach. Try putting those results in a #distinct table indexed on the date column. Then join to that table rather than to the cte.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (9/8/2016)


    There is a potential problem right up at the front. The codes starts with a "Date_Dataset" cte that does a SELECT DISTINCT to produce a list of values. The optimizer doesn't know what values might come out of that select distinct, and so has no statistics to help it find a more efficient execution plan. This forces the optimizer to take a most conservative approach. Try putting those results in a #distinct table indexed on the date column. Then join to that table rather than to the cte.

    Actually, I think that section should be rewritten as a WHERE EXISTS. He's only pulling one field from that table, and he's using it for an inner join, so he can use the other side of the join in subsequent steps.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • You're right, Drew. If he's just using it to filter by, there is no need for the join.

    In fact, if the original Date_Xref is indexed on date, unique or not, an EXISTS should produce the right result without having to do the select distinct at all.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • As others have said, CTE's are just for easy formatting, they're all run together at execution. What you're effectively running is this...

    select *

    from (select x,y,z

    from Main_Table

    inner join ( SELECT A,B,C

    FROM Tickler_Table

    INNER JOIN ( SELECT DISTINCT Effective_Dt FROM Date_Xref) ON Tickler_Table.Date = Load_ID_Dataset.Date

    INNER JOIN dbo.tbl_Dim_Tcklr_Type tcklrtype ON tcklrtype.Tcklr_Type_Key = tcklr.Tcklr_Type_Key

    WHERE tcklrtype.Tcklr_Type_Cd IN ( '062', '063', '064', '065' )

    ) -- missing a join condition here

    ) as alias

    Those subselects are making it diffcult for SQL to pick a good plan. I'd write it as a single select, nothing too complex here.

    The difference in performance will be because with the hardcoded date, the where is in the ( SELECT DISTINCT Effective_Dt FROM Date_Xref) part.

    When run selecting from the view, it's at the end of the query.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply