The execution of sub queries in a view

  • I have a view that selects data from several tables and also contains two columns that are sub queries of two additional tables. The sub queries are expensive for performance. When I select from the view but do not select the sub query columns it appears that the subqueries are executed due to the execution time, which is what I expected. However, when I compare the execution plans for selecting from the base tables and selecting from the view (minus the sub query columns) the execution plans are identical and do not show the sub query execution. However the run time for selecting from the view would suggest that the sub queries are being executed even though they do not show up in the execution plan.

    My questions are:

    Are sub queries in views always executed even if their respective columns are not selected?

    If so, why does the sub query execution not show up in the execution plan for the query?

    Are their any ways to turn off executing the sub query in a view if it is not selected?

    Thanks in advance.

  • I would have to see the code to fully understand the issue. However, no, there's no way to selectively limit the query defined within a view. The query is going to run as you define it, including sub-selects even though you're not referencing those columns.

    As far as identical query plans between the base tables and the view w/o the sub-query, again, I'd need to see the two queries that you're comparing, but if the query is different, you will get different execution plans.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you Grant for your reply and help!

    Here is the setup...

    /*VIEW DEFINITION*/

    /*NOTE: VBaseAllUsers accesses the base tables directly*/

    create view VAllUsers as

    select

    VBaseAllUsers.*,

    (

    select count(*)

    from UserStudentAssoc

    where UserStudentAssoc.UserID=VBaseAllUsers.iUser

    and AssocType=1

    and DomainID = 1

    )

    as iNumStudents,

    (select count(*)

    from UserStudentAssoc

    where UserStudentAssoc.UserID=VBaseAllUsers.iUser

    and AssocType>1 and DomainID = 1

    )

    as iNumTeamStudents,

    (select

    case

    when sum(bOutOfCompliance) is null then 0

    else sum(bOutOfCompliance)

    end

    from VStudents

    where VStudents.iCaseManager=VBaseAllUsers.iUser

    )

    as iNumOutOfCompliance,

    (select

    case

    when sum(bInCompliance) is null then 0

    else sum(bInCompliance)

    end

    from VStudents

    where VStudents.iCaseManager=VBaseAllUsers.iUser

    )

    as iNumInCompliance

    from VBaseAllUsers

    /*BASE TABLE QUERY*/

    SELECT users.id,

    fullname,

    usercode,

    usertypes.name,

    workphone,

    email

    FROM users

    JOIN usertypes

    ON usertypes.id = users.usertype

    WHERE users.id = 235

    AND users.inactive = 0

    /*VIEW QUERY, NOT REFERENCING SUB QUERIES IN VIEW, RESULTS SAME AS ABOVE QUERY*/

    SELECT iuser,

    fullname,

    usercode,

    usertype,

    workphone,

    email

    FROM vallusers

    WHERE iuser = 235

    AND inactive = 0

    Both the base table query and the view query return the same result and have identical execution plans. The view query is much slower than the base table query. When I explicitly select the sub query columns from the view the processing is then displayed in the execution plan but the execution time is roughly the same as when the sub query columns are not selected.

    It would appear that, like you said, all parts of the query are being executed even when the sub query columns are not referenced but the sub query part of the execution plan is only being displayed if the sub query columns are referenced explicitly. There also appears to be a slight amount of overhead to actually displaying the sub query section of the execution plan as the execution time is slightly longer than that that does not reference the sub query columns. They are both far slower than the query of the base tables.

    If the sub queries are being executed regardless of whether they are referenced in the select, why does the execution plan not display their contribution to the execution if the columns are not explicitly referenced? It appears that time is being consumed in running the sub queries.

    I have uploaded screen captures of the execution plans for:

    Execution plan of the view query not referencing the subqueries

    Execution plan of the base tables query

    Execution plan of the view query with explicit reference to the sub query columns.

    Thanks in advance for you help!

  • I just did some tests on my own. I'm getting the same results you are. I never really tried this before, breaking a view down in this way. I'm seeing two different execution plans for two different queries against a view. I'm a little surprised that SQL Server is that smart. This could be a good blog post or a short article... anyway...

    I'm also seeing two different query execution times. When I run the sample query and select the column that includes a sub-select similar to yours, I'm getting more scans & reads than I get when I run the sub-select without it. It also seems to run faster (although I'd need better test data than I'm currently working with to confirm this). Can you get the STATISTICS IO from the two queries against the view? I'll keep working with this a bit on my end.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hi Grant, Thanks for having a look at this! It would appear that a lot more is happening when explicitly selecting the sub query columns.The statistics IO output...

    STATISTICS IO for query of view not referencing sub squery columns

    Table 'UserTypes'. Scan count 0, logical reads 2, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Users'. Scan count 0, logical reads 2, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    STATISTICS IO for query of view that explicitly references sub squery columns

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Students'. Scan count 2, logical reads 4, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'UserStudentAssoc'. Scan count 2, logical reads 4, physical reads 4, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'UserTypes'. Scan count 0, logical reads 2, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Users'. Scan count 0, logical reads 2, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Thanks again, I am very interested to hear your interpretation/thoughts.

  • OK. I'm stupid.

    Of course it gets different execution plans. What I said before about the view, that it runs as is, that's flat out wrong. Instead SQL Server unpacks the view, it breaks down the select statement against the view into a new select statement when you run it. So that's perfectly normal and, in fact, you should expect to see a performance difference between your two queries. I was wrong about that too.

    That fact that you're not is the issue. I'd be curious if you ran a select against the view twice, one that included a column that is not a sub-select and a second column that is not a sub-select along with another query that includes a column this not a part of a sub-select and one that is. If possible make sure both queries return the same amount of data. I don't care about the execution plan in this case, all I want is STATISTICS IO and TIME.

    I suspect that you'll see a difference although it may be small. If my theory is right, the query with two columns that are not part of a sub-select will be a little faster.

    I think what's happening is that with all the sub-selects (not the best way to go about gathering data by the way) the unpacking process where SQL Server figures out what parts of the view it needs is taking a long time because of all the sub-selects. I'll have to set up an experiment here to test this theory, but you might be able to see it on your end.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Right. That IO is exactly what we should expect to see. Now we need to see the execution times... Actually, especially the compile time for the query.

    Are you running this on a production system? If not, run DBCC FREEPROCCACHE prior to running the tests in order to be sure we'll get a compile on the queries. But, please, don't run that in production.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Just getting back to the thread, got caught in a webex...

    I am running on my local dev box and am running dbcc dropcleanbuffers and dbcc freeproccache before each run. I'll run the queries you suggest and post the data in moment. Thanks.

  • Grant, let me know if this is not what you are looking for.

    AVERAGE EXECUTION TIME for query not referencing sub query columns

    Time Statistics

    Client processing time0.0000

    Total execution time433.9000

    Wait time on server replies433.9000

    SQL Server parse and compile time:

    CPU time = 156 ms, elapsed time = 381 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 27 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    AVERAGE EXECUTION TIME for query that explicitly references sub squery columns

    Time Statistics

    Client processing time1.5000

    Total execution time515.1000

    Wait time on server replies513.6000

    SQL Server parse and compile time:

    CPU time = 219 ms, elapsed time = 470 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 122 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Thanks!

  • Grant, it looks like your theory is, on the surface, correct. The query that does not explicitly reference the sub query columns is consistently a little bit faster than the query that explicitly references the sub query columns. They are both, however, far slower than the query of the base tables of the view. It looks like this process of reformulating the select based on the select on the view is costly. How can one check this?

    Thanks

  • Yep, you've got the answer right there.

    SQL Server parse and compile time:

    CPU time = 156 ms, elapsed time = 381 ms.

    That parse & compile time is pretty long. It's because of the unpacking process. The execution time is radically faster, 27ms vs 122ms. So it's the unpacking that is hurting you.

    I'm sorry I was giving out bad information initially.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant, no problem. Thanks for your help in figuring this out. We have a lot of views that will need review based on this and I have a new method to do so.

  • After more thought it strikes me as this is almost a bug in that it would seem that the parser/normalizer should recognize early on that the sub query is not needed since it is not selected. This way it would not convert it to it's equivalent joins etc. prior to optimizing. This is a case where the time for optimizing and reformulating far exceeds the benefit of the optimization. I recognize that the sub queries are not the most efficient means to get the data but that is how the system I am on has been developed. Is there any way to influence the optimization and reformulation steps to improve this process.

    Thanks in advance.

  • Well, the optimizer is going through a binding process on the query. It has to resolve everything involved before it begins the process of optimization. I suspect that's the step that's hurting you here. I'm still experimenting with it to replicate the behavior.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Whoops. Hold on a second. These are all views. We're dealing with nested views right? Ah, that adds a pretty severe level of complexity. It's resolving all the objects on all the views and then optimizing. Nesting views is a way to mask complexity. This can be helpful for programming, but it seldom is for optimization.

    I need to change my experiments and start again.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply