a looping subquery whose result set is used by parent? do I use a CTE?

  • Look here's what I've tried. I have been reading about CTE and I must be fried but it is not going anywhere for me. spinning my wheels horrifically.

    Please point out error. I want only students taking biology. I am told CTE is the elegant solution for layering logic in a drill down query...but I can't even get past first CTE.

    --First CTE

    WITH CTE AS (

    SELECT MAX(revision_cycle) AS MaxRevission, studentID FROM grade_status GROUP BY studentID)

    SELECT g.grade_assigned, g.studentID From grade_status g

    INNER JOIN CTE c

    ON g.studentID = c.studentID

    AND c.MaxRevission = g.revision_cycle

    select course

    from grade_status join

    cte on

    cte.statusID = g.statusID

    where course = 'biology'

    -

  • xenophilia (4/6/2012)


    Look here's what I've tried. I have been reading about CTE and I must be fried but it is not going anywhere for me. spinning my wheels horrifically.

    Please point out error. I want only students taking biology. Once I have that down I need to know how to keep adding layers of CTE so I have a long way to go yet and something small is stumping me. I am told CTE is the elegant solution for layering logic in a query...but I can't even get past first CTE.

    --First CTE

    WITH CTE AS (

    SELECT MAX(revision_cycle) AS MaxRevission, studentID FROM grade_status GROUP BY studentID)

    SELECT g.grade_assigned, g.studentID From grade_status g

    INNER JOIN CTE c

    ON g.studentID = c.studentID

    AND c.MaxRevission = g.revision_cycle

    select course

    from grade_status join

    cte on

    cte.statusID = g.statusID

    where course = 'biology'

    you can only have one query after the cte. you can combine the queries like so

    WITH CTE AS (

    SELECT MAX(revision_cycle) AS MaxRevission, studentID FROM grade_status GROUP BY studentID)

    SELECT g.grade_assigned, g.studentID, g.course -- put your other column here

    From grade_status g

    INNER JOIN CTE c

    ON g.studentID = c.studentID

    AND c.MaxRevission = g.revision_cycle

    where course = 'biology' -- put the where here


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • A CTE is part of the following INSERT, SELECT, or UPDATE statement and does not exist outside of that scope. If you need to use the same set of records in multiple queries, you will need to 1) define the CTE for each query, 2) use a table variable, or 3) use a temp table.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • OK. Looks good capn.hector. Please read on.

    I need to build a query for a report and it has to be filtered on a number of levels. I have a large dataset (about 2.5 million records) in a table.

    1) first I have to reduce that table to only those records with max rev

    2) result set is then reduced to those records belonging to a certain sub-type (eg. course)

    3) result set is reduced to only those records belonging to a certain org

    4) result set is reduced to those records belonging to a project under the org

    5) etc.

    This query will drive a report and CTEs were recommended for ease of troubleshooting when validating query. Can I use multiple CTE for this? The person who told me to do this is no slouch.

    -

  • the reason I believe I need multiple CTEs is because each subsequent CTE will introduce additional table joins which I do not need for the intial data set.

    -

  • with out ddl and sample data for all the tables i cant be sure.

    the first thing i would do is add a where clause to the query creating the CTE to limit the records as much as possible. then with joins to the tables with the different information you can use join conditions and where clause filters to get every thing out. its possible it may be a 30 line complicated query with 6 joins to different tables (ive done this quite often) but it may be possible to have just one cte and one query. if it ends up needing multiple CTE's the syntax is as follows:

    WITH e1(n) AS (SELECT 1 UNION ALL ---only one with

    SELECT 1 UNION ALL

    SELECT 1 UNION ALL

    SELECT 1 UNION ALL

    SELECT 1 UNION ALL

    SELECT 1 UNION ALL

    SELECT 1 UNION ALL

    SELECT 1 UNION ALL

    SELECT 1 UNION ALL

    SELECT 1), --- use a comma between each cte

    e2(N) AS (SELECT 1 FROM e1 a, e1 b)

    --then run your select statement.

    SELECT TOP 100 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e2


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • Is my understanding correct, that each subsequent CTE can pull in additional joins for the gradually expanding breadth of the query?

    I want to use CTE to avoid joining too many tables before applying filters that are relavent to only a subset of the tables.

    Is this the proper use of CTEs?

    for example.

    1) CTE for dataset from one table

    2) CTE for filtering dataset combining first CTE with additional table

    3) CTE for filtering dataset combining second CTE with yet more tables

    4) etc.

    and when troubleshooting I could then highlight (and execute) only those CTEs pertaining to my specific validation focus?

    -

  • thats one way but an inefficient way. all the cte's will be compiled into a single query by the optimizer but may not achieve the optimal plan if you have multiple cte's.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • capn (or drew), please recommend an approach. I have a weekend and many reports. I was about to invest a lot of time learning how to use CTEs on the basis of a recommendation from someone more experienced than I. But, this person generally works in OLTP databases and is a tester. Therefore, I would be happy to get a consensus viewpoint on this from you and others.

    In a situation where I need to narrow a result set many times, in order to provide drill down and rollup information, what is the best approach to building a reporting query that can easily be dissected and validated...piece by piece?

    Drew says " A CTE is part of the following INSERT, SELECT, or UPDATE statement and does not exist outside of that scope. If you need to use the same set of records in multiple queries, you will need to 1) define the CTE for each query, 2) use a table variable, or 3) use a temp table."

    But, can you elaborate. Do you mean only 1 CTE? By table variable, what do you mean? How does one decide how much of the logic should go into the CTE before putting the rest in the other query? I already have queries that 'mostly work' but they are not readable and disconcerting for the validation effort. I want to retrofit to make it easier to methodically evaluate my queries.

    -

  • There are just too many variables to recommend one approach over another. You'll need to test several different approaches to see which one works best with your data.

    A table variable is what it sounds like—a variable of type table. For an example see the following DECLARE @local_variable (Transact_SQL), specifically Example C.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • capn.hector (4/6/2012)


    thats one way but an inefficient way. all the cte's will be compiled into a single query by the optimizer but may not achieve the optimal plan if you have multiple cte's.

    But again, you have to test each solution. In one query multiple cte's may work fine and in another it may actually be better to break things down and use templorary tables.

  • xenophilia (4/6/2012)


    I already have queries that 'mostly work' but they are not readable and disconcerting for the validation effort. I want to retrofit to make it easier to methodically evaluate my queries.

    Also be aware that what might be easiest to validate might not be the best for performance. You'll need to balance the two options and test for both accuracy and performance. Sorry, but it's one of those "it depends" answers. We can definitely help out more if you post the rest of the DDL and data.

    MWise

  • OK, thank you for the replies. It is good mentorship from high level that helps when with foray into lower level decisions. Balance. Thanks.

    -

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply