Starting with "WITH"

  • I'm not a huge fan of recursive scenarios, largely (I suspect) because I work with data warehousing and we tend to like flattening hierarchies. 😀

    I do find it a shame that a conversation about CTEs focusses almost entirely on recursion, which I think misses a huge part of the power of CTEs. I know I use them extensively, purely as a way of breaking my SQL code into more modular chunks, many of which are re-useable across multiple queries. I find that the part-by-part nature of CTEs tends to make them far easier to understand, and the resulting queries build up naturally in a "layered" fashion.
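    A minimal sketch of what I mean by "layered" (the table and column names here are made up purely for illustration):

    -- Each CTE builds on the one before it, so the query reads top-down in layers.
    WITH MonthlySales AS
    (
        SELECT CustomerID,
               SalesMonth = DATEADD(MONTH, DATEDIFF(MONTH, 0, SaleDate), 0),
               MonthTotal = SUM(Amount)
        FROM dbo.Sales
        GROUP BY CustomerID, DATEADD(MONTH, DATEDIFF(MONTH, 0, SaleDate), 0)
    ),
    CustomerAverages AS
    (
        SELECT CustomerID,
               AvgMonthTotal = AVG(MonthTotal)
        FROM MonthlySales
        GROUP BY CustomerID
    )
    SELECT ms.CustomerID, ms.SalesMonth, ms.MonthTotal, ca.AvgMonthTotal
    FROM MonthlySales AS ms
    JOIN CustomerAverages AS ca
        ON ca.CustomerID = ms.CustomerID;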

    In contrast, it's much harder to re-use sections of code within a query even if you are using in-line sub-queries without having to resort to temporary tables. Sometimes temporary tables are still quicker, and they too can act as named common expressions. But they force SQL Server to commit to some form of I/O to write out the results, whereas a CTE may not need to do that. I've found that a CTE not only makes the resulting query easier to write and easier to read/maintain, but it also makes it easier to tune!

    So all up, I can heartily recommend getting used to breaking down your queries using common table expressions. As a side note, CTEs are not limited to SQL Server; they're rather useful in Oracle as well.

    As for recursion? In comparison to how many times I write other queries, recursive solutions feature in possibly 1 in 1000 or thereabouts. And yes, I work with everything from organisation charts through to general ledgers, all of which are hierarchical. So don't think that the value of CTEs lies in the ability to write recursive solutions; something that rare is quite incidental. For me, they're all about making code easier.

  • Hey Bruce!

    Bruce W Cassidy (11/14/2011)


    I'm not a huge fan of recursive scenarios, largely (I suspect) because I work with data warehousing and we tend to like flattening hierarchies.

    Sure, but recursive CTEs aren't just for hierarchy problems - see Chris' examples for cool ways to solve 'running total'-type problems, for example.
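    For instance, a recursive running total can look something like this (a sketch only; dbo.Ledger and its columns are made-up names, and it assumes contiguous TranID values starting at 1):

    WITH Running AS
    (
        -- Anchor: the first transaction starts the running total
        SELECT l.TranID, l.Amount, RunningTotal = l.Amount
        FROM dbo.Ledger AS l
        WHERE l.TranID = 1
        UNION ALL
        -- Recursive step: add the next transaction to the total so far
        SELECT l.TranID, l.Amount, r.RunningTotal + l.Amount
        FROM dbo.Ledger AS l
        JOIN Running AS r
            ON l.TranID = r.TranID + 1
    )
    SELECT TranID, Amount, RunningTotal
    FROM Running
    OPTION (MAXRECURSION 0);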

    I do find it a shame that a conversation about CTEs focusses almost entirely on recursion, which I think misses a huge part of the power of CTEs.

    Ah it's not really a shame is it? The original question was answered, and the discussion veered off in the direction of recursive CTEs, and became quite interesting (from my point of view anyway). There's nothing to stop us having a chat about normal CTEs too.

    I know I use them extensively, purely as a way of breaking my SQL code into more modular chunks, many of which are re-useable across multiple queries.

    Well yes, but once you start reusing them across multiple queries, might they not be better off as real views? Or perhaps you are changing them slightly between uses, in which case you could think about using an in-line table-valued function (iTVF). After all, non-recursive CTEs are nothing more than in-line view definitions, and iTVFs are parameterized views.
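    As a trivial sketch (with made-up names): if several queries all start with the same WITH ActiveCustomers AS (...) block, that block is a candidate to be defined once as a real view and referenced by name instead:

    -- Defined once, usable from any query (replacing a CTE body repeated everywhere):
    CREATE VIEW dbo.ActiveCustomers
    AS
    SELECT CustomerID, CustomerName
    FROM dbo.Customer
    WHERE IsActive = 1;
    GO

    SELECT ac.CustomerID, ac.CustomerName
    FROM dbo.ActiveCustomers AS ac;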

    I find that the part-by-part nature of CTEs tends to make them far easier to understand, and the resulting queries build up naturally in a "layered" fashion.

    I tend to agree, but I know others don't so much. Some people prefer to nest subqueries, which performs much the same function but can make it easier to test parts of the code in isolation. There is also an argument that layering CTEs can get out of hand, and fool people into thinking more procedurally about a problem than might be necessary. I am happy to write queries either way depending on the client's preference.

    In contrast, it's much harder to re-use sections of code within a query even if you are using in-line sub-queries without having to resort to temporary tables. Sometimes temporary tables are still quicker, and they too can act as named common expressions. But they force SQL Server to commit to some form of I/O to write out the results, whereas a CTE may not need to do that. I've found that a CTE not only makes the resulting query easier to write and easier to read/maintain, but it also makes it easier to tune!

    Again there is a balance here. For some problems, writing a small intermediate result to a temporary table can give the optimizer a better shot at getting a good execution plan. Temporary objects can provide accurate cardinality information, and temporary tables can also provide distribution statistics - neither of these is available when using CTEs. As far as I/O is concerned, it depends: if the server is not under memory pressure, the index and data pages for the temporary object are likely to remain memory-resident over its lifetime. Logging overhead is also generally negligible due to specific tempdb logging optimizations. In my experience, using intermediate storage can be good for stability, since the optimizer has more and better information available when circumstances change.
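    Something like this, for example (object names are illustrative only):

    -- Materialising a small intermediate result gives the optimizer an exact row
    -- count, and the index adds distribution statistics - neither is available
    -- when the same logic is left in a CTE.
    SELECT CustomerID, OrderCount = COUNT(*)
    INTO #CustomerOrderCounts
    FROM dbo.Orders
    GROUP BY CustomerID;

    CREATE CLUSTERED INDEX cx ON #CustomerOrderCounts (CustomerID);

    SELECT c.CustomerName, t.OrderCount
    FROM #CustomerOrderCounts AS t
    JOIN dbo.Customer AS c
        ON c.CustomerID = t.CustomerID;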

    So all up, I can heartily recommend getting used to breaking down your queries using common table expressions. As a side note, CTEs are not limited to SQL Server; they're rather useful in Oracle as well.

    Sure...just don't let long nested lists of CTEs become the only tool available. They really are in-line views, and many people have strong feelings about designs that nest views deeply...

  • Edward-445599 (11/3/2011)


    I have recently seen a few T-SQL queries starting with "WITH". I understand what's happening; I am just surprised and a little embarrassed that I haven't seen this before. I have failed to find any documentation explaining this T-SQL. Has anyone got any links explaining this concept in detail (I don't need the example explained; I understand it)?

    WITH example (example_ID) AS
    (
        SELECT MIN(ID)
        FROM Example_table
        GROUP BY example_ID, type
    )
    SELECT *
    FROM table2 AS T2
    INNER JOIN example
        ON T2.id = example.example_ID

    Replying to the OP: do not forget a semi-colon before any WITH cte AS... construct, or your code will fail with a less-than-informative error. This is one place in T-SQL where the statement terminator is not optional: the statement preceding WITH must be properly terminated.
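    A quick illustration (run each snippet separately; the variable is just for show):

    -- Fails: the statement before WITH is not terminated, so the parser complains
    -- with an "Incorrect syntax near the keyword 'with'..." error.
    DECLARE @i int = 1
    WITH cte AS (SELECT 1 AS n)
    SELECT n FROM cte;

    -- Works: terminate the preceding statement (many people simply get into the
    -- habit of writing ";WITH").
    DECLARE @j int = 1;
    WITH cte AS (SELECT 1 AS n)
    SELECT n FROM cte;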

    Another gotcha when starting out with CTEs: a CTE is in scope only for the single statement that immediately follows its definition (you can reference it more than once within that statement, but nowhere after it). This, for me, is often their biggest limitation and why I generally use them only for things like pre-defining aggregates on data that needs acting upon. If you need to refer to the result set of a CTE in more than one subsequent query, you'll need a temp table/table variable.
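    For example:

    WITH cte AS
    (
        SELECT object_id, name
        FROM sys.objects
    )
    SELECT COUNT(*) FROM cte;   -- fine: this is the statement immediately following

    SELECT name FROM cte;       -- fails with "Invalid object name 'cte'" - the CTE
                                -- went out of scope when the previous statement ended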

    Rich

  • SQL Kiwi (11/14/2011)


    Hey Bruce!

    Hey Paul! 😀

    I know I use them extensively, purely as a way of breaking my SQL code into more modular chunks, many of which are re-useable across multiple queries.

    Well yes, but once you start reusing them across multiple queries, might they not be better off as real views? Or perhaps you are changing them slightly between uses, in which case you could think about using an in-line table-valued function (iTVF). After all, non-recursive CTEs are nothing more than in-line view definitions, and iTVFs are parameterized views.

    You know, I'm not sure I've played with iTVF's. Parameterisation could make them really handy. Do they get named like CTEs so you can use them multiple times in the subsequent query without having to repeat yourself?

    I'm not saying CTEs are the be-all and end-all of structuring SQL code, but they are certainly one easy technique to help do so. And yes, well aware temporary tables are still a good idea; CTEs just give us different options so we can pick and choose as appropriate. I'll often find I will start with a layered CTE-style approach and then start peeling off CTEs into temporary tables.

    Aside from the tuning aspect, temporary tables are also useful across multiple queries, and you get to reuse the data, so they are something I still utilise frequently. However, I remember my code prior to CTEs: without nesting like crazy or repeating big chunks of code (for example, in a self-referential pattern), the only ways to modularise were to break chunks down into real views or to use temporary tables. CTEs just add in some much-appreciated flexibility.

    Sure...just don't let long nested lists of CTEs become the only tool available. They really are in-line views, and many people have strong feelings about designs that nest views deeply...

    Absolutely. As with anything, it's a matter of balance. They're a great tool; they're not the only tool.

  • Bruce W Cassidy (11/14/2011)


    You know, I'm not sure I've played with iTVF's. Parameterisation could make them really handy. Do they get named like CTEs so you can use them multiple times in the subsequent query without having to repeat yourself?

    I'm referring to the in-line variety of ordinary functions (BOL entry extract below to make it clearer):

    Inline Table-Valued Functions


    [font="Courier New"]

    CREATE FUNCTION [ schema_name. ] function_name

    ( [ { @parameter_name [ AS ] [ type_schema_name. ] parameter_data_type

    [ = default ] [ READONLY ] }

    [ ,...n ]

    ]

    )

    RETURNS TABLE

    [ WITH <function_option> [ ,...n ] ]

    [ AS ]

    RETURN [ ( ] select_stmt [ ) ]

    [ ; ]

    [/font]

    They are expanded into the calling query before optimization, and take parameters. This makes them behave like a VIEW that can take parameters.
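    A quick example of the shape (the table and column names are made up):

    CREATE FUNCTION dbo.CustomerOrders (@CustomerID integer)
    RETURNS TABLE
    AS
    RETURN
    (
        SELECT o.OrderID, o.OrderDate, o.TotalDue
        FROM dbo.Orders AS o
        WHERE o.CustomerID = @CustomerID
    );
    GO

    -- Used like a view that takes a parameter; the definition is expanded into the
    -- calling query before optimization.
    SELECT co.OrderID, co.OrderDate
    FROM dbo.CustomerOrders(42) AS co
    WHERE co.OrderDate >= '20110101';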

  • SQL Kiwi (11/13/2011)


    Efficiency, it seems to me, comes down to doing the least work possible to return the required results.

    Agreed. But then there's that problem of defining efficiency. Some define it as simple duration. Others define it as "least resources" and duration be damned. I tend to judge on a combination of all the characteristics. Although I normally shoot for simple duration, even a well-indexed triangular join and recursive CTEs don't sit well with me because of the logical reads involved. And, yeah... I have the same problem justifying even iTVFs like a splitter iTVF that uses a physical Tally table, simply because of all the reads. Yep... I understand that most of those are logical reads from memory, but it's still a form of pressure on the I/O system.

    I did some additional testing on converting Adjacency Lists to Hierarchical paths using rCTEs and WHILE loops (both using "Lasagna" methods at each level... set-based levels, both "loop" through the levels). At 100 nodes, the rCTE is only about a third faster (on my old legacy box) and matches the WHILE loop for duration at about 100,000 nodes (although the reads for the rCTE are just insane by then). At a million rows (not uncommon for MLMs and the like), the WHILE loop was a pretty clear winner over the rCTE. Of course, the only reason why I brought that up is because I'm still convinced that rCTEs are, shall we say for the comfort of everyone, similar to explicit looping methods (except for the comparatively insane number of logical reads for rCTEs), and the only reason why rCTEs appear more efficient is because they generate only one execution plan instead of one for each INSERT level in the WHILE loop, which, of course, takes extra time.
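    For reference, the basic shape of the rCTE version looks like this (the table and column names here are just for illustration):

    WITH Paths AS
    (
        -- Anchor: the top of the hierarchy
        SELECT e.EmployeeID,
               HierarchyPath = CONVERT(varchar(8000), e.EmployeeID)
        FROM dbo.Employee AS e
        WHERE e.ManagerID IS NULL
        UNION ALL
        -- Recursive part: one "level" per iteration, appending to the path
        SELECT e.EmployeeID,
               CONVERT(varchar(8000), p.HierarchyPath + '/'
                   + CONVERT(varchar(10), e.EmployeeID))
        FROM dbo.Employee AS e
        JOIN Paths AS p
            ON p.EmployeeID = e.ManagerID
    )
    SELECT EmployeeID, HierarchyPath
    FROM Paths
    OPTION (MAXRECURSION 0);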

    I have to respectfully disagree on calling Triangular Joins with an aggregate "set based", simply because the "sets" they process contain so many identical rows in each set processed by the aggregate. Along with a few other caveats, part of my definition of "set based" processing is touching each row only once or, at least, as few times as possible. Of course, like most rules, there are exceptions to that "rule". An example of such an exception is using a cross join simply to generate a row count, because the goal is to simply and quickly generate a set of rows.
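    For clarity, this is the sort of thing I mean by a Triangular Join with an aggregate (illustrative names again):

    -- Each row is joined to every preceding row, so row N is touched N times -
    -- roughly N*(N+1)/2 internal rows for an N-row table.
    SELECT l1.TranID,
           RunningTotal = SUM(l2.Amount)
    FROM dbo.Ledger AS l1
    JOIN dbo.Ledger AS l2
        ON l2.TranID <= l1.TranID
    GROUP BY l1.TranID;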

    Tough subject, though... Defining what RBAR and set based actually are and what the exceptions are. Perhaps we should limit our descriptions of code to "crap" and "non-crap" code. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • SQL Kiwi (11/14/2011)


    I'm referring to the in-line variety of ordinary functions.

    Oh, well sure, those I use. However, proliferating the schema with lots of little table-valued functions isn't always the best approach. Certainly for common stuff I am using often, they're another good option.

  • SQL Kiwi (11/13/2011)


    L' Eomot Inversé (11/13/2011)


    OK, going to 1000 yours has 67% of the logical (row) reads that the naive one has and only half the CPU usage. Neither has any physical reads. But it takes 25% more elapsed time. I get that consistently on my laptop. I wonder why?

    Not sure - the CROSS JOIN rCTE is consistently faster for me, though the difference becomes more pronounced as N gets larger. FYI this is the script I used:

    SET STATISTICS IO, TIME ON;

    WITH
    cteCounter AS
    ( --=== Counter rCTE counts from 0 to 10,000
        SELECT 0 AS N -- This provides the starting point (anchor) of zero
        UNION ALL
        SELECT N + 1 -- This is the recursive part
        FROM cteCounter
        WHERE N < 10000
    ) --=== Add the counter value to a start date and you get multiple dates
    SELECT StartOfMonth = DATEADD(mm, N, '2011')
    INTO #a
    FROM cteCounter
    OPTION (MAXRECURSION 0);

    WITH rCTE AS
    (
        SELECT 1 AS N
        UNION ALL
        SELECT N
        FROM rCTE
        CROSS JOIN (
            VALUES
            (1),(1),(1),(1),(1),
            (1),(1),(1),(1),(1),
            (1),(1),(1),(1),(1),
            (1),(1),(1),(1),(1)) AS V (v)
    )
    SELECT TOP (10000)
        StartOfMonth =
            DATEADD(MONTH,
                ROW_NUMBER() OVER (ORDER BY @@SPID) - 1,
                CONVERT(DATE, '20110101'))
    INTO #b
    FROM rCTE
    ORDER BY ROW_NUMBER() OVER (ORDER BY @@SPID)
    OPTION (MAXRECURSION 0);

    DROP TABLE #a, #b;

    SET STATISTICS IO, TIME OFF;

    Table 'Worktable'. Scan count 2, logical reads 60007, physical reads 0
    CPU time = 125 ms, elapsed time = 128 ms.

    Table 'Worktable'. Scan count 2, logical reads 40569, physical reads 0
    CPU time = 78 ms, elapsed time = 99 ms.

    For clarity though, I just wanted to show a recursive plan without the Filter Jeff was objecting to, and where the number of executions on the lower level was much reduced. I'm not advocating rCTEs for this sort of task.

    edit: Make sure you run this with Actual Execution Plan off, without any traces running, that sort of thing. Also, one thing I forgot to mention - the original CTE doesn't like having an ORDER BY N added (the plan needs an explicit sort). I added one to my code because I wanted to be sure of the presentation ordering (without relying on undocumented behaviour) and it comes for free 🙂

    rCTEs are very sensitive to seemingly small changes. Decreasing the number of cross-joined rows and increasing the number of cross joins leads to this:

    ;WITH rCTE_BGT AS
    (
        SELECT 1 AS N
        UNION ALL
        SELECT N + 1
        FROM rCTE_BGT
        CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS V1 (v)
        CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS V2 (v)
        CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS V3 (v)
        CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS V4 (v)
        CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS V5 (v)
    ), Numberlist AS
    (
        SELECT TOP (10000)
            n = ROW_NUMBER() OVER (ORDER BY @@SPID) - 1
        FROM rCTE_BGT
    )
    SELECT StartOfMonth = DATEADD(mm, n, '2011')
    INTO #c
    FROM Numberlist
    ORDER BY n
    OPTION (MAXRECURSION 0);

    -- Table 'Worktable'. Scan count 2, logical reads 20735, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    -- CPU time = 47 ms, elapsed time = 76 ms.

    -- Paul's original:
    -- Table 'Worktable'. Scan count 2, logical reads 40569, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    -- CPU time = 110 ms, elapsed time = 147 ms.

    -- Jeff's original:
    -- Table 'Worktable'. Scan count 2, logical reads 60001, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    -- CPU time = 140 ms, elapsed time = 165 ms.

    What's really surprising (at least to me) is when you compare the table spool operators for the three queries:

    Jeff's: 10,000 rows, 1 execution (10,000 operations)

    Paul's: 500 rows, 22 executions (11,000 operations)

    Mine: 1 row, 100,002 executions (100,002 operations)

    -and where the costs are allocated. My query takes 100% of the cost of the batch if all three queries are run together.

    For Jeff's query, the insert into the temp table takes 100% of the cost of the query.

    For Paul's query, the table insert takes 96% and row construction in the recursive part of the query takes 4%

    For my query, the table insert takes 0% - almost the entire cost is attributed to the cross joins.

    Where are those logical reads represented in the QP?

    Why does the query with by far the highest table spool activity perform the fastest?

    Does the QP bear any relation whatsoever to what's actually happening under the hood?

    As Paul's already pointed out, not all reads are equal.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Paul's suggestion of piling tons of rows into the recursive part to speed up counting gave me an idea. Well, two ideas. The first was to minimise the activity of the table spool operator. This is quite straightforward - put 10 rows into the anchor part of my query above.
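    Something along these lines, as a sketch only (the 10-row anchor replaces the single-row SELECT 1 AS N):

    WITH rCTE_BGT AS
    (
        SELECT N
        FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS Anchor (N) -- 10-row anchor
        UNION ALL
        SELECT N + 1
        FROM rCTE_BGT
        CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS V1 (v)
        CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS V2 (v)
    )
    SELECT TOP (10000)
        n = ROW_NUMBER() OVER (ORDER BY @@SPID) - 1
    FROM rCTE_BGT
    OPTION (MAXRECURSION 0);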

    The second was to see how this might lend itself to string-splitting, since rCTEs had performed so poorly in the exhaustive testing performed by Jeff et al in the Tally-Oh article.

    Here's the code (where @pString and @pDelimiter are the input parameters of the enclosing splitter function); I've not optimised it, so there are almost certainly a few tweaks to be made:

    WITH Splitter AS
    (
        SELECT
            ItemNumber = 0,
            Item = LEFT(@pString, CHARINDEX(@pDelimiter, @pString + @pDelimiter, 0) - 1)
        UNION ALL
        SELECT
            ItemNumber = ROW_NUMBER() OVER (ORDER BY n), -- or @@SPID
            Item = SUBSTRING(@pString, 1 + n, ISNULL(NULLIF(CHARINDEX(@pDelimiter, @pString, 1 + n), 0) - (1 + n), 8000))
        FROM (
            SELECT TOP (ISNULL(DATALENGTH(@pString), 0)) n = n1 + n2 + n3 + n4
            FROM (VALUES (0),(100),(200),(300),(400),(500),(600),(700),(800),(900)) t2 (n2)
            CROSS APPLY (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t3 (n3)
            CROSS APPLY (VALUES (0),(10),(20),(30),(40),(50),(60),(70),(80),(90)) t1 (n1)
            CROSS APPLY (VALUES (0),(1000),(2000),(3000),(4000),(5000),(6000),(7000),(8000),(9000)) t4 (n4)
        ) d
        WHERE SUBSTRING(@pString, n, 1) = @pDelimiter
    )
    SELECT ItemNumber, Item
    FROM Splitter;

    I ran the code in the cool super-comparator framework written by Jeff, which was posted with the Tally-Oh article. The first result was promising, but there was an error in each new batch of numbers. That was a quick fix; then I ran it again. Here are the results:

    [font="Courier New"]

    SplitterName MINduration MAXduration SUMduration AVGduration

    Split 0.01000 2.89300 15.10600 0.308285

    DelimitedSplit8K_rCTE 0.01000 19.84600 106.72700 2.178102

    DelimitedSplit8K_T1 0.00600 20.13600 108.61300 2.216591

    CJM01 0.01000 19.64000 108.76200 2.219632

    Nadrek0Based 0.01000 20.61300 115.06500 2.348265

    DelimitedSplit8K 0.01000 20.20000 115.35200 2.354122[/font]

    Not bad for RBAR 😎

    Only one problem. It's not a rCTE :ermm:


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • What is the code used for the function named "Split", Chris?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Same as the original, Jeff - CLR. I use T1 at work because the output of Split() is unexpected with delimiters such as ' of '.

    FWIW I tried a number of wacky and weird alternatives in your testing framework, and they all eventually bottomed out at about the same performance as T1, which is probably the minimum cost of generating the rows, doing all the SUBSTRINGs + a few CHARINDEXs.

    An rCTE based on 8k rows in the recursive part (so anchor + 1 recursion at run time) comes in at twice the time of T1 but seems to scale quite well.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
