June 28, 2015 at 7:21 am
This is about the simplest rCTE that I can think of. No 8k pages anywhere yet Statistics IO reports logical reads:
SET STATISTICS IO ON;
WITH rCTE AS (
SELECT n = 1
UNION ALL
SELECT n FROM rCTE)
SELECT TOP 10 n FROM rCTE
SET STATISTICS IO OFF;
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
June 28, 2015 at 7:54 am
ChrisM@home (6/28/2015)
This is about the simplest rCTE that I can think of. No 8k pages anywhere yet Statistics IO reports logical reads:
SET STATISTICS IO ON;
WITH rCTE AS (
SELECT n = 1
UNION ALL
SELECT n FROM rCTE)
SELECT TOP 10 n FROM rCTE
SET STATISTICS IO OFF;
Slightly lower read count with this code, not much of a difference though.
😎
June 28, 2015 at 8:54 am
ChrisM@home (6/28/2015)
This is about the simplest rCTE that I can think of. No 8k pages anywhere yet Statistics IO reports logical reads:
SET STATISTICS IO ON;
WITH rCTE AS (
SELECT n = 1
UNION ALL
SELECT n FROM rCTE)
SELECT TOP 10 n FROM rCTE
SET STATISTICS IO OFF;
Looking at the execution plan, there is Table Spool, Nested Loop, Concatenation and Index Spool thrown in the mix as soon as one mentions Recursion.
June 28, 2015 at 10:04 am
Eirikur Eiriksson (6/28/2015)
ChrisM@home (6/28/2015)
This is about the simplest rCTE that I can think of. No 8k pages anywhere yet Statistics IO reports logical reads:
SET STATISTICS IO ON;
WITH rCTE AS (
SELECT n = 1
UNION ALL
SELECT n FROM rCTE)
SELECT TOP 10 n FROM rCTE
SET STATISTICS IO OFF;
Looking at the execution plan, there is Table Spool, Nested Loop, Concatenation and Index Spool thrown in the mix as soon as one mentions Recursion.
Also Assert, unless you've unlimited MAXRECURSION.
So what, in this example, has data organised as 8k pages?
Or to put it another way, why should spools be expected to work with 8k pages, when all of the other operators except those which read data directly from disk or cache such as clustered index scan, index seek etc, operate on single rows?
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
June 28, 2015 at 11:14 am
Even with that fine hypothesis, it's not conclusive proof that the table spool is being read one row at a time for each row or one page at a time for each row. And, to be sure, the execution plan that was posted on this was an estimated execution plan. The Table Spool actually has N-1 rows coming from it.
I still agree that the claim/hypothesis that the logical reads for rCTEs are actually a simple count of rows rather than pages could be true, but I'm only seeing speculation so far and no actual proof.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 28, 2015 at 11:37 am
ChrisM@home (6/28/2015)
Also Assert, unless you've unlimited MAXRECURSION.So what, in this example, has data organised as 8k pages?
Or to put it another way, why should spools be expected to work with 8k pages, when all of the other operators except those which read data directly from disk or cache such as clustered index scan, index seek etc, operate on single rows?
The Assert doesn't add to the read but it complicates the plan with an iterator and a case statement, nearly doubles the execution time.
😎
June 28, 2015 at 11:53 am
Jeff Moden (6/28/2015)
Even with that fine hypothesis, it's not conclusive proof that the table spool is being read one row at a time for each row or one page at a time for each row.
The observation is not conclusive but something is starting to both walk and look like an elephant
😎
And, to be sure, the execution plan that was posted on this was an estimated execution plan. The Table Spool actually has N-1 rows coming from it.
The execution plan is not an estimated plan although it looks like one when retrieved in this way.
June 28, 2015 at 12:11 pm
Ok. I've decided to get to the bottom of this once and for all. I've posted the following on the MVP email site. We'll see what happens...
According to the following link, logical reads are the "Number of pages read from the data cache." To me, a page is 8K bytes.
https://msdn.microsoft.com/en-us/library/ms184361.aspx
The contention is that doesn't apply when using SET STATISTICS IO ON to measure rCTEs. I've heard from several people that I trust (Paul White being one of them) over the years that the logical reads there are "just" a row count rather than a "page count". I've have not, however, seen any irrefutable proof of that (and, yeah, I'm Googled-out on the subject) although I have seen a lot of hypotheses that seem to make sense but are still no form of irrefutable proof especially in light of what the definition of a logical read is in the previous cited MS provided documentation.
To be sure, I don't have the knowledge required to write/provide such a proof but I'd definitely like to get the bottom of this claim.
So my question is pretty simple... Does anyone have some irrefutable proof that logical reads are actually just a count of rows and are not page counts when associated with rCTEs? To be sure, "irrefutable proof" to me would be some form of repeatable demonstrable code (whether the code used undocumented features or not) or a "Yes, that's absolutely true and we'll change the MS documentation" from someone on the MS team.
Thanks for the help folks.
--Jeff Moden
--Jeff Moden
Change is inevitable... Change for the better is not.
June 28, 2015 at 12:55 pm
Jeff Moden (6/28/2015)
Ok. I've decided to get to the bottom of this once and for all. I've posted the following on the MVP email site. We'll see what happens...According to the following link, logical reads are the "Number of pages read from the data cache." To me, a page is 8K bytes.
https://msdn.microsoft.com/en-us/library/ms184361.aspx
The contention is that doesn't apply when using SET STATISTICS IO ON to measure rCTEs. I've heard from several people that I trust (Paul White being one of them) over the years that the logical reads there are "just" a row count rather than a "page count". I've have not, however, seen any irrefutable proof of that (and, yeah, I'm Googled-out on the subject) although I have seen a lot of hypotheses that seem to make sense but are still no form of irrefutable proof especially in light of what the definition of a logical read is in the previous cited MS provided documentation.
To be sure, I don't have the knowledge required to write/provide such a proof but I'd definitely like to get the bottom of this claim.
So my question is pretty simple... Does anyone have some irrefutable proof that logical reads are actually just a count of rows and are not page counts when associated with rCTEs? To be sure, "irrefutable proof" to me would be some form of repeatable demonstrable code (whether the code used undocumented features or not) or a "Yes, that's absolutely true and we'll change the MS documentation" from someone on the MS team.
Thanks for the help folks.
--Jeff Moden
Looking back at that, it leaves room for further hypotheses/conjecture and so I added the following to my request on the MVP site...
To add a bit of clarity to my previous email below, the following code...
SET STATISTICS IO ON
;
WITH rCTE AS
(
SELECT N = 1
UNION ALL
SELECT N + 1
FROM rCTE
WHERE N + 1 < 1000
)
SELECT N
FROM rCTE
OPTION (MAXRECURSION 0)
;
SET STATISTICS IO OFF
;
...produces the following...
(1000 row(s) affected)
Table 'Worktable'. Scan count 2, logical reads 5995, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
What do those logical reads actually consist of? Is it a count of 8K pages or a "simple" Row Count (not so simple, though, because the ratio the logical reads (whatever they actually are) to N is always (N-1)*6+1.
Thanks again for the help.
--Jeff Moden
--Jeff Moden
Change is inevitable... Change for the better is not.
June 28, 2015 at 1:45 pm
Jeff Moden (6/28/2015)
Ok. I've decided to get to the bottom of this once and for all. I've posted the following on the MVP email site. We'll see what happens...
Thanks Jeff for the effort and looking forward to see the response.
😎
Few observations, there is quite a difference between the STATISTICS IO and sys.dm_exec_query_stats, the latter reports slightly less than 2x the count of the former.
Another is that the number of logical reads (sys.dm_exec_query_stats) is ~= (number of operators) x (number of rows) for the simplest of rCTE code.
June 28, 2015 at 10:17 pm
The reason I often refer to logical reads of worktables (including spools) being measured in rows rather than pages is an aid to understanding for people who are confused by the large number of worktable logical reads reported, compared with reads from regular tables. Certainly, worktables and spools are internally organized similarly (though not identically) to real tables. Specifically, worktables do indeed have (optimized) page storage and either a heap-like or b-tree structure.
The key difference is in the way reads are reported. Consider 100 rows on a single page in a regular user heap table. Reading all 100 rows will report a single logical read, despite the fact the same page was actually accessed 100 times. When organized in an index structure, there may be an extra logical read or two (or three...) due to the non-leaf levels of the index being accessed once at the start of the operation. So, the logical reads people are most familiar with, represent the number of *pages* read (not the number of rows).
Spools and worktables, in general, do not apply this page-level reporting policy. Depending on the type of spool (index- or heap-based) and other details, they will report at least one logical read each time a row is processed by the spool. Sometimes this includes writes as well as reads. If the primary spool is an index spool, the operation will also count any non-leaf index page accesses as well, further inflating the numbers.
The important consequence, is that worktables and spools report logical reads differently. They should not be compared directly with logical reads on user-visible tables. The reads for a worktable or spool will, in general, be proportional to the number of *rows* processed, and in the simplest cases, equal to the number of rows.
All this frequently confuses DBAs and developers who are used to tuning performance based solely on logical reads (not a school of thought I subscribe to by the way). It also explains why running an experiment to do the same thing as a spool with real tables will generally show lower logical reads (example).
June 28, 2015 at 11:53 pm
Thanks for dropping by, Paul.
While I absolutely agree that (especially with a RBAR-like rCTE that counts) single rows are processed through the Table Spool in this case, a logical read is still a logical read and each one has the IO of an 8K page. It would appear that each row processed requires an 8k logical read (actually, 3 because of the B-Tree and heap) and that's also a part of why they're relatively slow compared to set based methods or even a well written While loop in a single transaction. Even the speed of memory has its limits.
That, not withstanding, is there a way to determine how much memory IO there is for each iteration of the iterative part of the rCTE? That would be proof positive one way or the other.
Also, it doesn't appear to be so hard to duplicate what a Table Spool (Index Spool, in this case) does now that I understand it a bit more. Create a regular table with 1 integer in it and add a non-clustered index to it like the Index Spool would have on it. Then, insert just one row and you'll see 3 logical reads... just like the ratio of reads to rows for each of the Table Spools in the execution plan.
drop table jbmtest
SELECT TOP 1
N = IDENTITY(INT,1,1)
INTO dbo.JBMTest
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
CREATE UNIQUE NONCLUSTERED INDEX IX_N ON dbo.JBMTest (N);
SET STATISTICS IO ON;
INSERT INTO dbo.JBMTest DEFAULT VALUES
SET STATISTICS IO OFF;
--Jeff Moden
Change is inevitable... Change for the better is not.
June 29, 2015 at 1:30 am
Jeff Moden (6/28/2015)
While I absolutely agree that (especially with a RBAR-like rCTE that counts) single rows are processed through the Table Spool in this case, a logical read is still a logical read and each one has the IO of an 8K page. It would appear that each row processed requires an 8k logical read (actually, 3 because of the B-Tree and heap) and that's also a part of why they're relatively slow compared to set based methods or even a well written While loop in a single transaction. Even the speed of memory has its limits.That, not withstanding, is there a way to determine how much memory IO there is for each iteration of the iterative part of the rCTE? That would be proof positive one way or the other.
I can understand why people might have a mental image of a "logical read" as an 8KB-sized memory copy operation, but that is generally not accurate. Reading a page from persistent storage does indeed involve a memory copy as the page is brought into the buffer pool, but once it is there, subsequent accesses are highly optimized. More generally, the entire execution engine avoids memory copies wherever possible, preferring by-reference access. A logical read that does not involve a physical read rarely (if ever) results in an 8KB memory page copy.
There are good reasons why recursive CTEs perform the way they do, but it is not because each "logical read" results in an 8KB memory copy (they don't). Writing rows to a spool b-tree structure (stacked index spool) and deleting them (secondary stacked table spool) is not free, even though the structures themselves are optimized and not logged.
There are also the other operators in the relatively-fixed rCTE plan shape to consider, none of which are free. In addition, execution plans only present a summary view that is intended to assist users in routine query analysis - they do not tell the whole story by a long chalk. Query execution is staggeringly more complex and involved than XML or graphical plans could ever hope to convey. It is all to easy to extrapolate unsafely from a plan and a little knowledge of the internals.
So anyway, in terms of the actual work performed, it is not surprising that a "set based" or while loop can perform better in some tests. That said, the performance of rCTEs would be exactly the same if it reported page accesses instead of rows, or indeed if it reported no logical reads at all.
June 29, 2015 at 11:01 am
I used both the cte and for xml to retrieve a grouped list of about 938 rows from a table with 3458 records as the source (filtered for my use case).
The cte took 4:02 minutes and it dropped any null values in the concatenated value due to the aggregation.
The for xml took 0.26 seconds with all my null values
--pete
June 30, 2015 at 6:37 am
Repeated post
Viewing 15 posts - 46 through 60 (of 69 total)
You must be logged in to reply to this topic. Login to reply