September 20, 2010 at 10:55 am
Very interesting post from Paul regarding the Spooling and Trace 8690... If I turn that trace flag on in SQL2008, the performance improves spectacularly.
(Also, just as a note, my testing both in SQL2005 and SQL2008 are both on AdventureWorks... the old "traditional" one... not the newer AdventureWorks2008 database).
Anyway, running Jeff's new test, it was very enlightening (without Paul's trace flag)... In short, SQL2008 puts together a lousy plan for the Tally table, and SQL2005 doesn't.
Attached are the output files... but here's the gist:
SQL2008 (64-bit):
Tally: 314370ms CPU, 87639ms duration (Note the huge CPU, indicating all 4 processors were working hard)
XML: 1451ms CPU, 1417ms duration
Loop: 171ms CPU, 197ms duration
SQL2005 (32-bit):
Tally: 78ms CPU, 44ms duration (Whoa!!)
XML: 1371ms CPU, 1163ms duration
Loop: 172ms CPU, 205ms duration
I've also attached the query plans of the SQL2005 and the SQL2008 queries on the tally table.
As Paul said, SQL2008 uses a spool, and SQL2005 does not. And because of that, they attack the query from different perspectives... you can see the order of the tables as they are accessed in the STATISTICS output (and in the attached actual execution plans).
If you look at the SQL2008 query plan... OUCH! The spool on the tally table spit out 121,317,000 rows... in other words, 1000 rows for each of the 121317 rows in the SalesOrderDetail table... That spool was BUSY!
The SQL2005 plan uses no spool and therefore the logical reads are much lower.
--Brad
September 20, 2010 at 11:00 am
Paul White NZ (9/20/2010)
rob.symonds (9/20/2010)
Anyone else a little concerned about the 2k8 results for the inline Tally?The latest scripts and results simply show that the optimizer can generate a poor plan, especially when it isn't provided with good quality information.
XML operations generally have only one possible plan implementation, so the optimizer can't do much with them, except move them (as a unit) around a bit. Also, estimated row counts from XML operations are based on fixed guesses (there are no statistics available), which are only slightly modified by the contextual query information. These guesses are generally very low - a couple of hundred rows is typical.
Wrapping the Tally in an in-line function allows the optimizer to expand the function definition into the wider query before optimization. Usually, this is a good thing, as it allows for many plan alternatives to be considered. In this case, a lack of good information, the cast from varchar to int on ProductID, and various other factors, conspire to produce a duff plan for the Tally solution.
It just so happens that the XML implementation's lack of statistics, low row estimates, and limited optimization possibilities happen to produce a plan that works well on the AdventureWorks query. Importantly, wrapping the XML approach in an in-line function doesn't hurt as much because the optimizer can't do much more than plonk the whole thing into the plan as a whole.
So, in summary, Jeff and Brad are both correct, and both methods are best, as is the CLR one. :w00t:
Paul
That's pretty much the conclusion I came to especially because of the (apparently) non-sargable ON condition that 2K5 managed to forgive adn 2K8 didn't. But you know I have to know the full answer... I'm still going to install 2K8 DevEd on my shiny new laptop tonight and try a couple of sargeable methods without actually changing the code within the Tally function.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 20, 2010 at 11:03 am
September 20, 2010 at 11:04 am
Brad Schulz (9/20/2010)
Very interesting post from Paul regarding the Spooling and Trace 8690... If I turn that trace flag on in SQL2008, the performance improves spectacularly.(Also, just as a note, my testing both in SQL2005 and SQL2008 are both on AdventureWorks... the old "traditional" one... not the newer AdventureWorks2008 database).
Anyway, running Jeff's new test, it was very enlightening (without Paul's trace flag)... In short, SQL2008 puts together a lousy plan for the Tally table, and SQL2005 doesn't.
Attached are the output files... but here's the gist:
SQL2008 (64-bit):
Tally: 314370ms CPU, 87639ms duration (Note the huge CPU, indicating all 4 processors were working hard)
XML: 1451ms CPU, 1417ms duration
Loop: 171ms CPU, 197ms duration
SQL2005 (32-bit):
Tally: 78ms CPU, 44ms duration (Whoa!!)
XML: 1371ms CPU, 1163ms duration
Loop: 172ms CPU, 205ms duration
I've also attached the query plans of the SQL2005 and the SQL2008 queries on the tally table.
As Paul said, SQL2008 uses a spool, and SQL2005 does not. And because of that, they attack the query from different perspectives... you can see the order of the tables as they are accessed in the STATISTICS output (and in the attached actual execution plans).
If you look at the SQL2008 query plan... OUCH! The spool on the tally table spit out 121,317,000 rows... in other words, 1000 rows for each of the 121317 rows in the SalesOrderDetail table... That spool was BUSY!
The SQL2005 plan uses no spool and therefore the logical reads are much lower.
--Brad
Very cool... thanks for the testing, Brad. Heh... and I'm glad that the Tally Table really isn't the pig that it appeared to be here. This type of problem could occur in other areas other than just a Tally Table split and I tickled that the real problem has been ferreted out.
Thank you everyone for your participation in this. Brad, thanks for humoring me until we got this all hammered out.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 20, 2010 at 11:12 am
Brad Schulz
Attempted to view your plans Microsoft Internet Explorer 7 . Result:
The XML page cannot be displayed
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.
--------------------------------------------------------------------------------
Switch from current encoding to specified encoding not supported. Error processing resource 'http://www.sqlservercentral.co...
<?xml version="1.0" encoding="utf-16"?>
September 20, 2010 at 11:16 am
bitbucket-25253 (9/20/2010)
Brad SchulzAttempted to view your plans Microsoft Internet Explorer 7.
I blame this site. In Chrome, I always have to right-click the .sqlplan files, Save As, and then open from SSMS. Clicking on them directly just opens the XML in a browser window - not the desired result.
September 20, 2010 at 11:21 am
bitbucket-25253 (9/20/2010)
Brad SchulzAttempted to view your plans Microsoft Internet Explorer 7 . Result:
The XML page cannot be displayed
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.
I see that Paul beat me to it, but he's right.
I didn't save the plans as images, but rather I saved the plans as .SQLPLAN files, which in reality means that they are saved in XML format, suitable only for viewing in SSMS. If you save those .SQLPLAN files onto your desktop (or wherever you wish on your computer), you should be able to double-click on them to have them open in SSMS.
September 20, 2010 at 11:23 am
Brad Schulz (9/20/2010)
Attached are the output files... but here's the gist:SQL2008 (64-bit):
Tally: 314370ms CPU, 87639ms duration (Note the huge CPU, indicating all 4 processors were working hard)
XML: 1451ms CPU, 1417ms duration
Loop: 171ms CPU, 197ms duration
SQL2005 (32-bit):
Tally: 78ms CPU, 44ms duration (Whoa!!)
XML: 1371ms CPU, 1163ms duration
Loop: 172ms CPU, 205ms duration
Hey Brad,
Get you and your flash parallel plans 😛
Anyway, thanks for posting the 2005 plans - it saved me a very tedious job. I wonder what the serial plans look like? (Not important - just curious).
I'm going with 'luck' as the cause of the 2005 plan - just look at the estimated versus actual row counts above the nested loops join of Tally and SalesOrderDetail! When costing information is this wrong, anything could happen!
Paul
September 20, 2010 at 11:24 am
Paul White NZ (9/20/2010)
It's just a 'performance spool' (sort + lazy spool combination) introduced based on inaccurate/missing statistical information. If you disable the performance spool optimization with DBCC TRACEON (8690), you get a much better plan.
Just to be sure, you're [font="Arial Black"]not [/font]recommending that this flag be set permanently, right? I don't want anyone to think that this should be done on a permanent basis. In order to do this dorrectly, we really need to make the query sargable.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 20, 2010 at 11:28 am
mazzz (9/20/2010)
JeffI've just seen your update in the OP - do you need any more results?
My home setup is Win 7, Intel core duo CPU, 4GB RAM, 64bit, SQL2008 R2 if it's of any interest
I'll never turn down a set of data points, Mazzz. I'd love to see the results from above including that second test I asked folks with 2K8 to run. Thanks for your effort on this.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 20, 2010 at 11:29 am
Jeff Moden (9/20/2010)
Just to be sure, you're [font="Arial Black"]not [/font]recommending that this flag be set permanently, right? I don't want anyone to think that this should be done on a permanent basis. In order to do this dorrectly, we really need to make the query sargable.
No. In all the excitement, I forgot the usual warnings.
The trace flag affects other optimizer features, not just perf spool. You would not enable it on a production system without instructions from Microsoft CSS.
I recommend only setting it for the current session (so not using the '-1' part), on a test server, turning it off afterward, and issuing DBCC FREEPROCCACHE to clear any duff plans from cache.
Not sure why you mention SARGable - a lack of index seeks isn't the issue, cost estimation is.
Paul
September 20, 2010 at 11:32 am
Brad Schulz (9/20/2010)
Very interesting post from Paul regarding the Spooling and Trace 8690... If I turn that trace flag on in SQL2008, the performance improves spectacularly.(Also, just as a note, my testing both in SQL2005 and SQL2008 are both on AdventureWorks... the old "traditional" one... not the newer AdventureWorks2008 database).
Anyway, running Jeff's new test, it was very enlightening (without Paul's trace flag)... In short, SQL2008 puts together a lousy plan for the Tally table, and SQL2005 doesn't.
Attached are the output files... but here's the gist:
SQL2008 (64-bit):
Tally: 314370ms CPU, 87639ms duration (Note the huge CPU, indicating all 4 processors were working hard)
XML: 1451ms CPU, 1417ms duration
Loop: 171ms CPU, 197ms duration
SQL2005 (32-bit):
Tally: 78ms CPU, 44ms duration (Whoa!!)
XML: 1371ms CPU, 1163ms duration
Loop: 172ms CPU, 205ms duration
I've also attached the query plans of the SQL2005 and the SQL2008 queries on the tally table.
As Paul said, SQL2008 uses a spool, and SQL2005 does not. And because of that, they attack the query from different perspectives... you can see the order of the tables as they are accessed in the STATISTICS output (and in the attached actual execution plans).
If you look at the SQL2008 query plan... OUCH! The spool on the tally table spit out 121,317,000 rows... in other words, 1000 rows for each of the 121317 rows in the SalesOrderDetail table... That spool was BUSY!
The SQL2005 plan uses no spool and therefore the logical reads are much lower.
--Brad
Brad... what do you get in 2K8 for that same test with the Trace Flag turn on?
--Jeff Moden
Change is inevitable... Change for the better is not.
September 20, 2010 at 11:36 am
Paul White NZ (9/20/2010)
Jeff Moden (9/20/2010)
Just to be sure, you're [font="Arial Black"]not [/font]recommending that this flag be set permanently, right? I don't want anyone to think that this should be done on a permanent basis. In order to do this dorrectly, we really need to make the query sargable.No. In all the excitement, I forgot the usual warnings.
The trace flag affects other optimizer features, not just perf spool. You would not enable it on a production system without instructions from Microsoft CSS.
I recommend only setting it for the current session (so not using the '-1' part), on a test server, turning it off afterward, and issuing DBCC FREEPROCCACHE to clear any duff plans from cache.
Not sure why you mention SARGable - a lack of index seeks isn't the issue, cost estimation is.
Paul
Ah... thanks.
The reason why I mention SARGable is because I believe that it might be able to do a better job at cost estimation if it were SARGable.. As Brad points out, it's currently doing the equivalent of a Cross Join between the 1000 elements in the parameter and 121 K rows in the SalesOrderDetails table.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 20, 2010 at 11:39 am
Jeff
Would you like a run on 2008 R2? Will have to be this weekend at earliest, cause I'm not planning to upgrade before then.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 20, 2010 at 11:47 am
Jeff ... ran test with the trace flag on (and yes I turned it off when the test completed).
Posted my results at Posted Today @ 11:25 AM
Viewing 15 posts - 121 through 135 (of 214 total)
You must be logged in to reply to this topic. Login to reply