First attempt at a blog

  • Could a few folks take a look at

    http://bit.ly/lEFRls

    and let me know what you think? It's a comparison of execution plans for 4 ways to get the same result.

    Thanks.

    ab

  • alan.berger (4/27/2011)


    Could a few folks take a look at

    http://bit.ly/lEFRls

    and let me know what you think? It's a comparison of execution plans for 4 ways to get the same result.

    Thanks.

    ab

    I can't speak for others, but I won't usually click on an unknown URL on this forum unless I know you very, very well. I recommend you save the 4 execution plans as files that folks can open, and attach them to your post.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff...

    Good point. I never would have thought of that as a concern. The full, safe, we-know-where-it-goes URL is

    http://alan-berger.blogspot.com/2011/04/outer-join-subquery-cte-or-udf-you-be.html

    But should I just put the whole text here? I didn't want to put it in the blog section in case I was out of my (league/mind - you choose).

    Thanks for the input!

    ab

  • Ah... my bad. I thought you were asking a question and I was suggesting that you attach the Execution Plans you were having trouble with.

    First, thanks for posting a real URL. I had a look. Nicely done. Just a concern of mine that Grant Fritchey will back me up on... execution plans frequently lie. Sometimes really, really badly. I don't even bother looking at the costs in execution plans anymore. Oh, yes, I still rely heavily on execution plans, but not on their cost indicators.

    Instead, I've taken to relying on SQL Profiler. It's remarkable at finding things that execution-plan costs will never tell you. Things like: what is the real cost of using a recursive CTE in terms of Reads?

    My recommendation for that good blog would be to run the code with SQL Profiler turned on for that given spid and then post the results along with your execution plans.

    --Jeff Moden



  • Thanks Jeff...going to work on that later tonight.

    Besides using the profiler, is there any obvious (or not so obvious) way to get the QOH?

    Truly appreciate your time!!

    ab

    I believe you can use certain DMVs to check on that. I just don't remember which ones they are.
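
    If what's wanted is the Reads figure mentioned earlier (rather than quantity on hand), one candidate is sys.dm_exec_query_stats, which tracks cumulative I/O per cached plan. A sketch only; this may or may not be the DMV Jeff had in mind:

    ```sql
    -- Top cached queries by cumulative logical reads.
    -- sys.dm_exec_query_stats and sys.dm_exec_sql_text are standard DMVs;
    -- the LIKE filter is just to narrow the output to these test queries.
    SELECT TOP (10)
           qs.execution_count,
           qs.total_logical_reads,
           st.text AS query_text
    FROM   sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
    WHERE  st.text LIKE '%ProductInventory%'
    ORDER BY qs.total_logical_reads DESC;
    ```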

    --Jeff Moden



  • Posted on your blog.

    Steve

  • To really test query performance, you need bigger datasets than this. To legitimize a few apples:apples points, try a test like this:

    SET NOCOUNT ON;

    IF OBJECT_ID(N'tempdb..#T1') IS NOT NULL
        DROP TABLE #T1;
    IF OBJECT_ID(N'tempdb..#T2') IS NOT NULL
        DROP TABLE #T2;
    IF OBJECT_ID(N'tempdb..#T3') IS NOT NULL
        DROP TABLE #T3;
    IF OBJECT_ID(N'tempdb..#T4') IS NOT NULL
        DROP TABLE #T4;

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- Version 1: outer join with GROUP BY
    SELECT  pp.ProductID,
            pp.Name AS ProductName,
            SUM(ppi.Quantity) AS Quantity
    INTO    #T1
    FROM    Production.Product pp
    LEFT JOIN Production.ProductInventory ppi
            ON pp.ProductID = ppi.ProductID
    GROUP BY pp.ProductID, pp.Name;

    -- Version 2: correlated subquery
    SELECT  pp.ProductID,
            pp.Name AS ProductName,
            (SELECT SUM(ppi.Quantity)
             FROM   Production.ProductInventory ppi
             WHERE  ppi.ProductID = pp.ProductID) AS Quantity
    INTO    #T2
    FROM    Production.Product pp;

    -- Version 3: CTE
    WITH SummaryRecords AS
        (SELECT ProductID,
                SUM(Quantity) AS Quantity
         FROM   Production.ProductInventory
         GROUP BY ProductID)
    SELECT  pp.ProductID,
            pp.Name AS ProductName,
            sr.Quantity
    INTO    #T3
    FROM    Production.Product pp
    LEFT JOIN SummaryRecords sr
            ON pp.ProductID = sr.ProductID;

    -- Version 4: inline table-valued UDF
    SELECT  pp.ProductID,
            pp.Name AS ProductName,
            u.Quantity
    INTO    #T4
    FROM    Production.Product pp
    LEFT JOIN dbo.udf_qohAggregateTbl() u
            ON pp.ProductID = u.ProductID;

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;

    -- Confirm all four versions return identical results
    SELECT * FROM #T1
    EXCEPT
    SELECT * FROM #T2;

    SELECT * FROM #T2
    EXCEPT
    SELECT * FROM #T3;

    SELECT * FROM #T3
    EXCEPT
    SELECT * FROM #T4;

    The statistics data will tell you more about the workload on a query than the "estimated costs". The "costs" are junk data, as Jeff already pointed out. I've seen them lie far too often.

    The Time stats are pretty much useless on a dataset this size; I only included them out of habit and kept them for consistency. The I/O stats tell the real story: the four queries all require exactly the same amount of table reading.

    The final bit, after turning the stats off, is to make sure the queries really are producing identical results. Always important to know when testing variations.

    I removed the Order By clauses, since they're unnecessary for performance-testing the base queries. Also, having them all insert into temp tables eliminates the overhead of returning a result set to the client, which can skew time stats in some cases.

    As for other versions of the query, the only ones I can readily think of would be a scalar UDF instead of the inline subquery, or a Cross Apply in the From clause:

    SELECT  pp.ProductID,
            pp.Name AS ProductName,
            Sub.Quantity
    FROM    Production.Product pp
    CROSS APPLY (SELECT SUM(ppi.Quantity) AS Quantity
                 FROM   Production.ProductInventory ppi
                 WHERE  ppi.ProductID = pp.ProductID) AS Sub;
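
    For completeness, the scalar-UDF variant mentioned above could look something like the sketch below; the function name dbo.udf_qohScalar and its body are my own guess, not code from the blog:

    ```sql
    -- Hypothetical scalar UDF (the name udf_qohScalar is made up here).
    -- A scalar UDF is invoked once per outer row, which is exactly the
    -- kind of hidden per-row cost that Profiler exposes better than
    -- STATISTICS IO does.
    CREATE FUNCTION dbo.udf_qohScalar (@ProductID INT)
    RETURNS INT
    AS
    BEGIN
        RETURN (SELECT SUM(Quantity)
                FROM   Production.ProductInventory
                WHERE  ProductID = @ProductID);
    END;
    GO

    SELECT pp.ProductID,
           pp.Name AS ProductName,
           dbo.udf_qohScalar(pp.ProductID) AS Quantity
    FROM   Production.Product pp;
    ```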

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Only took a month, but attempt 2 posted: alan-berger.blogspot.com
