OUTER APPLY performance

  • I have recently been introduced to the wonders of using APPLY. I've been working on a report, and I'm using OUTER APPLY on a subquery and I'm wondering if there might be some hidden performance cost that I'm not aware of.

    in the following sample, each customer is assigned to a shipment. a shipment can have multiple customers. each customer_carton is assigned to a customer and a customer can have multiple customer_cartons. I'm trying to get the bar_code of the oldest carton for each customer.

    SELECT s.shipment_id

    ,c.customer_id

    ,c.customer_name

    ,bc.bar_code

    FROM shipment s

    JOIN customer c ON s.shipment_id = c.shipment_id

    OUTER APPLY (

    SELECT cc.customer_id

    ,cc.bar_code

    ,RANK() OVER (PARTITION BY cc.customer_id

    ORDER BY cc.create_date ASC) AS bc_order

    FROM customer_carton cc

    WHERE cc.customer_id = c.customer_id ) bc

    WHERE bc.bc_order = 1

    If anyone has any advice on this, I'd love to hear it.

    /*

    here are some statistics

    customer_carton Scan count 9, logical reads 99, physical reads 0

    customer Scan count 10, logical reads 21, physical reads 0

    shipment Scan count 0, logical reads 20, physical reads 5

    the tables don't have much data in them at the moment so it's hard for me to tell

    what the performance might be like with more data

    */

  • Do you have the actual execution plan? Zip it up & post it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • My employer doesn't permit me do that (the object names in the sample are fake). Looking at the execution plan the two biggest cost % are an index scan and sort inside the OUTER APPLY subquery.

    -------------------------------

    A Clustered Index Scan on customer_carton 51%

    Predicate:

    customer_carton.customer_id as cc.customer_id =customer.customer_id as c.customer_id

    Object:

    customer_carton.PK_customer_carton cc

    Output List:

    cc.create_date, cc.bar_code

    -------------------------------

    Sort 41%

    Output List:

    customer_carton.create_date, customer_carton.bar_code

    Order By:

    customer_carton.create_date asc

    I know I need to put an index on customer_carton.customer_id, but i don't know what to do about the Order By in the Rank() function. Also, I don't know what the db engine does to interpret the OUTER APPLY.

  • Estimated costs are not always the best measures. However, a clustered index scan is the same thing as a table scan. So you need to adjust the query to make better use of the clustered index or you should consider creating a nonclustered index, preferably one that is covering.

    An index on the column being sorted could help.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Frequently an APPLY can be as fast as a JOIN, however, there are many times when it will not be. It is very rare that an APPLY is ever faster than an equivalent JOIN (exceptions occur sometimes with multi-statement TVF's).

    I would suggest that you try your query with an OUTER JOIN to see if it makes any difference:

    SELECT s.shipment_id

    ,c.customer_id

    ,c.customer_name

    ,bc.bar_code

    FROM shipment s

    JOIN customer c ON s.shipment_id = c.shipment_id

    LEFT OUTER JOIN (

    SELECT cc.customer_id

    ,cc.bar_code

    ,RANK() OVER (PARTITION BY cc.customer_id

    ORDER BY cc.create_date ASC) AS bc_order

    FROM customer_carton cc) bc

    ON cc.customer_id = bc.customer_id

    AND bc.bc_order = 1

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks for the help people. I'll have to compare the two methods once we get the database fully loaded, but right now the left outer joins are a bit faster. Propper indexing will pick things up the rest of the way.

    On a side note, has anyone found an efficient use for APPLY that doesn't involve UDF's?

  • RBarryYoung how were you able to make a join between cc.Customer_ID and bc.Customer_ID ? cc is inside your subquery and it is not visible outside that query.

    Your query should have thrown an error "The multi-part identifier "cc.Customer_ID" could not be bound".

    did you mean to write c.Customer_ID = bc.Customer_ID

    Thanks

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply