outer apply alternative

  • My goal is to get the results below. Trying to get the nearest date and result from the #Data table where the resultdate<=ProcDate. An outer apply is taking 4 hours for a 400 row table. Another more efficient way of doing this? I have several more '#Data' tables (CTE's) to join to the #patient table in the same manner. Thanks for the help

    Patient/ProcDate/ResultDate/Value

    1/2020-01-08/2020-01-02, 2

    2/2020-01-02/2020-01-01, 8

    3/2020-01-03/

     

     

    create table #patient

    (Patient Int,

    ProcDate date

    )

    Insert into #patient

    values

    (1,'2020-01-08')

    ,(2,'2020-01-02')

    ,(3,'2020-01-03')

    create table #Data

    (

    Patient Int,

    ResultDate date,

    Value Int

    )

    Insert into #Data

    values

    (1,'2020-01-02', 2)

    ,(1,'2020-01-10', 3)

    ,(2,'2020-01-01', 8)

    ,(2,'2020-01-04', 5)

    ,(2,'2020-01-11', 4)

    ,(3,'2020-01-08', 6)

  • 4 hours seems excessive. Are the join columns on the two tables indexed? Did you look at the execution plan?

    What's your SQL query look like? Something like this? (The FREEPROCCACHE is to make sure I'm not reading cached data pages)

    DBCC FREEPROCCACHE;

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    SELECT p.Patient
    , p.ProcDate
    , ca.ResultDate
    , ca.Val
    FROM #patient p
    CROSS APPLY (SELECT TOP 1 *
    FROM #Data d
    WHERE d.Patient = p.Patient
    AND d.ResultDate <= p.ProcDate
    ORDER BY d.ResultDate) ca

    But without information about your indexing on the two tables, there's not much to suggest.

    To make sure I wasn't just talking smack, I tested this out on AdventureWorks... And the query took hardly any time at all. So, there must be something you're not telling us? Here's the messages I got back:

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    (28586 rows affected)

    Table 'Worktable'. Scan count 19820, logical reads 104313, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    Table 'SalesOrderHeader'. Scan count 1, logical reads 689, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    Table 'Customer'. Scan count 1, logical reads 37, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 437 ms, elapsed time = 893 ms.

    How many records are in your two tables, and what indexes do you have on each?

    • This reply was modified 3 years, 5 months ago by  pietlinden.
    • This reply was modified 3 years, 5 months ago by  pietlinden. Reason: More information
  •  

    Using the representative data you have supplied do you actually need to use cross apply, a simple left  join will return your expected results

     

     select a.patient,ProcDate,ResultDate,Value
    from #patient a
    left join #data b on a.patient=b.patient
    and resultdate<=isnull(procdate,getdate())

     

    TEST

    ***The first step is always the hardest *******

  • boehnc wrote:

    My goal is to get the results below. Trying to get the nearest date and result from the #Data table where the resultdate<=ProcDate. An outer apply is taking 4 hours for a 400 row table. Another more efficient way of doing this? I have several more '#Data' tables (CTE's) to join to the #patient table in the same manner. Thanks for the help

    Patient/ProcDate/ResultDate/Value

    1/2020-01-08/2020-01-02, 2

    2/2020-01-02/2020-01-01, 8

    3/2020-01-03/

    create table #patient (Patient Int, ProcDate date )

    Insert into #patient

    values

    (1,'2020-01-08') ,(2,'2020-01-02') ,(3,'2020-01-03')

    create table #Data

    ( Patient Int, ResultDate date, Value Int

    )

    Insert into #Data

    values (1,'2020-01-02', 2) ,(1,'2020-01-10', 3) ,(2,'2020-01-01', 8) ,(2,'2020-01-04', 5) ,(2,'2020-01-11', 4) ,(3,'2020-01-08', 6)

    Even a bad outer apply shouldn't take 4 hours.  Please post the code and the table defintion so we can have a look-see.  Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply