Cross Apply Executing Too Many Times

Question

Post reply

Cross Apply Executing Too Many Times

nhansen pcc

Say Hey Kid

Points: 706
More actions
March 11, 2013 at 1:02 pm

#290863

I have a rather complex set of queries that are mostly embedded in table valued functions and call one another in chains. I've had a number of very peculiar performance issues that seem to come and go, but that don't seem to be directly related to the queries themselves (heavily indexed tables).
Currently I'm examining a particular cross apply that seems to be behaving strangely. I'm simplifying a bit, but the two following queries return the "same" result, just different ways; the main difference is that the first one has an additional column hardcoded instead of relying on the cross join to fill it in.
select 'LA13G12ST-05', 10000, pidcm.met, pidcm.quantityFound
from table_valued_function('LA13G12ST-05', 10000, '00000630', 1, 'S0000013122', 'LA2013') pidcm
union all
select 'LA13G12ST-05', 20000, pidcm.met, pidcm.quantityFound
from table_valued_function('LA13G12ST-05', 20000, '00000630', 1, 'S0000013122', 'LA2013') pidcm
select 'LA13G12ST-05', pidc.[Line No_], pidcm.met, pidcm.quantityFound
from (select distinct pidc.[Line No_]
from table1 pid
join table2 pidc
on pid.[Pidc Code] = pidc.Code
where pid.Code = 'LA2013'
and pid.[Line No_] = 10000) pidc
cross
apply table_valued_function('LA13G12ST-05', pidc.[Line No_], '00000630', 1, 'S0000013122', 'LA2013') pidcm
Yes, the subquery does return exactly two rows, 10000 and 20000
So, when I run the both queries, I get the expected (and identical) results from both
LA13G12ST-051000003
LA13G12ST-052000000
LA13G12ST-051000003
LA13G12ST-052000000
However, the statistics are very different:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 3 ms.
(2 row(s) affected)
Table '#71DCD509'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#70E8B0D0'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 813 ms, elapsed time = 809 ms.
(2 row(s) affected)
Table '#71DCD509'. Scan count 73, logical reads 78, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table2'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table1'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 28859 ms, elapsed time = 28898 ms.
Curious on why the latter required 73 times as many scans, I ran the SQL profiler, and found that the table valued function is running 72 times in the cross apply example.
It may be purely a coincidence, but there happen to be 72 rows in the subquery if you remove the where clause from it. So, it looks to me as if SQL Server is unrolling my subquery, moving its "where" to the outer query, executing the cross apply, and then filtering. Unfortunately, this ends up being much more expensive.
So, is there some way to convince the query optimizer to do otherwise? I had expecting that adding the "distinct" to the subquery (not technically needed) would help the optimizer know it should do the inner part first, but it didn't.
Being from an Oracle background, the first thing that comes to mind is to put the inner query in a cursor, and loop through its results calling the function for each, so I can control the number of function calls. Such behavior seems to be frowned upon in this forum, and disparagingly referred to as RBAR. So, I'm open to alternatives.
I'm sure someone will ask whether it is possible to avoid table valued functions and roll all the logic into a single SQL statement. It would be very difficult (perhaps impossible) because of two things:
1) Analytic functions aren't fully available in SQL Server 2008R2 (and I can't easily upgrade to 2012); specifically, I need to limit something by a running total, and the "rows between" clause isn't available yet on the analytic sum function.
2) In several cases, completely different queries need to run depending upon some other setting or function result (case statement runs different queries). It might be possible perhaps to join all the possible results and discard the ones that don't apply, but I'm not sure that would be good for performance anyway.
Edit:
I found a workaround, sort of; I can select the subquery into a temporary table and then cross apply against that temporary table instead of the subquery. That seems to solve the particular problem, but it also forces this function to be a multi-statement table valued function, whereas before it could have been in-line. Since it is calling a multi-statement TVF (one with a gigantic case), I suppose the distinction is largely irrelevant.
I don't suppose there is some optimizer hint that says "this part is expensive"? I tried force order, but it didn't help (maybe it applies only to joins, not applies).

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

ian 70072 Newbie Points: 8 More actions · Answer 1

ian 70072

Newbie

Points: 8

January 14, 2019 at 9:29 am

#2018846

OPTION (RECOMPILE)
fixed this issue for me.

Jonathan AC Roberts SSCoach Points: 18380 More actions · Answer 2

CROSS APPLY is a row level operator. So your table_valued_function will be run once for each row the rest of the query returns.
So my guess is that the query:
select * from (select distinct pidc.[Line No_] from table1 pid join table2 pidc on pid.[Pidc Code] = pidc.Code where pid.Code = 'LA2013' and pid.[Line No_] = 10000) pidc
will return over 70 rows.

nhansen pcc Say Hey Kid Points: 706 More actions · Answer 3

Thanks for the input. But since I posted this question six years ago, I don't have all the code instrumented and ready to test suggestions right now.

Jonathan AC Roberts SSCoach Points: 18380 More actions · Answer 4

LOL, Sorry I didn't notice the original post was that old. Someone else brought it back from the dead.

Jeff Moden SSC Guru Points: 1004475 More actions · Answer 5

Jonathan AC Roberts - Monday, January 14, 2019 10:35 AM
CROSS APPLY is a row level operator. So your table_valued_function will be run once for each row the rest of the query returns.
So my guess is that the query:
select * from (select distinct pidc.[Line No_] from table1 pid join table2 pidc on pid.[Pidc Code] = pidc.Code where pid.Code = 'LA2013' and pid.[Line No_] = 10000) pidc
will return over 70 rows.

I'm pretty sure that's not true for iTVFs. To be clear as to what I mean, while it may look like an iTVF is working as if it were being called for every row, it works more like a view and isn't actually called once for each row. That's a part of what makes iTVFs to much faster than even an equivalent mTVF.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jonathan AC Roberts SSCoach Points: 18380 More actions · Answer 6

Jeff Moden - Monday, January 14, 2019 5:54 PM
Jonathan AC Roberts - Monday, January 14, 2019 10:35 AM
CROSS APPLY is a row level operator. So your table_valued_function will be run once for each row the rest of the query returns.
So my guess is that the query:
select * from (select distinct pidc.[Line No_] from table1 pid join table2 pidc on pid.[Pidc Code] = pidc.Code where pid.Code = 'LA2013' and pid.[Line No_] = 10000) pidc
will return over 70 rows.
I'm pretty sure that's not true for iTVFs. To be clear as to what I mean, while it may look like an iTVF is working as if it were being called for every row, it works more like a view and isn't actually called once for each row. That's a part of what makes iTVFs to much faster than even an equivalent mTVF.

It might not be called once for each row but the results are the same as if it were called once for each row. So it's a good way to think about what results you are going to get. And in this case it looks like the optimiser hasn't worked out a better way to do it.

Jeff Moden SSC Guru Points: 1004475 More actions · Answer 7

Jonathan AC Roberts - Monday, January 14, 2019 7:36 PM
Jeff Moden - Monday, January 14, 2019 5:54 PM
Jonathan AC Roberts - Monday, January 14, 2019 10:35 AM
CROSS APPLY is a row level operator. So your table_valued_function will be run once for each row the rest of the query returns.
So my guess is that the query:
select * from (select distinct pidc.[Line No_] from table1 pid join table2 pidc on pid.[Pidc Code] = pidc.Code where pid.Code = 'LA2013' and pid.[Line No_] = 10000) pidc
will return over 70 rows.
I'm pretty sure that's not true for iTVFs. To be clear as to what I mean, while it may look like an iTVF is working as if it were being called for every row, it works more like a view and isn't actually called once for each row. That's a part of what makes iTVFs to much faster than even an equivalent mTVF.
Jeff Moden - Monday, January 14, 2019 5:54 PM
Jonathan AC Roberts - Monday, January 14, 2019 10:35 AM
CROSS APPLY is a row level operator. So your table_valued_function will be run once for each row the rest of the query returns.
So my guess is that the query:
select * from (select distinct pidc.[Line No_] from table1 pid join table2 pidc on pid.[Pidc Code] = pidc.Code where pid.Code = 'LA2013' and pid.[Line No_] = 10000) pidc
will return over 70 rows.
I'm pretty sure that's not true for iTVFs. To be clear as to what I mean, while it may look like an iTVF is working as if it were being called for every row, it works more like a view and isn't actually called once for each row. That's a part of what makes iTVFs to much faster than even an equivalent mTVF.
It might not be called once for each row but the results are the same as if it were called once for each row. So it's a good way to think about what results you are going to get. And in this case it looks like the optimiser hasn't worked out a better way to do it.

I just don't want people walking away from this thinking that iTVFs are as bad as a Scalar UDF or and mTVF. For me, I think of a CROSS APPLY to an iTVF is more like joining to a view but with a lot more control.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)