March 20, 2014 at 8:20 am
I’ve never written a query with multiple APPLY joins before and I’m running into some troubles with my first one. The below SQL statement runs within 10 seconds if I comment out either one of the APPLY joins and its corresponding field columns. However, when I try to execute with both APPLY joins, the query runs indefinitely. The longest I’ve waited before cancelling it is 90 minutes.
Now, I know there are probably other ways I could write this query to get me the results I’m looking for. I’m posting this on the board because I’m curious about finding out why multiple APPLY joins could cause SQL Server to run away. I’m hoping to gain some insight so that I can better understand how APPLY joins work so that in case I have a big need to do this again in the future (without suitable workarounds) I can code it correctly.
Here are some things I’ve tried so far…
1.Changed the States table into a subquery that only returns a single state
2.Change all the references inside the APPLY subqueries so that they had different aliases (just in case they were conflicting with each other).
3.Changed the CROSS applies to OUTER applies. States has 50 records and only 32 have matching permit data so the 18 extra iterations using OUTER APPLY don’t impact performance any when an APPLY is used by itself.
SELECT s.state_name
, COUNT(DISTINCT DUPS.PermitNumber) AS NumOfDupPermits
, SUM(DistinctPermits) AS DistinctPermits
FROM States S
CROSS APPLY (SELECT w.StateID, COUNT(*) as DistinctPermits
FROM Permit P INNER JOIN Well W1 ON P.WellID = w.WellID
WHERE w.StateID=S.Stateid
GROUP BY P.PermitNumber, w.StateID) STD
CROSS APPLY (select P.PermitNumber, w.StateID AS State_IDs, COUNT(*) as counts
FROM Permit P INNER JOIN Well W2 ON P.WellID = w.WellID
WHERE w.StateID=S.Stateid
GROUP BY P.PermitNumber, w.StateID
HAVING count(*)>1) DUPS
group by s.state_name
March 20, 2014 at 10:45 am
Do you have the DDL for these tables and some sample data?
March 20, 2014 at 4:26 pm
Take a look at the estimated execution plan. It will tell you the choices SQL Server is making based on the code you provided and the statistics available on the tables and indexes.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 21, 2014 at 3:28 am
Check your table aliases:
SELECT s.state_name
, COUNT(DISTINCT DUPS.PermitNumber) AS NumOfDupPermits
, SUM(DistinctPermits) AS DistinctPermits
FROM States S
CROSS APPLY (
SELECT w.StateID, COUNT(*) as DistinctPermits
FROM Permit P
INNER JOIN Well W1 ON P.WellID = w.WellID -- W1 / w
WHERE w.StateID = S.Stateid
GROUP BY P.PermitNumber, w.StateID
) STD
CROSS APPLY (
select P.PermitNumber, w.StateID AS State_IDs, COUNT(*) as counts
FROM Permit P
INNER JOIN Well W2 ON P.WellID = w.WellID -- W2 / w
WHERE w.StateID = S.Stateid
GROUP BY P.PermitNumber, w.StateID
HAVING count(*)>1
) DUPS
group by s.state_name
Since this query as it stands cannot run, it has to be different to the actual query you are running. The differences could be significant. If you post the estimated execution plan as Grant suggests, we get to see your actual query, and a few other useful clues too.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 27, 2014 at 9:45 am
ChrisM@Work (3/21/2014)
Check your table aliases:
SELECT s.state_name
, COUNT(DISTINCT DUPS.PermitNumber) AS NumOfDupPermits
, SUM(DistinctPermits) AS DistinctPermits
FROM States S
CROSS APPLY (
SELECT w.StateID, COUNT(*) as DistinctPermits
FROM Permit P
INNER JOIN Well W1 ON P.WellID = w.WellID -- W1 / w
WHERE w.StateID = S.Stateid
GROUP BY P.PermitNumber, w.StateID
) STD
CROSS APPLY (
select P.PermitNumber, w.StateID AS State_IDs, COUNT(*) as counts
FROM Permit P
INNER JOIN Well W2 ON P.WellID = w.WellID -- W2 / w
WHERE w.StateID = S.Stateid
GROUP BY P.PermitNumber, w.StateID
HAVING count(*)>1
) DUPS
group by s.state_name
Since this query as it stands cannot run, it has to be different to the actual query you are running. The differences could be significant. If you post the estimated execution plan as Grant suggests, we get to see your actual query, and a few other useful clues too.
Yes, sorry about posting invalid SQL. Right before I made this post, I tried messing with the aliases (#2 in my original post) and when that was unsuccessful I thought I had reverted all the changes I made but definitely missed some.
Here's the functional SQL.
SELECT s.state_name
, COUNT(DISTINCT DUPS.StatePermitSerialNumber) AS NumOfDupPermits, SUM(DUPS.AmendedDups) As AmendedDups --<<comment out these fields when you comment out the DUPS apply join
, SUM(DistinctPermits) AS DistinctPermits --<<comment out this field when you comment out the STD apply join
FROM States S
CROSS APPLY (SELECT W1.StateID, COUNT(*) as DistinctPermits
FROM Permit P1 INNER JOIN Well W1 ON P1.WellID = W1.WellID
WHERE w1.StateID=S.Stateid
GROUP BY p1.StatePermitSerialNumber, w1.StateID) STD
CROSS APPLY (select P2.StatePermitSerialNumber, w2.StateID AS State_IDs, COUNT(*) as counts,SUM(DISTINCT ISNULL(P2.AmendedPermit,0)) AmendedDups
FROM Permit P2 INNER JOIN Well W2 ON P2.WellID = W2.WellID
WHERE w2.StateID=S.Stateid
GROUP BY p2.StatePermitSerialNumber, w2.StateID
HAVING count(*)>1) DUPS
group by s.state_name
I've attached the estimated execution plan for the query with both joins and then I've included the actual execution plan for two queries where one of the joins is commented out. Thanks for your help!
March 27, 2014 at 2:15 pm
I think you can simplify the query something like below, although some tweaking might need done:
SELECT
s.state_name,
derived.NumOfDupPermits, derived.AmendedDups, derived.DistinctPermits
FROM (
SELECT
W.StateID,
COUNT(P.StatePermitSerialNumber) - COUNT(DISTINCT P.StatePermitSerialNumber) AS NumOfDupPermits,
CASE WHEN COUNT(P.StatePermitSerialNumber) - COUNT(DISTINCT P.StatePermitSerialNumber) <= 0 THEN 0
ELSE SUM(DISTINCT ISNULL(P.AmendedPermit,0)) END AS AmendedDups,
COUNT(DISTINCT P.StatePermitSerialNumber) AS DistinctPermits
FROM Permit P
INNER JOIN Well W ON
P.WellID = W.WellID
GROUP BY
W.StateID
) AS derived
INNER JOIN States S ON
S.StateID = derived.StateID
ORDER BY
s.state_name
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 27, 2014 at 3:24 pm
Thanks.
So I actually came up with an alternate solution shortly after making this post. However, getting an alternate solution was not my goal with this post. My goal with this post was to understand SQL Server better and try to understand why two APPLY joins causes this query to run away. I haven't done many APPLY joins so if there's something I'm doing wrong here that causes multiple APPLY joins to go crazy I want to know.
March 27, 2014 at 5:46 pm
bncaffey (3/27/2014)
Thanks.So I actually came up with an alternate solution shortly after making this post. However, getting an alternate solution was not my goal with this post. My goal with this post was to understand SQL Server better and try to understand why two APPLY joins causes this query to run away. I haven't done many APPLY joins so if there's something I'm doing wrong here that causes multiple APPLY joins to go crazy I want to know.
First, how many rows are in Well and Permit tables?
Second, would you please post the actual code? All of your execution plans say they're using "tblWell" and "tblPermit". Yes, we'll make fun of you for having "tbl" in your database but that's only temporary. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
March 28, 2014 at 12:05 pm
Well has 5.6M rows and Permit has 1.4M rows.
The SQL posted is the actual working SQL. I just started at this company (yes, these tables are in a company data mart) so views were created to hide the tbl-ness!
March 31, 2014 at 9:03 am
So does anyone have experience writing queries with multiple APPLY joins?
March 31, 2014 at 9:32 am
bncaffey (3/31/2014)
So does anyone have experience writing queries with multiple APPLY joins?
Yes, but I've not observed this before. I can't see anything obvious in the plans either. If you have the patience and the time to do it, the actual plan might offer more clues.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply