June 4, 2012 at 10:27 am
I have a main record table, joined in one-to-many relationships with around 20 aux tables. When filtering the main table, I may use conditions from fields containing text data, as well as from the linked tables via the foreign key. I've been playing with various methods of assembling the queries, and come up with several alternatives, but I'm sort of stuck on which (if any) are better or worse. The text fields are obvious - no choice there, but there are several ways to word the conditions regarding the aux tables. I tried three here: selecting on the foreign key field in the main table, selecting on the primary key in the aux table joined via a left join (there may be other conditions that will evaluate to TRUE, in which case I want the record, even though it didn't pass this test) and selecting on an embedded select clause joined with an inner join.
#1. SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
LEFT JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID
WHERE P.GenusAutoID = 35423
#2. SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
LEFT JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID
WHERE G.GenusAutoID = 35423
#3.SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
INNER JOIN
(SELECT GenusAutoID, Genus FROM dbo.vwTableOfGenuss WHERE GenusAutoID = 35423) G
ON P.GenusAutoID = G.GenusAutoID
The three execution plans are attached as well - I hope I did that right. These queries are extremely simplified, for test purposes. The actual queries return more fields and join more tables.
Versions 2 and 3 appear identical, so it seems that the optimizer is parsing them down to the same final result. Plan 1 looks simpler in the graphic view, but of course that means nothing from an execution standpoint. Here the differences are trivial anyway, since the condition is simple - the response time is instant. But what happens when I join more tables, and add text fields?
Is there likely to come a point when the specified subquery and inner join of #3 make a difference?
If I have more fields, and text fields, will the optimizer decide that it's easier to do everything from the main table, rather than first gathering many aux tables?
Is the optimizer even capable of making such a decision, or will it obediently test exactly the fields I specify, even if testing the other side of the linking key might be better?
It seems to me that maybe it is that smart:
- In the first query, I get the same results whether I specify a left or inner join, although the query plans are different.
- In the second query, both the execution plan and results are the same for left or inner. In fact, it looks like the same plan as for Query #1 using the inner join.
- In the third query, the results are wrong (all records) when I use left, correct when I use inner. (Obviously, the plans are different.) The correct plan again looks the same as the inner join plan for Query #1 and both versions of Query #2.
Is this even worth investigating? Theoretically, many conditions are possible, but in practice, I would rarely expect more than two or three. If this sort of nit-picking generates benefits only at some fairly high level of complexity, I can use my time better elsewhere.
June 4, 2012 at 10:37 am
I can't see a difference between queries 1 and 2... Did you paste the right one in
You should also note that because you have a filter on the right-hand table of a left join that filters out nulls, you've essentially converted the join to an inner join. Your query is actually this:
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
INNER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID
WHERE G.GenusAutoID = 35423
That's why 3 is identical in plan and meaning.
If you want a left join, the query needs to be
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
LEFT JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID
WHERE G.GenusAutoID = 35423 OR G.GenusAutoID IS NULL
In which case query 2 would be logically different queries from 3 and return different results.
Simple rewrites like this are not going to result in a different plan (unless your syntax change results in a different query) as by the time the optimiser sees it, the query's been converted into a logical tree.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 4, 2012 at 11:11 am
GilaMonster (6/4/2012)
I can't see a difference between queries 1 and 2... Did you paste the right one in
It looks like it is in the WHERE condition and the alias is G for one and P for the other. To the OP, this will make no difference as the join has already occurred (I believe).
Jared
CE - Microsoft
June 4, 2012 at 11:22 am
Ok, now I see the difference...
That's changed the logical meaning of the query. The first query is a left join, the filter is done on the let-hand table of a left join. The second query, as I posted, is actually an inner join because of the filter on the right-hand table of a left join that eliminates nulls. So, the three queries you posted are actually this logically:
1)
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID
WHERE P.GenusAutoID = 35423
2
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
INNER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID
WHERE G.GenusAutoID = 35423
3
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
INNER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID
WHERE G.GenusAutoID = 35423
If you took your third query and turned it to a left join, it would simplify to this
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID AND G.GenusAutoID = 35423
Which is logically different from all of the others.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 4, 2012 at 12:59 pm
GilaMonster (6/4/2012)
Ok, now I see the difference...
Sorry, I ran out for some food.
That's changed the logical meaning of the query. The first query is a left join, the filter is done on the let-hand table of a left join. The second query, as I posted, is actually an inner join because of the filter on the right-hand table of a left join that eliminates nulls.
So SQL Server will actually use the join it thinks is best, in spite of what I specify? Interesting, I didn't know the optimizer went that far with it.
If you took your third query and turned it to a left join, it would simplify to this
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID AND G.GenusAutoID = 35423
Which is logically different from all of the others.
Would that make more sense, to do it that way? Again, I'm joining around twenty tables, so the syntax gets a little complicated, especially since I'm assembling the query in code. If I can use a left join everywhere, instead of having to code sometimes an inner, sometimes a left, it would make things a bit simpler.
And I'm a little confused here:
LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID AND G.GenusAutoID = 35423
A join should mention both tables in each clause, shouldn't it? What is 'G.GenusAutoID = 35423' doing in the ON clause?
June 4, 2012 at 1:31 pm
For instance, there's no reason to use a JOIN if you use fields from one table. Instead of doing this:
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
LEFT JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID
WHERE P.GenusAutoID = 35423
You could simply do this:
SELECT P.PodrobnostiAutoID
FROM dbo.vwPodrobnosti P
WHERE P.GenusAutoID = 35423
Unless there's something else missing in the requirements that you haven't shared with us.
So SQL Server will actually use the join it thinks is best, in spite of what I specify? Interesting, I didn't know the optimizer went that far with it.
SQL Server won't change the join type, but if you include a column from a table joined with an OUTER JOIN in the WHERE clause, it will work as an INNER JOIN.
A join should mention both tables in each clause, shouldn't it? What is 'G.GenusAutoID = 35423' doing in the ON clause?
It's not necessary to mention both tables in each clause, just the table you're joining. This is useful on outer joins.
June 4, 2012 at 1:43 pm
pdanes (6/4/2012)
GilaMonster (6/4/2012)
Ok, now I see the difference...Sorry, I ran out for some food.
That's changed the logical meaning of the query. The first query is a left join, the filter is done on the let-hand table of a left join. The second query, as I posted, is actually an inner join because of the filter on the right-hand table of a left join that eliminates nulls.
So SQL Server will actually use the join it thinks is best, in spite of what I specify? Interesting, I didn't know the optimizer went that far with it.
No, your query form (the predicate in the where clause) changed the join to an inner join, so SQL did exactly what you told it.
Again, I'm joining around twenty tables, so the syntax gets a little complicated, especially since I'm assembling the query in code. If I can use a left join everywhere, instead of having to code sometimes an inner, sometimes a left, it would make things a bit simpler.
You use whatever join gives you the results you need. If you need only rows that match to the second table, use inner join. If you also need the rows that don't have a match, you use left join. It's about the logic of the query.
And I'm a little confused here:
LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID AND G.GenusAutoID = 35423
A join should mention both tables in each clause, shouldn't it? What is 'G.GenusAutoID = 35423' doing in the ON clause?
No, a join can have any predicate in it. It's doing exactly what the filter in the subquery did when you had left join, filtering before joining. Makes no difference with an inner join, changes the query's logic with a left join.
None of this is about performance. It's just about the correct query to get the data that you want.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 4, 2012 at 1:56 pm
Luis Cazares (6/4/2012)
Unless there's something else missing in the requirements that you haven't shared with us.
Yes, there is. As I mentioned in the original post, these queries are extremely simplified for the purposes of this test. I do actually use other fields from these joined tables, and in fact, have displays that concatenate some of these fields and display the results, whether an individual record does or does not have an entry from a particular table. Some of it is taxonomy, for example, and I must display the entire taxonomic tree (Class, Order, Family, Genus...), even though some entries may not be present in a particular record. This why I use Left Joins, because I need the record regardless, and do the filtering on which records I actually retrieve elsewhere. Otherwise, you're right, it would be simpler to just filter on the main table and not join at all. Unfortunately, I can't do that.
So SQL Server will actually use the join it thinks is best, in spite of what I specify? Interesting, I didn't know the optimizer went that far with it.
SQL Server won't change the join type, but if you include a column from a table joined with an OUTER JOIN in the WHERE clause, it will work as an INNER JOIN.
I guess that makes sense - if I have a condition on the table I'm joining, I'm restricting the possible joined members, which is much the same as actually using an Inner Join. It's sometimes tough to wrap my skull around what's actually being specified at each stage in the query.
A join should mention both tables in each clause, shouldn't it? What is 'G.GenusAutoID = 35423' doing in the ON clause?
It's not necessary to mention both tables in each clause, just the table you're joining. This is useful on outer joins.
Well, in that case I still don't understand what the 'G.GenusAutoID = 35423' phrase is doing here. When you say join ON something, you have to say what in each table is to be the link. Here it's a simple condition, testing against a literal. What role does it play in this join clause?
June 4, 2012 at 2:01 pm
pdanes (6/4/2012)
When you say join ON something, you have to say what in each table is to be the link.
No you don't.
This is perfectly legal and valid.
SELECT * FROM t1 INNER JOIN t2 ON 1 = 0
It'll never return data, but it's perfectly legal. See the link I posted.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 4, 2012 at 2:09 pm
GilaMonster (6/4/2012)
See the link I posted.
Sorry, Gail, my display didn't refresh before I posted that reply to Luis. Looking at that now...
June 5, 2012 at 3:41 am
pdanes (6/4/2012)
I actually have two 'conditions' of sorts - one is the join, the other is the filter. Here is the actual SELECT statement in al its glory:SELECTdbo.fnsFormatAkcesit(A.AkcesitPred, A.Akcesit, A.Rok) AS Akc,
dbo.fnsFormatDrEvid(P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra) AS DrEvid,
dbo.fnsTaxonomieFormatted(tGe.Genus, tSGe.SubGenus, tSp.Species, tSSp.SubSpecies, P.Authors, tF.Family,
tOr.[Order], tGr.[Group]) AS Taxonomie, dbo.fnsStratigrafie(tSy.System, tSe.Series,
tSt.Stage, tSSt.SubStage, tLU.LithographicUnit, tLSU.LithographicSubUnit,
tZ.Zone) AS Stratigrafie, tC.Cesky + N' - ' + tC.Country AS Kountry, CASE WHEN Depozitar IS NULL
THEN '' ELSE Depozitar END + CASE WHEN UlozisteDocasne IS NULL THEN '' ELSE ' {' + UlozisteDocasne + '}' END AS Lokace, P.PodrobnostiAutoID,
P.AkcesAutoID, P.EvidenceGroup, P.GroupAutoID, P.OrderAutoID, P.FamilyAutoID,
P.GenusAutoID, P.SubGenusAutoID, P.SpeciesAutoID, P.SubSpeciesAutoID,
P.SystemAutoID, P.SeriesAutoID, P.StageAutoID, P.SubStageAutoID,
P.LithographicUnitAutoID, P.LithographicSubUnitAutoID, P.ZoneAutoID, P.CountryAutoID,
P.DepozitarAutoID, P.UlozisteDocasne, P.PDAutoID, P.ODAutoID, P.OriginAutoID,
P.PocetKusu, P.OTHER_NO, P.Poznamka, P.Description, P.RockType,
P.Preserv, P.Original, P.Authors, P.Lokalita, P.IDNeDruheEvidence,
P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra, A.Rok, A.AkcesitPred, A.Akcesit,
A.KusuVKatalogu, P.NepublikovatYN, tT.Zkratka, P.TypAutoID, left(CONVERT(varchar(100), tNI.Datum, 21),10) As Datum, tNI.Clovek
FROM dbo.Podrobnosti P INNER JOIN
dbo.Akces A ON P.AkcesAutoID = A.AkcesAutoID LEFT OUTER JOIN
dbo.vwNejnovejsiInventarizace tNI ON P.PodrobnostiAutoID = tNI.PodrobnostiAutoID LEFT OUTER JOIN
dbo.TableOfTyps tT ON P.TypAutoID = tT.TypAutoID LEFT OUTER JOIN
dbo.TableOfSystems tSy ON P.SystemAutoID = tSy.SystemAutoID LEFT OUTER JOIN
dbo.vwTableOfDepozitarsFormatted tDF ON P.DepozitarAutoID = tDF.DepozitarAutoID LEFT OUTER JOIN
dbo.TableOfLithographicSubUnits tLSU ON P.LithographicSubUnitAutoID = tLSU.LithographicSubUnitAutoID LEFT OUTER JOIN
dbo.TableOfCountrys tC ON P.CountryAutoID = tC.CountryAutoID LEFT OUTER JOIN
dbo.TableOfLithographicUnits tLU ON P.LithographicUnitAutoID = tLU.LithographicUnitAutoID LEFT OUTER JOIN
dbo.TableOfSeriess tSe ON P.SeriesAutoID = tSe.SeriesAutoID LEFT OUTER JOIN
dbo.TableOfStages tSt ON P.StageAutoID = tSt.StageAutoID LEFT OUTER JOIN
dbo.TableOfZones tZ ON P.ZoneAutoID = tZ.ZoneAutoID LEFT OUTER JOIN
dbo.TableOfSubStages tSSt ON P.SubStageAutoID = tSSt.SubStageAutoID LEFT OUTER JOIN
dbo.TableOfSubGenuss tSGe ON P.SubGenusAutoID = tSGe.SubGenusAutoID LEFT OUTER JOIN
dbo.TableOfSubSpeciess tSSp ON P.SubSpeciesAutoID = tSSp.SubSpeciesAutoID LEFT OUTER JOIN
dbo.TableOfSpeciess tSp ON P.SpeciesAutoID = tSp.SpeciesAutoID LEFT OUTER JOIN
dbo.TableOfOrders tOr ON P.OrderAutoID = tOr.OrderAutoID LEFT OUTER JOIN
dbo.TableOfGenuss tGe ON P.GenusAutoID = tGe.GenusAutoID LEFT OUTER JOIN
dbo.TableOfGroups tGr ON P.GroupAutoID = tGr.GroupAutoID LEFT OUTER JOIN
dbo.TableOfFamilys tF ON P.FamilyAutoID = tF.FamilyAutoID
WHEREP.GenusAutoID = 11111
ANDP.SpeciesAutoID = 22222The joins ALWAYS lists all the tables, because I need to display fields from these tables for whichever record I end up retrieving. Not always will every table have a corresponding entry in the main table; that's why I'm using a left join. I was experimenting with various ways to phrase those, trying to see if I could improve on the situation, but that seems to be pointless. However, the WHERE clause will vary quite a lot. And here is where I'm not clear - whether I should use 'P.GenusAutoID = 11111' or 'dbo.TableOfGenuss.GenusAutoID = 11111' in the WHERE clause. The query plans vary quite a lot, sometimes even when I simply change the value of the literal, which I don't understand at all why that would happen.
But I can't figure out from the query plans whether one way is provably better than the other - I simply don't understand what I'm looking at well enough, in spite of having read many articles on analyzing query plans. On one hand, it seems that limiting the aux tables might result in less data getting shoveled around unnecessarily. On the other hand, if all the comparisons are made against one (the main) table, ESPECIALLY when I include text fields from the main table which cannot be read from anywhere else, it seems that limiting the scan to one table might be faster than looking at many tables.
Is one way better than the other? A little? A lot? Not at all? Or is SQL Server by any chance smart enough to understand the relationship between the tables and do things its own way, regardless of how I code it? That would be ideal, and I wouldn't even be surprised if that information is in the query plan, for those with the wit to comprehend it. I'll create a few test examples and post the actual plans, if it would help, but they're pretty messy.
June 5, 2012 at 4:26 am
It has nothing whatsoever to do with performance. The meaning of the two queries is different. That's why you're getting different execution plans.
Simplified example:
CREATE TABLE #t1 (
ColourID INT IDENTITY PRIMARY KEY,
Colour varchar(20)
)
CREATE TABLE #t2 (
FruitID INT IDENTITY PRIMARY KEY,
Fruit varchar(20),
ColourID int
)
INSERT INTO #t1 (Colour)
VALUES ('Red'), ('Green'), ('Blue'), ('Yellow')
INSERT INTO #t2 (Fruit, ColourID)
VALUES ('Strawberry', 1), ('Blueberry',3), ('Banana',4)
SELECT * FROM #t1 LEFT OUTER JOIN #t2 ON #t1.ColourID = #t2.ColourID
WHERE #t1.ColourID = 2
SELECT * FROM #t1 LEFT OUTER JOIN #t2 ON #t1.ColourID = #t2.ColourID
WHERE #t2.ColourID = 2
DROP TABLE #t1
DROP TABLE #t2
Use the one that produces the correct output.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 5, 2012 at 4:53 am
GilaMonster (6/5/2012)
It has nothing whatsoever to do with performance. The meaning of the two queries is different. That's why you're getting different execution plans.
But Gail, they both produce EXACTLY the same results - that's why I'm writing this. If one produced incorrect results, I wouldn't be pestering the forum with the question at all. Different queries, different execution plans, identical results (both correct), in the same order, even. Here are the two queries, with the plans attached. the first has the WHERE condition directed to the main table alias (P), the second to the aux table aliases (tGe and tSp).
SELECTdbo.fnsFormatAkcesit(A.AkcesitPred, A.Akcesit, A.Rok) AS Akc,
dbo.fnsFormatDrEvid(P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra) AS DrEvid,
dbo.fnsTaxonomieFormatted(tGe.Genus, tSGe.SubGenus, tSp.Species, tSSp.SubSpecies, P.Authors, tF.Family,
tOr.[Order], tGr.[Group]) AS Taxonomie, dbo.fnsStratigrafie(tSy.System, tSe.Series,
tSt.Stage, tSSt.SubStage, tLU.LithographicUnit, tLSU.LithographicSubUnit,
tZ.Zone) AS Stratigrafie, CASE WHEN Depozitar IS NULL
THEN '' ELSE Depozitar END + CASE WHEN UlozisteDocasne IS NULL THEN '' ELSE ' {' + UlozisteDocasne + '}' END AS Lokace, P.PodrobnostiAutoID,
P.AkcesAutoID, P.EvidenceGroup, P.GroupAutoID, P.OrderAutoID, P.FamilyAutoID,
P.GenusAutoID, P.SubGenusAutoID, P.SpeciesAutoID, P.SubSpeciesAutoID,
P.SystemAutoID, P.SeriesAutoID, P.StageAutoID, P.SubStageAutoID,
P.LithographicUnitAutoID, P.LithographicSubUnitAutoID, P.ZoneAutoID, P.CountryAutoID,
P.DepozitarAutoID, P.UlozisteDocasne, P.PDAutoID, P.ODAutoID, P.OriginAutoID,
P.PocetKusu, P.OTHER_NO, P.Poznamka, P.Description, P.RockType,
P.Preserv, P.Original, P.Authors, P.Lokalita, P.IDNeDruheEvidence,
P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra, A.Rok, A.AkcesitPred, A.Akcesit,
A.KusuVKatalogu, P.NepublikovatYN, tT.Zkratka, P.TypAutoID, left(CONVERT(varchar(100), tNI.Datum, 21),10) As Datum, tNI.Clovek
FROM dbo.Podrobnosti P INNER JOIN
dbo.Akces A ON P.AkcesAutoID = A.AkcesAutoID LEFT OUTER JOIN
dbo.vwNejnovejsiInventarizace tNI ON P.PodrobnostiAutoID = tNI.PodrobnostiAutoID LEFT OUTER JOIN
dbo.TableOfTyps tT ON P.TypAutoID = tT.TypAutoID LEFT OUTER JOIN
dbo.TableOfSystems tSy ON P.SystemAutoID = tSy.SystemAutoID LEFT OUTER JOIN
dbo.vwTableOfDepozitarsFormatted tDF ON P.DepozitarAutoID = tDF.DepozitarAutoID LEFT OUTER JOIN
dbo.TableOfLithographicSubUnits tLSU ON P.LithographicSubUnitAutoID = tLSU.LithographicSubUnitAutoID LEFT OUTER JOIN
dbo.TableOfCountrys tC ON P.CountryAutoID = tC.CountryAutoID LEFT OUTER JOIN
dbo.TableOfLithographicUnits tLU ON P.LithographicUnitAutoID = tLU.LithographicUnitAutoID LEFT OUTER JOIN
dbo.TableOfSeriess tSe ON P.SeriesAutoID = tSe.SeriesAutoID LEFT OUTER JOIN
dbo.TableOfStages tSt ON P.StageAutoID = tSt.StageAutoID LEFT OUTER JOIN
dbo.TableOfZones tZ ON P.ZoneAutoID = tZ.ZoneAutoID LEFT OUTER JOIN
dbo.TableOfSubStages tSSt ON P.SubStageAutoID = tSSt.SubStageAutoID LEFT OUTER JOIN
dbo.TableOfSubGenuss tSGe ON P.SubGenusAutoID = tSGe.SubGenusAutoID LEFT OUTER JOIN
dbo.TableOfSubSpeciess tSSp ON P.SubSpeciesAutoID = tSSp.SubSpeciesAutoID LEFT OUTER JOIN
dbo.TableOfSpeciess tSp ON P.SpeciesAutoID = tSp.SpeciesAutoID LEFT OUTER JOIN
dbo.TableOfOrders tOr ON P.OrderAutoID = tOr.OrderAutoID LEFT OUTER JOIN
dbo.TableOfGenuss tGe ON P.GenusAutoID = tGe.GenusAutoID LEFT OUTER JOIN
dbo.TableOfGroups tGr ON P.GroupAutoID = tGr.GroupAutoID LEFT OUTER JOIN
dbo.TableOfFamilys tF ON P.FamilyAutoID = tF.FamilyAutoID
WHEREP.GenusAutoID = 37263
ANDP.SpeciesAutoID = 51455
SELECTdbo.fnsFormatAkcesit(A.AkcesitPred, A.Akcesit, A.Rok) AS Akc,
dbo.fnsFormatDrEvid(P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra) AS DrEvid,
dbo.fnsTaxonomieFormatted(tGe.Genus, tSGe.SubGenus, tSp.Species, tSSp.SubSpecies, P.Authors, tF.Family,
tOr.[Order], tGr.[Group]) AS Taxonomie, dbo.fnsStratigrafie(tSy.System, tSe.Series,
tSt.Stage, tSSt.SubStage, tLU.LithographicUnit, tLSU.LithographicSubUnit,
tZ.Zone) AS Stratigrafie, CASE WHEN Depozitar IS NULL
THEN '' ELSE Depozitar END + CASE WHEN UlozisteDocasne IS NULL THEN '' ELSE ' {' + UlozisteDocasne + '}' END AS Lokace, P.PodrobnostiAutoID,
P.AkcesAutoID, P.EvidenceGroup, P.GroupAutoID, P.OrderAutoID, P.FamilyAutoID,
P.GenusAutoID, P.SubGenusAutoID, P.SpeciesAutoID, P.SubSpeciesAutoID,
P.SystemAutoID, P.SeriesAutoID, P.StageAutoID, P.SubStageAutoID,
P.LithographicUnitAutoID, P.LithographicSubUnitAutoID, P.ZoneAutoID, P.CountryAutoID,
P.DepozitarAutoID, P.UlozisteDocasne, P.PDAutoID, P.ODAutoID, P.OriginAutoID,
P.PocetKusu, P.OTHER_NO, P.Poznamka, P.Description, P.RockType,
P.Preserv, P.Original, P.Authors, P.Lokalita, P.IDNeDruheEvidence,
P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra, A.Rok, A.AkcesitPred, A.Akcesit,
A.KusuVKatalogu, P.NepublikovatYN, tT.Zkratka, P.TypAutoID, left(CONVERT(varchar(100), tNI.Datum, 21),10) As Datum, tNI.Clovek
FROM dbo.Podrobnosti P INNER JOIN
dbo.Akces A ON P.AkcesAutoID = A.AkcesAutoID LEFT OUTER JOIN
dbo.vwNejnovejsiInventarizace tNI ON P.PodrobnostiAutoID = tNI.PodrobnostiAutoID LEFT OUTER JOIN
dbo.TableOfTyps tT ON P.TypAutoID = tT.TypAutoID LEFT OUTER JOIN
dbo.TableOfSystems tSy ON P.SystemAutoID = tSy.SystemAutoID LEFT OUTER JOIN
dbo.vwTableOfDepozitarsFormatted tDF ON P.DepozitarAutoID = tDF.DepozitarAutoID LEFT OUTER JOIN
dbo.TableOfLithographicSubUnits tLSU ON P.LithographicSubUnitAutoID = tLSU.LithographicSubUnitAutoID LEFT OUTER JOIN
dbo.TableOfCountrys tC ON P.CountryAutoID = tC.CountryAutoID LEFT OUTER JOIN
dbo.TableOfLithographicUnits tLU ON P.LithographicUnitAutoID = tLU.LithographicUnitAutoID LEFT OUTER JOIN
dbo.TableOfSeriess tSe ON P.SeriesAutoID = tSe.SeriesAutoID LEFT OUTER JOIN
dbo.TableOfStages tSt ON P.StageAutoID = tSt.StageAutoID LEFT OUTER JOIN
dbo.TableOfZones tZ ON P.ZoneAutoID = tZ.ZoneAutoID LEFT OUTER JOIN
dbo.TableOfSubStages tSSt ON P.SubStageAutoID = tSSt.SubStageAutoID LEFT OUTER JOIN
dbo.TableOfSubGenuss tSGe ON P.SubGenusAutoID = tSGe.SubGenusAutoID LEFT OUTER JOIN
dbo.TableOfSubSpeciess tSSp ON P.SubSpeciesAutoID = tSSp.SubSpeciesAutoID LEFT OUTER JOIN
dbo.TableOfSpeciess tSp ON P.SpeciesAutoID = tSp.SpeciesAutoID LEFT OUTER JOIN
dbo.TableOfOrders tOr ON P.OrderAutoID = tOr.OrderAutoID LEFT OUTER JOIN
dbo.TableOfGenuss tGe ON P.GenusAutoID = tGe.GenusAutoID LEFT OUTER JOIN
dbo.TableOfGroups tGr ON P.GroupAutoID = tGr.GroupAutoID LEFT OUTER JOIN
dbo.TableOfFamilys tF ON P.FamilyAutoID = tF.FamilyAutoID
WHEREtGe.GenusAutoID = 37263
ANDtSp.SpeciesAutoID = 51455
June 5, 2012 at 5:18 am
pdanes (6/5/2012)
GilaMonster (6/5/2012)
It has nothing whatsoever to do with performance. The meaning of the two queries is different. That's why you're getting different execution plans.But Gail, they both produce EXACTLY the same results - that's why I'm writing this.
With your particular set of data at this particular point in time. That doesn't change the fact that they are different queries and absolutely can return different data. See the example I gave.
If you filter on the right-hand table of a left join, then you have converted the join to an inner join (as I've said before). If that's what you need (inner joins), that's fine. If it's not, then you either need to filter the left-hand table of do a filter on the join (as in the example I posted yesterday) or filter WHERE <column> = <value> OR <column> is NULL if you need to preserve the left join characteristics.
Again, this has nothing whatsoever to do with performance, you have two different queries there that are perfectly capable of returning different results. The first one is using left joins, the second, because of the filters, is actually going to do inner joins to the TableOfSubGenuss and TableOfSpeciess tables. The fact that they are returning the same data implies that all the rows in Podrobnosti have matching rows in those two tables, so you can just change the left join to inner join, then the two queries will be semantically and logically identical.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 8, 2012 at 3:23 am
GilaMonster (6/5/2012)
Hi Gail, sorry for the delay in responding, I had some fires to put out, and I wanted think about what I needed, and construct an example to help illustrate it.
With your particular set of data at this particular point in time. That doesn't change the fact that they are different queries and absolutely can return different data. See the example I gave.
I did study it and tried it out - naturally, it performed as you say and as I would expect. Your condition is for something that doesn't exist in the main (T2) table, 2, corresponding to 'Green'. When I change it to 3, both results are the same. Both these behaviors mirror what I get in my real data. However, here you're joining AND selecting on the same criteria. I want to ALWAYS join, and select based on user input.
If you filter on the right-hand table of a left join, then you have converted the join to an inner join (as I've said before). If that's what you need (inner joins), that's fine.
I need a left join when I do not specify a condition for that table, since I want any other data that may be in that table. If I do specify a condition, I want the equivalent of an inner join. That's how I initially coded that app that builds this query, with inner joins on tables where I specified a condition and left joins on tables that do not have a condition. That's how this whole thread got started. That variation of JOIN syntax turned out to be pointless, as you mention - it winds up being the equivalent of an INNER JOIN.
Here is some sample code that may help illustrate what I'm doing:
CREATE TABLE #m1 (FruitID INT IDENTITY PRIMARY KEY, Fruit varchar(20), ColorID int, ShapeID int, TasteID int, PriceID int, FreshID int)
CREATE TABLE #a1 (ColorID INT IDENTITY PRIMARY KEY, Color varchar(20))
CREATE TABLE #a2 (ShapeID INT IDENTITY PRIMARY KEY, Shape varchar(20))
CREATE TABLE #a3 (TasteID INT IDENTITY PRIMARY KEY, Taste varchar(20))
CREATE TABLE #a4 (PriceID INT IDENTITY PRIMARY KEY, Price varchar(20))
CREATE TABLE #a5 (FreshID INT IDENTITY PRIMARY KEY, Fresh varchar(20))
INSERT INTO #m1 (Fruit, ColorID, ShapeID, TasteID, PriceID, FreshID)
VALUES
('Strawberry', 1, 1, 1, 1, 1),
('Cucumber', 2, 2, 2, 2, 2),
('Blueberry', 3, 3, 3, 3, 3),
('Banana', 4, 4, 4, 4, 4),
('Cayenne', 1, 2, 5, Null, 4),
('Tomato', 1, 3, 3, 4, 5),
('Broccoli', 2, Null, Null, 3, 1),
('Spinach', 2, Null, 1, 4, 3)
INSERT INTO #a1(Color) VALUES ('Red'), ('Green'), ('Blue'), ('Yellow')
INSERT INTO #a2(Shape) VALUES ('Heart'), ('Long'), ('Round'), ('Curved')
INSERT INTO #a3(Taste) VALUES ('Sweet'), ('Sour'), ('Tangy'), ('Creamy'), ('Hot')
INSERT INTO #a4(Price) VALUES ('High'), ('Low'), ('Medium'), ('Cheap')
INSERT INTO #a5(Fresh) VALUES ('Yes'), ('Pickled'), ('Frozen'), ('Stale'), ('Canned')
-- Example 1
SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1
LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID
LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID
LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID
LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID
LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID
WHERE #m1.FreshID = 4
SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1
LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID
LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID
LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID
LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID
LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID
WHERE #a5.FreshID = 4
-- Example 2
SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1
LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID
LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID
LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID
LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID
LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID
WHERE #m1.ColorID = 2
SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1
LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID
LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID
LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID
LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID
LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID
WHERE #a1.ColorID = 2
-- Example 3
SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1
LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID
LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID
LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID
LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID
LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID
WHERE #m1.ShapeID = 2
SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1
LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID
LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID
LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID
LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID
LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID
WHERE #a2.ShapeID = 2
DROP TABLE #m1
DROP TABLE #a1
DROP TABLE #a2
DROP TABLE #a3
DROP TABLE #a4
DROP TABLE #a5
See that I'm always joining ALL my aux tables, but filtering on only certain conditions. When I put in the two variations of the WHERE clause, filtering on either the main table or the appropriate aux table, I get identical results, but different query plans.
Does this make it any clearer, or am I still completely off target?
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply