Sub query expressions vs. Joins.

Question

Sub query expressions vs. Joins.

Viewing 12 posts - 46 through 56 (of 56 total)

You must be logged in to reply to this topic. Login to reply

spaghettidba SSC Guru Points: 105732 More actions · Answer 1

Jeff Moden (7/22/2010)
Maybe that's true in Oracle but it's not true in SQL Server. Oracle is mostly rule based and SQL Server is cost based.

I'm glad you commented this article, Jeff. Great discussion.

A minor note on the Oracle thing: it used to be completely rule-based, but now it's entirely cost-based. Starting from version 10g the rule-based optimizer is no longer available.

-- Gianluca Sartori

Jeff Moden SSC Guru Points: 1004340 More actions · Answer 2

cmapowers (7/25/2010)
This seems to be a correlated sub-query. Does this apply to non correlated sub-query?

All non correlated sub-queries break down to being an inline view whether they're in the form of a derived table in the FROM clause or a CTE reference in the FROM clause) and are typically joined as if a table. So the link to Gail's blog still applies.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004340 More actions · Answer 3

Gianluca Sartori (7/26/2010)
A minor note on the Oracle thing: it used to be completely rule-based, but now it's entirely cost-based. Starting from version 10g the rule-based optimizer is no longer available.

Thanks for the update on Oracle, Gianluca. It HAS been a while since I've had to work with it at such a level.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

spaghettidba SSC Guru Points: 105732 More actions · Answer 4

Jeff Moden (7/26/2010)
Gianluca Sartori (7/26/2010)
A minor note on the Oracle thing: it used to be completely rule-based, but now it's entirely cost-based. Starting from version 10g the rule-based optimizer is no longer available.
Thanks for the update on Oracle, Gianluca. It HAS been a while since I've had to work with it at such a level.

:crying: You don't know how lucky you are...

-- Gianluca Sartori

Jeff Moden SSC Guru Points: 1004340 More actions · Answer 5

Gianluca Sartori (7/26/2010)
Jeff Moden (7/26/2010)
Gianluca Sartori (7/26/2010)
A minor note on the Oracle thing: it used to be completely rule-based, but now it's entirely cost-based. Starting from version 10g the rule-based optimizer is no longer available.
Thanks for the update on Oracle, Gianluca. It HAS been a while since I've had to work with it at such a level.
:crying: You don't know how lucky you are...

Heh... actually, I do. Better yet, one of the things I get to do at my "new" job is migrate some Oracle databases and related code to SQL Server. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

cmapowers SSC Veteran Points: 209 More actions · Answer 6

cmapowers

SSC Veteran

Points: 209

July 26, 2010 at 6:18 pm

#1198502

Thank you.

IG Grasshopper Points: 13 More actions · Answer 7

Why not the following?

SELECT a.EmployeeID,a.Salary a.RActive

b.ManagerName

FROM Employees a (nolock)

,Managers b (nolock)

WHERE b.ManagerID = 1

AND a.ManagerID = b.ManagerID;

spaghettidba SSC Guru Points: 105732 More actions · Answer 8

IG (1/20/2012)
Why not the following?
SELECT a.EmployeeID,a.Salary a.RActive
b.ManagerName
FROM Employees a (nolock)
,Managers b (nolock)
WHERE b.ManagerID = 1
AND a.ManagerID = b.ManagerID;

NOLOCK can return inconsistent data (dirty reads, non repeatable reads and the like), but can also return duplicate data or no data at all for some rows. NOLOCK means that data is not locked at all, even from the query itself, that could end up reading the same row twice in case of a page split.

If you are ok with inconsistent data, then go on and use NOLOCK, just make sure your users are ok with inconsistent data as well.

-- Gianluca Sartori

SQLBoar SSC Veteran Points: 278 More actions · Answer 9

I pity any already confused developer who reads this article. As, despite the title, it isn't about sub-queries. What the author refers to are called Derived Tables, Table Expressions (which is where Common Table Expressions get their name from) or sometimes (almost equally confusingly) Inline Views. Any clues - the execution plan has no SEMI-JOIN in it. Refer to Chapter 6 "Subqueries, Expressions and Ranking Functions" of the IBG SS2008 T-SQL Querying book for reference.

Sorry to be negative, but one of Paul Randall's top myths busted is "sub-queries are always slower". And, this kind of confusing article just adds to the myth.

But, if it was titled Table Expressions vs Joins :exclamation:

This DBA says - "It depends".

Mad Myche SSCommitted Points: 1593 More actions · Answer 10

I can honestly say that I have never seen subqueries used as such, these queries have always been of the inner join variety. And judging by the discussion from months ago, this will be something I remember in the future.

Anyways, I have seen a variation of the "standard" inner join

SELECT EmployeeID, Salary, RActive, ManagerName

FROM Employees

INNER JOIN Managers

ON Manangers.ManagersID = Employees.ManagerID

WHERE Employees.ManagerID = 3

In which the WHERE is eliminated and is replaced with using an AND on the JOIN portion

SELECT EmployeeID, Salary, RActive, ManagerName

FROM Employees

INNER JOIN Managers

ON Manangers.ManagersID = Employees.ManagerID

AND Managers.ManagerID = 3

On larger datasets I have seen better performance with the second version, but on smaller I see no performance advantage.

I did do a test on a Products:Categories relationship on one of the DBs I have mounted, where there most populated category had about 50 items. I used the subset and the 2 variations of the join on both an '08 box and a 2K instance.

The results were fairly equal, with the subset at 34% and the joins at 33% apiece. This was consistent between both server instances

Director of Transmogrification Services

jcb SSCertifiable Points: 6574 More actions · Answer 11

After some years of observation I found if 2 queries are similar the optimizer ill process both in a similar way.

It means its likely there's no difference in performance for most joins x sub-queries scenarios.

And also it's mean

If you found differente query plans it's likely the queries are different and that's the case here.

Put yourself in the compiler place.

You can read the second query as:

fetch manager-1 (a single clustered index seek)

now fetch me all employers managed by manager-1 (40k index seek)

First query reads as:

fetch me all employers managed by manager-1 (40k index seek)--same cost as the other way

and for each fetched employer seek if there's a manager for him and fetch it as well.

Since its not a strict inner join its possible there are rows where the id is null.

The compiler still try (futile) to optimize by using a lazy spool.

I cannot test by now but I guess a FK can improve the subquery.

UKGav Say Hey Kid Points: 678 More actions · Answer 12

I had to use subqueries for 1-to-many tables where I needed to only return only 1 row for each record (Select top 1 date from InnerTable joined to OuterTable order by date desc) as [LastDate], thereby the results become 1-to-1.

This was noticable slow; but I don't believe there is an alternative (*??*) with JOINs without either missing out records from the primary table where the is no join or returning too many rows (LEFT JOIN) with a lot of null values from forgein table.

Might CTE's be the answer instead? I keep meaning to learn about them..

- Gav.