October 25, 2012 at 12:38 pm
I'm writing a script that will probably join a bit more then 20 tables.
It's my understanding that joins are costly.
Would it cost less to write certain parts like my sample2 below?
I'm only returning that 1 field from Table2.
All ID fields are unique. I read about this a while back but can't remember anything about it.
Sample1
select t2.MyName PersonName
, t1.*
from Table1 t1
inner join Table2 t2
on t1.SalesPersonID = t2.ID
where t1.ID= 12345
Sample2
select (select t2.MyName from Table2 t2 where t2.ID = t1.SalesPersonID) PersonName
, t1.*
from Table1 t1
where frqr.tril_gid = 12345
Thanks a bunch.
---------------------------------------------------------------
Mike Hahn - MCSomething someday:-)
Right way to ask for help!!
http://www.sqlservercentral.com/articles/Best+Practices/61537/
I post so I can see my avatar :hehe:
I want a personal webpage 😎
I want to win the lotto 😀
I want a gf like Tiffa :w00t: Oh wait I'm married!:-D
October 25, 2012 at 12:42 pm
My $.02.
If your tables are modeled and indexed correctly then joins are a necessary evil.
Your sample 2 uses a correlated subquery, and I believe this is more costly than a join.
October 25, 2012 at 12:44 pm
Want a cool sig (10/25/2012)
I'm writing a script that will probably join a bit more then 20 tables.It's my understanding that joins are costly.
Would it cost less to write certain parts like my sample2 below?
I'm only returning that 1 field from Table2.
All ID fields are unique. I read about this a while back but can't remember anything about it.
Sample1
select t2.MyName PersonName
, t1.*
from Table1 t1
inner join Table2 t2
on t1.SalesPersonID = t2.ID
where t1.ID= 12345
Sample2
select (select t2.MyName from Table2 t2 where t2.ID = t1.SalesPersonID) PersonName
, t1.*
from Table1 t1
where frqr.tril_gid = 12345
Thanks a bunch.
its worth testing but in my experience joins are faster than correlated sub queries (however not all correlated sub queries can be rewritten as joins). reason being the sub query has to be run for every row instead of being able to use a join to get the same results in one pass. in our production code we try to eliminate sub queries because of this added cost. if we can join we join. if its a big long complicated query we test both ways in our dev environment and use which ever is faster. but the key there is we test test and test again.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
October 25, 2012 at 12:54 pm
Thanks a bunch. Correlated sub queries. Forgot what it was called. Forgot what was said about it. Been forgetting a lot of things. Must be getting old. :doze:
The DB is not designed properly. The app is OO VB6 that allowed the users to create DB objects (table/column) etc from the app interface (pretty much any user can add an object--column/table).
Basically forcing an OO model on a Relational DB. It's a nightmare. 200+ columns when only 25 related objects exists. Not to mention that the same data can potentially be written across 30+ different tables. Makes reporting fun. 🙂
---------------------------------------------------------------
Mike Hahn - MCSomething someday:-)
Right way to ask for help!!
http://www.sqlservercentral.com/articles/Best+Practices/61537/
I post so I can see my avatar :hehe:
I want a personal webpage 😎
I want to win the lotto 😀
I want a gf like Tiffa :w00t: Oh wait I'm married!:-D
October 25, 2012 at 1:53 pm
Correlated subqueries do not run per row of the outer query (unless triangular joins or top 1 ... order by).
Correlated subqueries are not usually more expensive than joins, they usually get compiled to exactly the same query plan.
Joins are not expensive operations to be avoided
Unless joins have been tested and are unacceptably slow, a more complex alternative should not be considered.
Write queries the simplest way, test performance, if unacceptable then consider complex rewrites
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 25, 2012 at 4:43 pm
GilaMonster (10/25/2012)
Correlated subqueries do not run per row of the outer query (unless triangular joins or top 1 ... order by).Correlated subqueries are not usually more expensive than joins, they usually get compiled to exactly the same query plan.
Joins are not expensive operations to be avoided
Unless joins have been tested and are unacceptably slow, a more complex alternative should not be considered.
Write queries the simplest way, test performance, if unacceptable then consider complex rewrites
looks like im off to hit the books on correlated subqueries now. then go yell at the guy i heard it from.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
October 26, 2012 at 3:24 am
looks like im off to hit the books on correlated subqueries now. then go yell at the guy i heard it from.
+1 to that...
Could've swore I've read in the official MCTS exam book that correlated subqueries are performed on a row by row basis if used in the SELECT, and perform exactly like a join if used in the EXISTS part of a WHERE clause.
Update: Iv actually looked it up and tested it, the book wrongly says:
A correlated subquery is typically equal in performance compared to using a JOIN when used in combination with the EXISTS operator to filter the outer query... While this type of correlated subquery is typically a good implementation, the use of correlated subqueries in the SELECT clause often has a negative effect on performance compared to JOINs. Of course, this depends on the number of rows returned by the outer query. If a large number of rows are returned, each query in the SELECT clause would be executed for each row
Damn, can't even trust books anymore. Thanks for clearing things up Gail
-----------------
... Then again, I could be totally wrong! Check the answer.
Check out posting guidelines here for faster more precise answers[/url].
I believe in Codd
... and Thinknook is my Chamber of Understanding
October 26, 2012 at 5:03 am
That's doubly wrong, because EXISTS with correlated subqueries does not perform equally with joins, it's usually very, very slightly better (and actually they're not even equivalent query forms)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 26, 2012 at 5:15 am
GilaMonster (10/26/2012)
That's doubly wrong, because EXISTS with correlated subqueries does not perform equally with joins, it's usually very, very slightly better (and actually they're not even equivalent query forms)
Agree with Gail on the fact that EXISTS with correlated subqueries is not equivalent to JOIN's. However in terms of performance - it depends. From my experience I wouldn't say that usually one is better than another. I've seen all three permutations of EXISTS vs OUTER JOIN's .
I do usually test both where appropriate.
October 26, 2012 at 9:01 am
GilaMonster (10/26/2012)
That's doubly wrong, because EXISTS with correlated subqueries does not perform equally with joins, it's usually very, very slightly better (and actually they're not even equivalent query forms)
Good learning for me.. glad i spent one hour after my hectic office hours at Home ..thanks gail :-). can you please put some light on its explanation
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
October 26, 2012 at 9:36 am
GilaMonster (10/25/2012)
Correlated subqueries do not run per row of the outer query (unless triangular joins or top 1 ... order by).
One more case when optimizer is not able to decorrelate subquery is an assert impact.
use tempdb;
go
create table t1 (a int);
create table t2 (a int);
go
insert t1 values (1),(2),(3),(4),(5);
insert t2 values (1),(2),(3),(4),(5);
go
set statistics profile on
go
select
a,
other_a = (select a from t2 where t1.a = t2.a)
from
t1
;
go
set statistics profile off
go
drop table t1,t2;
51select a, other_a = (select a from t2 where t1.a = t2.a) from t1 ;
00 |--Compute Scalar(DEFINE:([Expr1009]=[Expr1011]))
51 |--Nested Loops(Left Outer Join, OUTER REFERENCES:([tempdb].[dbo].[t1].[a]))
51 |--Table Scan(OBJECT:([tempdb].[dbo].[t1]))
55 |--Assert(WHERE:(CASE WHEN [Expr1010]>(1) THEN (0) ELSE NULL END))
55 |--Stream Aggregate(DEFINE:([Expr1010]=Count(*), [Expr1011]=ANY([tempdb].[dbo].[t2].[a])))
55 |--Table Scan(OBJECT:([tempdb].[dbo].[t2]), WHERE:([tempdb].[dbo].[t1].[a]=[tempdb].[dbo].[t2].[a]))
upd. fixed formatting.
October 26, 2012 at 9:40 am
It still showing sad smiles after fixing and putting plan text into |code| tag instead of |quote|.
Yes, that is sad query =))
October 26, 2012 at 10:15 am
One more case when optimizer is not able to decorrelate subquery is an assert impact.
Interesting! so what is a scenario when the optimizer compiles both a correlated subquery query and a join query to the same query plan?, Am trying to come up with one but not managing, even if the query plan looks "almost" the same, SQL is using different join operators (for example Nested Loop for correlated subquery vs Hash Match for join).
/>L
-----------------
... Then again, I could be totally wrong! Check the answer.
Check out posting guidelines here for faster more precise answers[/url].
I believe in Codd
... and Thinknook is my Chamber of Understanding
October 26, 2012 at 10:40 am
SELECT sod.SalesOrderID ,
SalesOrderDetailID ,
AccountNumber = (SELECT AccountNumber FROM Sales.SalesOrderHeader soh WHERE soh.SalesOrderID = sod.SalesOrderID)
FROM Sales.SalesOrderDetail AS sod
SELECT sod.SalesOrderID ,
SalesOrderDetailID ,
AccountNumber
FROM Sales.SalesOrderDetail AS sod LEFT OUTER JOIN Sales.SalesOrderHeader soh ON soh.SalesOrderID = sod.SalesOrderID
Near-identical plans (an extra compute scalar in the first)
Or take SomewhereSomehow's example and add a unique constraint (or primary key) to the second table. If SQL knows the column is unique, it doesn't need an assert and hence doesn't need the nested loop join but can use other joins as it likes.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply