July 23, 2014 at 6:22 am
Hi all,
I have written a query in three different ways using IN, JOIN and EXISTS. As per my knowledge, Exists or Join should have performed better than IN, but in attached example all three queries are performing equally.
Is it a case specific behavior or generally all three performs evenly?
Inner queries are returning 8 and 7 records resp. Total record count is 18258
1. IN
SELECT wkcf.value as person_id
FROM
dbo.worker_custom_field wkcf (nolock)
INNER JOIN
dbo.worker wk (nolock)
ON wkcf.worker_id = wk.worker_id
WHERE wk.buyer_code IN
(SELECT child_company_code FROM dbo.company_link cpl (nolock) WHERE cpl.company_code = 'BACC')
AND wkcf.custom_field_id IN
(SELECT custom_field_id FROM dbo.bacc_person_id_custom_field_temp(nolock))
AND wkcf.value != '' GROUP BY wkcf.value HAVING Count (wk.worker_id) > 1
2. JOIN
SELECT wkcf.value as person_id
FROM dbo.worker wk (nolock)
INNER JOIN dbo.worker_custom_field wkcf (nolock) ON wkcf.worker_id = wk.worker_id
inner join dbo.company_link cpl (nolock) on wk.buyer_code = cpl.child_company_code and cpl.company_code = 'BACC'
inner join dbo.bacc_person_id_custom_field_temp bpicf(nolock) on wkcf.custom_field_id = bpicf.custom_field_id
AND wkcf.value != ''
GROUP BY wkcf.value HAVING Count (wk.worker_id) > 1
3. EXISTS
SELECT wkcf.value as person_id
FROM dbo.worker wk (nolock)
INNER JOIN dbo.worker_custom_field wkcf (nolock) ON wkcf.worker_id = wk.worker_id
WHERE exists
(SELECT child_company_code FROM dbo.company_link cpl (nolock) WHERE cpl.company_code = 'BACC'
and wk.buyer_code = cpl.child_company_code)
AND exists
(SELECT custom_field_id FROM dbo.bacc_person_id_custom_field_temp(nolock) where
wkcf.custom_field_id = custom_field_id
)
AND wkcf.value != '' GROUP BY wkcf.value HAVING Count (wk.worker_id) > 1
Regards.
July 23, 2014 at 6:31 am
http://sqlinthewild.co.za/index.php/2010/04/27/in-exists-and-join-a-roundup/[/url]
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 23, 2014 at 8:53 am
Seems like you have some stale statistics on worker_custom_field. Also, why are you using all those NOLOCK hints? Are you aware that you might get incorrect results that include duplicate or missing rows? You could check the following links for more information on this issue:
http://blogs.msdn.com/b/davidlean/archive/2009/04/06/sql-server-nolock-hint-other-poor-ideas.aspx
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/11/10/1280.aspx
http://www.jasonstrate.com/2012/06/the-side-effect-of-nolock/
IN and EXISTS tend to generate a semi-join and that makes them faster than a join. I'm not sure why you are getting normal joins in your plans.
Can you post your test harness? Did you clean the procedure cache before testing?
July 23, 2014 at 9:31 am
If you can afford this (means you're not on production environment)
dbcc dropcleanbuffers
dbcc freeproccache
dbcc freesessioncache
set statistics io on
-- One of your query here
set statistics io off
the you can see whether there is really no difference.
Igor Micev,My blog: www.igormicev.com
July 23, 2014 at 11:48 am
ChrisM@Work (7/23/2014)
http://sqlinthewild.co.za/index.php/2010/04/27/in-exists-and-join-a-roundup/[/url]
And the pertinent quote from there is 'very slightly'. The differences between join and in/exists were very slight.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply