Performance where <> in <> inner join

Question

Post reply

Performance where <> in <> inner join

MDave

SSC Eights!

Points: 953
More actions
December 9, 2010 at 4:30 am

#97610

Hello all,
i have a query like this which is fast.
select * FROM Table a
where
a.Country = 151
and a.Date = 20101126
But when i use an in statement or an inner join to a small view which shows only one row including these 2 columns the query takes over 3 times longer.
Any idea why or how i can improve this?
------- IN Statement
select * FROM Table a
where
a.Country in
(select Country from V_LOAD_COMBINATION and
a.Date in
(select Date from V_LOAD_COMBINATION )
------- INNER JOIN
select * FROM Table a
inner join V_LOAD_COMBINATION b
on
a.Country= b.Country and
a.Date = b.Date

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Lowell SSC Guru Points: 323518 More actions · Answer 1

i believe the IN statements can bail out as soon as a single value is found in V_LOAD_COMBINATION ;

when you join the tables, the join would perform better if there was an index on (Country,Date),

otherwise it uses table scan most likely to dig up the values.

add an index, and actually look at the execution plan before and after, i think you'll get a better idea of what's going on.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

VSSGeorge SSCrazy Eights Points: 8151 More actions · Answer 2

Hi

Go for the Inner Join instead of IN clause, certainly the performance will benefit.

Bharat Panthee Default port Points: 1448 More actions · Answer 3

You can use where exist i think that will be faster

select * from Table a where exists(select 1 from V_LOAD_COMBINATION where V_LOAD_COMBINATION.Date = a.Date and V_LOAD_COMBINATION.Country = a.Country )

harinerella Mr or Mrs. 500 Points: 548 More actions · Answer 4

select * FROM Table a

where

a.Country in

(select Country from V_LOAD_COMBINATION and

a.Date in

(select Date from V_LOAD_COMBINATION )

------- INNER JOIN

select * FROM Table a

inner join V_LOAD_COMBINATION b

on

a.Country= b.Country and

a.Date = b.Date

If you closely see both queries will return different results. Please check your need.

As far as performance is concerned its better to use inner join with an index on country, and date columns of table a. If the related columns were also part an index in the view will add more performance

Latheesh NK SSCertifiable Points: 6559 More actions · Answer 5

As per my experience , the answer is depends. Please do all the options like Exists, in and join and find out the best plan suits your query.

Abhijit More SSCertifiable Points: 6897 More actions · Answer 6

To answer your question...

JOINS always perform better than IN clause. Why ? IN clause is basically performs sub query funcitonality, for each row in outer query it execute the innser query. This makes the query to perform badly. Where as in JOINs it retrives only those matching recrods based on the argument.

Here in your example the simple query with equal to argument search for exact match. In clause query is qorking as explain above. and the 3rd query using view and table JOIN is retriving the records based on matching criteria. i.e. JOIN. Also view is just and representation of Query, when you use View internally it executes the query inside the view except in case of Indexed Views.

I hope this makes clear to your question. Let us know if need more clarification...

Abhijit - http://abhijitmore.wordpress.com

grahamc SSCertifiable Points: 6762 More actions · Answer 7

Could depend on the view statement as well... You its small, but is the SQL complex??

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

Abhijit More (12/15/2010)
To answer your question...
JOINS always perform better than IN clause. Why ? IN clause is basically performs sub query funcitonality, for each row in outer query it execute the innser query. This makes the query to perform badly. Where as in JOINs it retrives only those matching recrods based on the argument.
Here in your example the simple query with equal to argument search for exact match. In clause query is qorking as explain above. and the 3rd query using view and table JOIN is retriving the records based on matching criteria. i.e. JOIN. Also view is just and representation of Query, when you use View internally it executes the query inside the view except in case of Indexed Views.
I hope this makes clear to your question. Let us know if need more clarification...

Careful...here's a couple of sample queries modified from a similar thread:

DROP TABLE #po

DROP TABLE #ag

CREATE TABLE #ag (agreementid INT IDENTITY (1,1), agreementnumber CHAR(3))

CREATE TABLE #po (poid INT, ponumber VARCHAR(5), agreementid INT, potype CHAR(2), usagedate DATETIME)

insert into #po(poid, ponumber,agreementid,potype,usagedate) values(1, 'po1',2,'NE',dateadd(mm,-5,getdate()))

insert into #po(poid, ponumber,agreementid,potype,usagedate) values(2, 'po2',2,'NE',dateadd(mm,-5,getdate()))

insert into #po(poid, ponumber,agreementid,potype,usagedate)values(3, 'zupo1',2,'ZU',dateadd(mm,-5,getdate()))

insert into #po(poid, ponumber,agreementid,potype,usagedate) values(4, 'zu3',1,'ZU',dateadd(mm,-5,getdate()))

insert into #po(poid, ponumber,agreementid,potype,usagedate) values(5, 'zu4',1,'ZU',dateadd(mm,-5,getdate()))

insert into #po(poid, ponumber,agreementid,potype,usagedate)values(NULL,'zu5',1,'ZU',dateadd(mm,-5,getdate()))

insert into #ag(agreementnumber) values('Ag1')

insert into #ag(agreementnumber) values('Ag2')

insert into #ag(agreementnumber) values('Ag3')

insert into #ag(agreementnumber) values('Ag4')

insert into #ag(agreementnumber) values('Ag5')

select po.agreementid, po.ponumber

from #po po

INNER join #ag ag on po.agreementid = ag.agreementid

select po.agreementid, po.ponumber

from #po po

WHERE po.agreementid IN (SELECT agreementid FROM #ag)

Run the two queries together in a batch using SQL2K8 and query 1 with the INNER JOIN has a cost of 78% (relative to the batch of the two queries). Now look at the plans - the operator in query 1 is a hash match IJ as you'd expect, costing 73%. In query 2 it's a nested loops left semi join, costing 1%.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

VSSGeorge SSCrazy Eights Points: 8151 More actions · Answer 9

Hi

This link may be useful...

http://sqlinthewild.co.za/index.php/2010/01/12/in-vs-inner-join/

Arjun Sivadasan SSCertifiable Points: 5902 More actions · Answer 10

Lowell (12/9/2010)
add an index, and actually look at the execution plan before and after, i think you'll get a better idea of what's going on.

David, as Lowell has mentioned, add a composite non-clustered index on country, date.

This alone, will not speed up the query. Do you really need to 'select *'? Even though a NC index (covering index) is present, as you are selecting all columns, sql server will do lookups to read the rest of the columns. The lookups will be done one row at a time. And the lookups will be done only if the total number of rows in the result set is less than say 2% of the rows in the table. Otherwise, sql server goes for a table scan as it will be more efficient.

In short, add a composite index. Select only the columns you need. Check the execution plan to ascertain if table scan or index scan happens. Consider adding included columns.

- arjun

https://sqlroadie.com/