December 9, 2010 at 4:30 am
Hello all,
i have a query like this which is fast.
select * FROM Table a
where
a.Country = 151
and a.Date = 20101126
But when i use an in statement or an inner join to a small view which shows only one row including these 2 columns the query takes over 3 times longer.
Any idea why or how i can improve this?
------- IN Statement
select * FROM Table a
where
a.Country in
(select Country from V_LOAD_COMBINATION and
a.Date in
(select Date from V_LOAD_COMBINATION )
------- INNER JOIN
select * FROM Table a
inner join V_LOAD_COMBINATION b
on
a.Country= b.Country and
a.Date = b.Date
December 9, 2010 at 6:31 am
i believe the IN statements can bail out as soon as a single value is found in V_LOAD_COMBINATION ;
when you join the tables, the join would perform better if there was an index on (Country,Date),
otherwise it uses table scan most likely to dig up the values.
add an index, and actually look at the execution plan before and after, i think you'll get a better idea of what's going on.
Lowell
December 9, 2010 at 11:52 pm
Hi
Go for the Inner Join instead of IN clause, certainly the performance will benefit.
December 10, 2010 at 1:59 am
You can use where exist i think that will be faster
select * from Table a where exists(select 1 from V_LOAD_COMBINATION where V_LOAD_COMBINATION.Date = a.Date and V_LOAD_COMBINATION.Country = a.Country )
December 15, 2010 at 5:01 am
select * FROM Table a
where
a.Country in
(select Country from V_LOAD_COMBINATION and
a.Date in
(select Date from V_LOAD_COMBINATION )
------- INNER JOIN
select * FROM Table a
inner join V_LOAD_COMBINATION b
on
a.Country= b.Country and
a.Date = b.Date
If you closely see both queries will return different results. Please check your need.
As far as performance is concerned its better to use inner join with an index on country, and date columns of table a. If the related columns were also part an index in the view will add more performance
December 15, 2010 at 6:31 am
As per my experience , the answer is depends. Please do all the options like Exists, in and join and find out the best plan suits your query.
December 15, 2010 at 6:42 am
To answer your question...
JOINS always perform better than IN clause. Why ? IN clause is basically performs sub query funcitonality, for each row in outer query it execute the innser query. This makes the query to perform badly. Where as in JOINs it retrives only those matching recrods based on the argument.
Here in your example the simple query with equal to argument search for exact match. In clause query is qorking as explain above. and the 3rd query using view and table JOIN is retriving the records based on matching criteria. i.e. JOIN. Also view is just and representation of Query, when you use View internally it executes the query inside the view except in case of Indexed Views.
I hope this makes clear to your question. Let us know if need more clarification...
Abhijit - http://abhijitmore.wordpress.com
December 15, 2010 at 6:49 am
Could depend on the view statement as well... You its small, but is the SQL complex??
December 15, 2010 at 6:59 am
Abhijit More (12/15/2010)
To answer your question...JOINS always perform better than IN clause. Why ? IN clause is basically performs sub query funcitonality, for each row in outer query it execute the innser query. This makes the query to perform badly. Where as in JOINs it retrives only those matching recrods based on the argument.
Here in your example the simple query with equal to argument search for exact match. In clause query is qorking as explain above. and the 3rd query using view and table JOIN is retriving the records based on matching criteria. i.e. JOIN. Also view is just and representation of Query, when you use View internally it executes the query inside the view except in case of Indexed Views.
I hope this makes clear to your question. Let us know if need more clarification...
Careful...here's a couple of sample queries modified from a similar thread:
DROP TABLE #po
DROP TABLE #ag
CREATE TABLE #ag (agreementid INT IDENTITY (1,1), agreementnumber CHAR(3))
CREATE TABLE #po (poid INT, ponumber VARCHAR(5), agreementid INT, potype CHAR(2), usagedate DATETIME)
insert into #po(poid, ponumber,agreementid,potype,usagedate) values(1, 'po1',2,'NE',dateadd(mm,-5,getdate()))
insert into #po(poid, ponumber,agreementid,potype,usagedate) values(2, 'po2',2,'NE',dateadd(mm,-5,getdate()))
insert into #po(poid, ponumber,agreementid,potype,usagedate)values(3, 'zupo1',2,'ZU',dateadd(mm,-5,getdate()))
insert into #po(poid, ponumber,agreementid,potype,usagedate) values(4, 'zu3',1,'ZU',dateadd(mm,-5,getdate()))
insert into #po(poid, ponumber,agreementid,potype,usagedate) values(5, 'zu4',1,'ZU',dateadd(mm,-5,getdate()))
insert into #po(poid, ponumber,agreementid,potype,usagedate)values(NULL,'zu5',1,'ZU',dateadd(mm,-5,getdate()))
insert into #ag(agreementnumber) values('Ag1')
insert into #ag(agreementnumber) values('Ag2')
insert into #ag(agreementnumber) values('Ag3')
insert into #ag(agreementnumber) values('Ag4')
insert into #ag(agreementnumber) values('Ag5')
select po.agreementid, po.ponumber
from #po po
INNER join #ag ag on po.agreementid = ag.agreementid
select po.agreementid, po.ponumber
from #po po
WHERE po.agreementid IN (SELECT agreementid FROM #ag)
Run the two queries together in a batch using SQL2K8 and query 1 with the INNER JOIN has a cost of 78% (relative to the batch of the two queries). Now look at the plans - the operator in query 1 is a hash match IJ as you'd expect, costing 73%. In query 2 it's a nested loops left semi join, costing 1%.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 15, 2010 at 11:35 pm
December 16, 2010 at 11:57 pm
Lowell (12/9/2010)
add an index, and actually look at the execution plan before and after, i think you'll get a better idea of what's going on.
David, as Lowell has mentioned, add a composite non-clustered index on country, date.
This alone, will not speed up the query. Do you really need to 'select *'? Even though a NC index (covering index) is present, as you are selecting all columns, sql server will do lookups to read the rest of the columns. The lookups will be done one row at a time. And the lookups will be done only if the total number of rows in the result set is less than say 2% of the rows in the table. Otherwise, sql server goes for a table scan as it will be more efficient.
In short, add a composite index. Select only the columns you need. Check the execution plan to ascertain if table scan or index scan happens. Consider adding included columns.
- arjun
https://sqlroadie.com/
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply