Just a thought: Which statement is faster?

Question

Just a thought: Which statement is faster?

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply

Antares686 SSC Guru Points: 125444 More actions · Answer 1

This will in many cases be the situation. However there can be various factors that can effect the performance. Sometimes you have to try it several ways to get the best option and as the system grows this can even change.

noeld SSC Guru Points: 96590 More actions · Answer 2

1.First and foremost INDEXES DO MATTER!!!

2.Second, the number of rows must not be trivial

3.And Last but not least the HW in some cases plays a major role.

In summary you could forget about HW for this one but point 1 and 2 can not be taken lightly.

HTH

* Noel

rhunt SSCertifiable Points: 5868 More actions · Answer 3

Yes, I agree.

Regarding point 1: I understand that indexes will affect the access plan and execution time... I intentionally left off any indexes because I wanted both to table scan. Essentially, I was trying to illustrate for myself the true value of the boolean return on "EXISTS" versus the set/subquery of "IN". I was primarly surprised that EXISTS took longer during several query runs.

Point 2: I'm not sure what you mean. Are you saying I need a larger table to see the performance return on EXISTS?

Point 3: I have heard that the optimizer will sometimes choose different access plans for same structure databases that reside on different servers. I haven't had much time to play with that though...

noeld SSC Guru Points: 96590 More actions · Answer 4

>>Point 2: I'm not sure what you mean. Are you saying I need a larger table to see the performance return on EXISTS<<

ABSOLUTELY! if the number of rows is small enough so that a table scan is always the only choice probably IN will be better

NOW, Set up an index and populate the table with a decent amount of rows 10k+ for low end HW 100K+ for mid 100mil + for High end

Now compare: join, in and exists

BTW - This numbers change every day with technology improvements -

HTH

* Noel

rhunt SSCertifiable Points: 5868 More actions · Answer 5

OK, I'm seeing a small improvement for "EXISTS" now. FYI, the distinct keyword was used below because too much time was lost to the network in transporting 500,000 records to my laptop.

CREATE TABLE test (ID integer, SomeChar varchar(10), SomeOtherChar varchar(10), SomeInteger integer)

GO

CREATE CLUSTERED INDEX [IX_test_ID_Clustered] ON [dbo].[test]([ID]) WITH FILLFACTOR = 90

GO

DECLARE @intCount integer

set @intCount = 0

WHILE @intCount < 100000

BEGIN

INSERT INTO TEST VALUES (100, 'CharString', 'Whatever' ,3200)

INSERT INTO TEST VALUES (200, 'SomeString', 'Blah', 3600)

INSERT INTO TEST VALUES (300, 'AnotherStr', 'BlahBlah', 3800)

INSERT INTO TEST VALUES (400, 'Whatever', 'character', 4000)

INSERT INTO TEST VALUES (500, 'finstring', 'final', 4200)

set @intCount=@intCount+1

END

SELECT distinct ID FROM test t1 where ID IN ( Select ID FROM test )

--SQL:StmtCompleted 59 49 170 0

-- Cumulative client processing time 508 520.333

-- Cumulative wait time on server replies 27492 7.42505e+006

SELECT distinct ID FROM test t1 where EXISTS ( SELECT * FROM test t2 WHERE t2.ID = t1.ID )

--SQL:StmtCompleted 49 39 170 0

--Cumulative client processing time 559 530

--Cumulative wait time on server replies 48045 5.5808e+006

noeld SSC Guru Points: 96590 More actions · Answer 6

OK,

There is a problem here also in the data used to simulate this. If you use what you posted, the cardinality of the index will be so low that I doubt that that index is even usable I don't have a sql machine now but I am positive that is barely usefull. If you want to simulate this think in terms of Orders and OrderDetails and that's a typical scenario for this cases. Remember knowing about your data is also a valuable tool

HTH

* Noel