>=x versus IN

Question

>=x versus IN

Ulysses_Infinity

SSCrazy

Points: 2588
More actions
December 9, 2011 at 5:54 am

#249397

Hi,
Please could someone help me understand why query A below is a lot slower than query B
A] select * from table where field >= 10 --- only values >=10 are 10 to 20 inclusive
B] select * from table where field in (10,11,12,13,14,15,16,17,18,19,20)
I am running them on a 116million row table and the execution plans seem to be the same
Thanks,
Jason

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Devendrakumar SSC-Forever Points: 42493 More actions · Answer 1

I assume it's going for Index Scan (if table has any index on that field). Your query requirements can be satisfied with Index Seek.

Try field = 10, it will seek on index. (Just trying to explain... I don’t want to change your business requirements :-)).

Cadavre SSC-Forever Points: 41690 More actions · Answer 2

Here's a 1 million row attempt to reproduce what you're describing.

BEGIN TRAN

SET NOCOUNT ON

IF object_id('tempdb..#testEnvironment') IS NOT NULL

BEGIN

DROP TABLE #testEnvironment

END

--1,000,000 Random rows of data

SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

(ABS(CHECKSUM(NEWID())) % 20) + 1 AS number_1_to_20

INTO #testEnvironment

FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

PRINT '========== BASELINE =========='

SET STATISTICS TIME ON

SELECT COUNT(*) FROM #testEnvironment

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

PRINT '========== >= =========='

SET STATISTICS TIME ON

SELECT COUNT(*) FROM #testEnvironment

WHERE number_1_to_20 >= 10

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

PRINT '========== IN =========='

SET STATISTICS TIME ON

SELECT COUNT(*) FROM #testEnvironment

WHERE number_1_to_20 IN (10,11,12,13,14,15,16,17,18,19,20)

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

ROLLBACK

Which returns

========== BASELINE ==========

SQL Server Execution Times:

CPU time = 78 ms, elapsed time = 68 ms.

================================================================================

========== >= ==========

SQL Server Execution Times:

CPU time = 78 ms, elapsed time = 85 ms.

================================================================================

========== IN ==========

SQL Server Execution Times:

CPU time = 235 ms, elapsed time = 236 ms.

================================================================================

Let's add an index.

BEGIN TRAN

SET NOCOUNT ON

IF object_id('tempdb..#testEnvironment') IS NOT NULL

BEGIN

DROP TABLE #testEnvironment

END

--1,000,000 Random rows of data

SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

(ABS(CHECKSUM(NEWID())) % 20) + 1 AS number_1_to_20

INTO #testEnvironment

FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

CREATE NONCLUSTERED INDEX [test_index]

ON #testEnvironment (number_1_to_20)

PRINT '========== BASELINE =========='

SET STATISTICS TIME ON

SELECT COUNT(*) FROM #testEnvironment

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

PRINT '========== >= =========='

SET STATISTICS TIME ON

SELECT COUNT(*) FROM #testEnvironment

WHERE number_1_to_20 >= 10

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

PRINT '========== IN =========='

SET STATISTICS TIME ON

SELECT COUNT(*) FROM #testEnvironment

WHERE number_1_to_20 IN (10,11,12,13,14,15,16,17,18,19,20)

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

ROLLBACK

========== BASELINE ==========

SQL Server Execution Times:

CPU time = 63 ms, elapsed time = 68 ms.

================================================================================

========== >= ==========

SQL Server Execution Times:

CPU time = 47 ms, elapsed time = 43 ms.

================================================================================

========== IN ==========

SQL Server Execution Times:

CPU time = 47 ms, elapsed time = 43 ms.

================================================================================

Since I obviously can't replicate your observations, I'm going to suggest that it's something on your database that is adversely affecting the queries.

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Ulysses_Infinity (12/9/2011)
A] select * from table where field >= 10 --- only values >=10 are 10 to 20 inclusive
B] select * from table where field in (10,11,12,13,14,15,16,17,18,19,20)

Did you run them both multiple times and ignore the first executions? If not, then the first would have been slower due to reading the data off disk anc caching it.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

ChazMan SSCrazy Points: 2259 More actions · Answer 4

I see that you are measuring your duration, but to see the likely culprit of the difference, you should use:

SET STATISTICS IO ON

The info you'll get will likely show you where the IO issue is, if indeed the difference is IO.

If the durations are substantially different, you'll see a lot more logical reads in one of the queries. But, be aware of GilaMonster's comment on reading from disk vs cache.

If you want to force the disk reads, before each execution you could run:

DBCC DropCleanBuffers

That will force SQL to pull the information from disk since the data buffers will be emptied.

Just some thoughts on troubleshooting.

--------------------------------------------------------------------------
When you realize you've dug yourself into a hole....Step 1...stop digging.