May 24, 2006 at 7:45 pm
We have select queries which have to do something like:
select somefield from table_a where Int_field in (1, 2, 3, 4, 5, ..., 1000)
We are selecting on 1000+ IDs out of a table of about 1 million records (of course the numbers are not consecutive like I showed). We do have a clustered index built on that field, but it's still slow (> 2 seconds). Is there any way we can improve this?
Thanks!
May 24, 2006 at 9:55 pm
From your example above, are you building the select statement as dynamic SQL? Is the SQL part of a stored procedure that is recompiling? On average, how many records are returned? How fragmented is the table, and are the table statistics up to date?
These are some of the considerations you would need to look into. Other factors are more server related, such as how much memory is allocated to SQL Server, how many CPUs there are, and what the processor is like.
A possible alternative, to gauge any difference in performance, is to create a table variable and insert the values from the IN () clause into that table variable. Then either use an INNER JOIN or simply use a SELECT statement as part of the IN () clause, e.g.:
declare @tbl table (int_field int)
--code to insert int_field into table variable
select a from b where int_field IN (select int_field from @tbl)
OR
select a from b inner join @tbl t on b.int_field = t.int_field
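If the values only exist as a comma-separated string from the front end, a WHILE loop with CHARINDEX is one way to fill that table variable. A minimal sketch (the @list variable and its literal values are placeholders, not from the original post):
declare @tbl table (int_field int)
declare @list varchar(8000)
declare @pos int
declare @item varchar(20)
set @list = '1,2,3,4,5'   -- comma-separated list from the front end
-- peel one value off the front of the string per pass
while len(@list) > 0
begin
    set @pos = charindex(',', @list)
    if @pos = 0
    begin
        -- last (or only) value left in the string
        set @item = @list
        set @list = ''
    end
    else
    begin
        set @item = left(@list, @pos - 1)
        set @list = substring(@list, @pos + 1, len(@list))
    end
    insert into @tbl (int_field) values (cast(@item as int))
end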
May 26, 2006 at 1:24 pm
If your numbers are in sequence, you can use BETWEEN instead:
select somefield
from table_a
where Int_field between @start and @end
May 26, 2006 at 2:01 pm
Thanks for the input, guys. However, the IDs are passed as a list from the front end; they are not stored in a table, nor are they in sequence. Also, we would like to avoid temp tables if possible. Not sure if there is a better way to get around this problem.
May 26, 2006 at 3:06 pm
If you have no control over the comma-separated list coming from the front end, then your best bet is to transform it into an XML document (easier than it sounds) and join to the XML as if it were a table.
Below is a working example. Replace @TestList with whatever is coming from the front end and @TestData with your 1-million-row table:
Declare @xmldoc int
Declare @TestList as varchar(1000)
Select @TestList = '1,3,5,9'
-- Wrap each value in a <List ID="..."> element
Select @TestList = '<ROOT><List ID="' + Replace(@TestList, ',', '"></List><List ID="') + '"></List></ROOT>'
Select @TestList
-- Parse the XML and get a document handle back in @xmldoc
exec sp_xml_preparedocument @xmldoc OUTPUT, @TestList
-- Stand-in for the 1-million-row table
Declare @TestData Table (IDColumn int)
Insert Into @TestData
Select 1 Union All
Select 2 Union All
Select 3 Union All
Select 4 Union All
Select 5 Union All
Select 6 Union All
Select 7 Union All
Select 8 Union All
Select 9 Union All
Select 10
-- Join against the XML as if it were a table
Select *
From @TestData as t
Inner Join OPENXML (@xmldoc, '/ROOT/List', 1) WITH (ID int) As IDList
On (t.IDColumn = IDList.ID)
-- Release the memory held by the parsed document
exec sp_xml_removedocument @xmldoc
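One caution with this approach: sp_xml_preparedocument keeps the parsed document cached in SQL Server's memory until sp_xml_removedocument is called, so make sure the cleanup call at the end actually runs if this sits inside a frequently executed procedure.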
June 5, 2006 at 11:00 am
I would strongly recommend that you DON'T use a massive IN list like the one you are proposing. I have seen this break a server (stack overflow); admittedly the person who did it generated more than 2,500 elements in the list, but I imagine that some of yours may come close.
It's also horribly inefficient. What you're actually saying is:
WHERE int_field = 1
OR int_field = 2
OR int_field = 3
...
each of which requires a separate index lookup (or worse, a table scan). If you are table-scanning on this column, then you could find yourself in trouble (an index seek wouldn't be so bad, but it's still bad practice!).
Why are you so against temp tables? By putting the data in a table you can do a proper join against other tables, rather than sequential lookups, which is much more efficient.
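For example, something along these lines, reusing the table and column names from the original post (the population step is a placeholder, since the real values come from the front end):
create table #ids (int_field int primary key)
-- populate #ids from the front-end list, e.g. with the
-- string-splitting loop shown earlier in the thread
insert into #ids (int_field) values (1)
insert into #ids (int_field) values (2)
insert into #ids (int_field) values (3)
-- one proper join instead of 1000+ separate lookups
select a.somefield
from table_a as a
inner join #ids as i
    on a.Int_field = i.int_field
drop table #ids
The primary key on #ids keeps the values unique and gives the optimizer something concrete to work with when it builds the join plan.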