May 24, 2006 at 7:45 pm
We have select queries which have to do something like:
select somefield from table_a where Int_field in (1, 2, 3, 4, 5, ..., 1000)
We are selecting on 1000+ IDs out of a table of about 1 million records (of course the numbers are not consecutive like I showed). We do have a clustered index built on that field, but it's still slow (> 2 seconds). Is there any way we can improve this?
Thanks!
May 24, 2006 at 9:55 pm
From your example above, are you building the select statement as dynamic SQL? Is the SQL part of a stored procedure that is recompiling? On average, how many records are returned? How fragmented is the table, and are the table statistics up to date?
These are some of the considerations you would need to look into. Other factors are more server related, such as how much memory is allocated to SQL Server, how many CPUs there are, and what the processor is like.
A possible alternative, to gauge any difference in performance, is to create a table variable and insert the values from the IN () clause into that table variable. Then either use an INNER JOIN or simply use a SELECT statement as part of the IN () clause, e.g.:
declare @tbl table (int_field int)
--code to insert int_field into table variable
select a from b where int_field IN (select int_field from @tbl)
OR
select a from b inner join @tbl t on b.int_field = t.int_field
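If the values only exist as a comma-separated string from the front end, a WHILE loop with CHARINDEX is one way to fill that table variable. A minimal sketch (the @list variable and its literal values are placeholders, not from the original post):
declare @tbl table (int_field int)
declare @list varchar(8000)
declare @pos int
declare @item varchar(20)
set @list = '1,2,3,4,5'   -- comma-separated list from the front end
-- peel one value off the front of the string per pass
while len(@list) > 0
begin
    set @pos = charindex(',', @list)
    if @pos = 0
    begin
        -- last (or only) value left in the string
        set @item = @list
        set @list = ''
    end
    else
    begin
        set @item = left(@list, @pos - 1)
        set @list = substring(@list, @pos + 1, len(@list))
    end
    insert into @tbl (int_field) values (cast(@item as int))
end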
May 26, 2006 at 1:24 pm
If your numbers are in sequence, you can use BETWEEN instead:
select somefield
from table_a
where Int_field between @start and @end
May 26, 2006 at 2:01 pm
Thanks for the input, guys. However, the IDs are passed as a list from the front end; they are not stored in a table, nor are they in sequence. Also, we would like to avoid temp tables if possible. Not sure if there is a better way to get around this problem.
May 26, 2006 at 3:06 pm
If you have no control over the comma-separated list coming from the front end, then your best bet is to transform it into an XML document (easier than it sounds) and join to the XML as if it were a table.
Below is a working example. Replace @TestList with whatever is coming from the front end and @TestData with your 1-million-row table:
Declare @xmldoc int
Declare @TestList as varchar(1000)
Select @TestList = '1,3,5,9'
-- Wrap each value in a <List ID="..."> element
Select @TestList = '<ROOT><List ID="' + Replace(@TestList, ',', '"></List><List ID="') + '"></List></ROOT>'
Select @TestList
-- Parse the XML and get a document handle back in @xmldoc
exec sp_xml_preparedocument @xmldoc OUTPUT, @TestList
-- Stand-in for the 1-million-row table
Declare @TestData Table (IDColumn int)
Insert Into @TestData
Select 1 Union All
Select 2 Union All
Select 3 Union All
Select 4 Union All
Select 5 Union All
Select 6 Union All
Select 7 Union All
Select 8 Union All
Select 9 Union All
Select 10
-- Join against the XML as if it were a table
Select *
From @TestData as t
Inner Join OPENXML (@xmldoc, '/ROOT/List', 1) WITH (ID int) As IDList
On (t.IDColumn = IDList.ID)
-- Release the memory held by the parsed document
exec sp_xml_removedocument @xmldoc
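One caution with this approach: sp_xml_preparedocument keeps the parsed document cached in SQL Server's memory until sp_xml_removedocument is called, so make sure the cleanup call at the end actually runs if this sits inside a frequently executed procedure.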
June 5, 2006 at 11:00 am
I would strongly recommend that you DON'T use a massive IN list like the one you are proposing. I have seen this break a server (stack overflow); admittedly the person who did it generated more than 2,500 elements in the list, but I imagine that some of yours may come close.
It's also horribly inefficient. What you're actually saying is:
WHERE int_field = 1
OR int_field = 2
OR int_field = 3
...
each of which requires a separate index lookup (or worse, a table scan). If you are table-scanning on this column, then you could find yourself in trouble (an index seek wouldn't be so bad, but it's still bad practice!).
Why are you so against temp tables? By putting the data in a table you can do a proper join against other tables, rather than sequential lookups, which is much more efficient.
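For example, something along these lines, reusing the table and column names from the original post (the population step is a placeholder, since the real values come from the front end):
create table #ids (int_field int primary key)
-- populate #ids from the front-end list, e.g. with the
-- string-splitting loop shown earlier in the thread
insert into #ids (int_field) values (1)
insert into #ids (int_field) values (2)
insert into #ids (int_field) values (3)
-- one proper join instead of 1000+ separate lookups
select a.somefield
from table_a as a
inner join #ids as i
    on a.Int_field = i.int_field
drop table #ids
The primary key on #ids keeps the values unique and gives the optimizer something concrete to work with when it builds the join plan.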