March 18, 2004 at 6:33 am
Hi forum,
quite often we use stored procedures to select either subset of all rows based on some value or all values in the table. For example, lets say, we have table address with column state (indexed, of course, often clustered). We want to be able to select addresses either for a particular state or all addresses. So we want stored procedure not unlike this one:
create procedure getAddressByState@state char(3) = nullasselect * from address wherestate = @state or @state is null
What I'm wondering about is how to optimise WHERE clause. For example, I can write:
create procedure getAddressByState@state char(3) = nullasselect * from address whereisnull(@state, state) = state
or even I can write:
create procedure getAddressByState@state char(3) = nullasif @state is null select * from addresselse select * from address where state = @state
I wonder which one will perform better.
Any thoughts are appreciated
George
PS. For pedants:
--
George
March 18, 2004 at 9:26 am
I would go with your third option above. The first option will needlessly scan through your address table looking for rows where STATE = null when all you really want is all the rows in the table.
The second option will be inefficient as well, because even if @state is not null, by wrapping your seach column (state) in a function (in this case the isnull() function), you disallow the optimizer from using any index that might exist on the state column, so you get a table scan. (You should always avoid using a function on the left side of your WHERE clauses if you want to make use of any available index on a column. Also avoid doing mathematical operations on the left side for the same reason.)
Option 3 looks like the most efficient way to go, IMHO.
March 18, 2004 at 4:26 pm
Frank,
I think SQL Server optimiser deserves a little more credit. In the sample below I use real table from my database. Table contains about 80k rows and has a clustered index on renled_parent_id. The output with SET SHOWPLAN_ALL ON
declare @tenant int set @tenant = nullselect * from rent_ledger where renled_parent_id = @tenant or @tenant is null|--Bookmark Lookup(BOOKMARK: ([Bmk1000]), OBJECT: ([primelink].[dbo].[rent_ledger]))
|--Index Scan(OBJECT: ([primelink].[dbo].[rent_ledger].[constr_pk_318]), WHERE: ([rent_ledger].[renled_parent_id]=[@tenant] OR [@tenant]=NULL))
select * from rent_ledger where isnull(@tenant,renled_parent_id) = renled_parent_id|--Clustered Index Scan(OBJECT: ([primelink].[dbo].[rent_ledger].[index_319]), WHERE: (isnull([@tenant], [rent_ledger].[renled_parent_id])=[rent_ledger].[renled_parent_id]))
if @tenant is nullselect * from rent_ledger
|--Clustered Index Scan(OBJECT: ([primelink].[dbo].[rent_ledger].[index_319]))
else
select * from rent_ledger where renled_parent_id = @tenant|--Clustered Index Seek(OBJECT: ([primelink].[dbo].[rent_ledger].[index_319]), SEEK: ([rent_ledger].[renled_parent_id]=[@tenant]) ORDERED FORWARD)
As you can see, there is no table scan in sight and optimiser can still use index because isnull is sargable predicate, AFAIK. Cost-wise the "else" part of the third query is the fastest with clustered index seek but I have to weight it against frequency of requests with parameter set to null.
Cheers
--
George
March 19, 2004 at 7:23 am
^^^ Interesting. I've read several articles that stated that using functions around search arguments prevents the optimizer from using indexes. (Including an article or two referenced on this site!) Apparently that is not such a clear cut rule as some would make it out to be. Thanks for clearing up the misconception.
March 23, 2004 at 4:14 am
georged,
Be careful! Clustered Index Scan meens in fact table scan!
Try it out by removing the clustered index and creating a nonclustered on instead. You can see the difference.
Also check why with your first query the optimiser is using the constr_pk_318 index while with the second one the index named index_319.
An another proof is your third query where you have a select without where clause. Obviously there will be no index usage. Never the less you can see a clustered index scan, which in fact is a table scan. This is because with a clustered index the data itself is the index.
Bye
Gabor
March 23, 2004 at 4:45 am
To ensure you get all data in the case of a null value for the variable you need to use either the 2nd or 3rd. Now I have seen situations where ISNULL(@var,value) = value does and doesn't impact speed it can make a difference as to if the value column is the clustered inex or not. Personally I always opt for the following.
create procedure getAddressByState;1@state char(3) = nullas set nocount on if @state is null Exec getAddressByState;2else Exec getAddressByState;3 @state go
create procedure getAddressByState;2as set nocount on select (columnlist) from address go
create procedure getAddressByState;3@state char(3)as set nocount onselect (columnlist) from address where state = @state go
This way I get a seperate stored execution plan for each condition on first run that is optimized for it.
March 23, 2004 at 12:39 pm
Gabor, good point. I have no clue as to why first query uses constr_pk_318. Just to clarify, index_319 is a clustered index on renled_parent_id column (int) and constr_pk_318 is a primary key on renlen_id column which is also an int. Wonder if substituting real column list instead of * will make any difference to the plan (though most of the table columns are selected)
Antares686, thanks, really like your approach - I'll give it some thoughts. Forgot to mention that column is not null so first query will still work .
--
George
March 24, 2004 at 1:37 am
In this case get it pragmatically.
select * from rent_ledger
where renled_parent_id = @tenantselect * from rent_ledger
Bye
Gabor
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply