To OR or not to OR

Question

To OR or not to OR

georgedo

Right there with Babe

Points: 751
More actions
March 18, 2004 at 6:33 am

#110661
Hi forum,
quite often we use stored procedures to select either subset of all rows based on some value or all values in the table. For example, lets say, we have table address with column state (indexed, of course, often clustered). We want to be able to select addresses either for a particular state or all addresses. So we want stored procedure not unlike this one:
create procedure getAddressByState
@state char(3) = null
as
select * from address where
state = @state or @state is null
What I'm wondering about is how to optimise WHERE clause. For example, I can write:
create procedure getAddressByState
@state char(3) = null
as
select * from address where
isnull(@state, state) = state
or even I can write:
create procedure getAddressByState
@state char(3) = null
as
if @state is null
   select * from address
else
   select * from address
   where state = @state
I wonder which one will perform better.
Any thoughts are appreciated
George
PS. For pedants:
1. * is selected only for brevity.
2. We have 3-character state abbreviations in Australia
--
George

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Frank The Tank SSC Eights! Points: 870 More actions · Answer 1

I would go with your third option above. The first option will needlessly scan through your address table looking for rows where STATE = null when all you really want is all the rows in the table.

The second option will be inefficient as well, because even if @state is not null, by wrapping your seach column (state) in a function (in this case the isnull() function), you disallow the optimizer from using any index that might exist on the state column, so you get a table scan. (You should always avoid using a function on the left side of your WHERE clauses if you want to make use of any available index on a column. Also avoid doing mathematical operations on the left side for the same reason.)

Option 3 looks like the most efficient way to go, IMHO.

georgedo Right there with Babe Points: 751 More actions · Answer 2

Frank,

I think SQL Server optimiser deserves a little more credit. In the sample below I use real table from my database. Table contains about 80k rows and has a clustered index on renled_parent_id. The output with SET SHOWPLAN_ALL ON

declare @tenant int
set @tenant = null
select * from rent_ledger
where renled_parent_id = @tenant or @tenant is null
|--Bookmark Lookup(BOOKMARK: ([Bmk1000]), OBJECT: ([primelink].[dbo].[rent_ledger]))
       |--Index Scan(OBJECT: ([primelink].[dbo].[rent_ledger].[constr_pk_318]), WHERE: ([rent_ledger].[renled_parent_id]=[@tenant] OR [@tenant]=NULL))

select * from rent_ledger
where isnull(@tenant,renled_parent_id) = renled_parent_id
|--Clustered Index Scan(OBJECT: ([primelink].[dbo].[rent_ledger].[index_319]), WHERE: (isnull([@tenant], [rent_ledger].[renled_parent_id])=[rent_ledger].[renled_parent_id]))

if @tenant is null
   select * from rent_ledger
       |--Clustered Index Scan(OBJECT: ([primelink].[dbo].[rent_ledger].[index_319]))
else
   select * from rent_ledger where    renled_parent_id = @tenant
       |--Clustered Index Seek(OBJECT: ([primelink].[dbo].[rent_ledger].[index_319]), SEEK: ([rent_ledger].[renled_parent_id]=[@tenant]) ORDERED FORWARD)

As you can see, there is no table scan in sight and optimiser can still use index because isnull is sargable predicate, AFAIK. Cost-wise the "else" part of the third query is the fastest with clustered index seek but I have to weight it against frequency of requests with parameter set to null.

Cheers

--
George

Frank The Tank SSC Eights! Points: 870 More actions · Answer 3

^^^ Interesting. I've read several articles that stated that using functions around search arguments prevents the optimizer from using indexes. (Including an article or two referenced on this site!) Apparently that is not such a clear cut rule as some would make it out to be. Thanks for clearing up the misconception.

Gabor Nyul SSCrazy Eights Points: 9459 More actions · Answer 4

georged,

Be careful! Clustered Index Scan meens in fact table scan!

Try it out by removing the clustered index and creating a nonclustered on instead. You can see the difference.

Also check why with your first query the optimiser is using the constr_pk_318 index while with the second one the index named index_319.

An another proof is your third query where you have a select without where clause. Obviously there will be no index usage. Never the less you can see a clustered index scan, which in fact is a table scan. This is because with a clustered index the data itself is the index.

Bye
Gabor

Antares686 SSC Guru Points: 125444 More actions · Answer 5

To ensure you get all data in the case of a null value for the variable you need to use either the 2nd or 3rd. Now I have seen situations where ISNULL(@var,value) = value does and doesn't impact speed it can make a difference as to if the value column is the clustered inex or not. Personally I always opt for the following.

create procedure getAddressByState;1
@state char(3) = null
as

set nocount on

if @state is null
Exec getAddressByState;2
else
Exec getAddressByState;3 @state

go

create procedure getAddressByState;2
as

set nocount on

select (columnlist) from address

go

create procedure getAddressByState;3
@state char(3)
as

set nocount on

select (columnlist) from address

where state = @state

go

This way I get a seperate stored execution plan for each condition on first run that is optimized for it.

georgedo Right there with Babe Points: 751 More actions · Answer 6

Gabor, good point. I have no clue as to why first query uses constr_pk_318. Just to clarify, index_319 is a clustered index on renled_parent_id column (int) and constr_pk_318 is a primary key on renlen_id column which is also an int. Wonder if substituting real column list instead of * will make any difference to the plan (though most of the table columns are selected)

Antares686, thanks, really like your approach - I'll give it some thoughts. Forgot to mention that column is not null so first query will still work .

--
George

Gabor Nyul SSCrazy Eights Points: 9459 More actions · Answer 7

In this case get it pragmatically.

Check if your statistics are up to date (update statistics)
check the fragmentation status (dbcc showcontig, then dbcc dbreindex if neccessairy)
issue a single query like
select * from rent_ledger
where renled_parent_id = 1111
Check the plan
issue a second single query like this.
declare @tenant int
set @tenant = 1111
select * from rent_ledger
where renled_parent_id = @tenant
Check the plan

issue a third single query like this.

declare @tenant int

set @tenant = 1111

select * from rent_ledger

where renled_parent_id = @tenant or @tenant is null

Check the plan
All three queries should produce the same plan

As an info the following query should not use any index because @tenant is null:

declare @tenant int

set @tenant = null

select * from rent_ledger

where renled_parent_id = @tenant or @tenant is null

Bye
Gabor