October 8, 2006 at 8:57 pm
This technique, although interesting, should be used sparingly because it affects performance: the optimizer will perform full table scans on every table that has a smart filter on it. I therefore believe the technique is only acceptable when you expect the query to need a full table scan anyway. Switch on execution-plan viewing and inspect the plans for the SQL below. The SELECT with the smart filter always reads the entire Customers table; the SELECT you would arrive at the conventional way (by building up your SQL from only those filter values that were not NULL or not the default) uses a clustered index seek.
DECLARE @CustomerID nchar(5)
SET @CustomerID = 'AROUT'

-- Smart filter: scans the whole table even when @CustomerID is supplied
SELECT *
FROM Customers
WHERE (CustomerID = @CustomerID OR @CustomerID IS NULL)

-- Conventional filter: clustered index seek
SELECT *
FROM Customers
WHERE CustomerID = @CustomerID
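If you'd rather see the plans as text than switch on the graphical viewer, the standard SHOWPLAN_TEXT option (not something from the article, just the usual SET option) shows the scan vs. seek difference directly:

SET SHOWPLAN_TEXT ON
GO
DECLARE @CustomerID nchar(5)
-- Smart filter: expect a Clustered Index Scan in the plan output
SELECT * FROM Customers WHERE (CustomerID = @CustomerID OR @CustomerID IS NULL)
-- Conventional filter: expect a Clustered Index Seek
SELECT * FROM Customers WHERE CustomerID = @CustomerID
GO
SET SHOWPLAN_TEXT OFF
GO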
October 8, 2006 at 11:49 pm
Another way I commonly use is an extension of the above:
WHERE
((@CustomerID IS NOT NULL AND CustomerID = @CustomerID) OR @CustomerID IS NULL)
October 9, 2006 at 12:27 am
If NULL is used as the default value, the following construct also works:
WHERE (Column_Name = ISNULL(@Param_Name, Column_Name))
October 9, 2006 at 2:10 am
Or use the SQL-standard COALESCE function. Example:
WHERE Column_Name = COALESCE(@Param_Name, Column_Name)
October 9, 2006 at 3:17 am
I experimented with this many years ago. If the tables are small the benefits are good, but on anything else the performance is appalling in terms of I/O and data-cache flushing, and the more optional parameters there are, the worse it gets. However, if the proc doesn't get called too often (e.g. it's not in an OLTP system), you need to avoid dynamic SQL because of the security issues, and views don't do much for you, then I agree it's a good solution.
From my viewpoint in performance tuning, developers usually find this technique and use it without testing against higher numbers of rows (I won't say "large tables" as it's all relative), so it works fine in test. Then production tables grow into the tens of thousands of rows (or worse), and suddenly you're looking at a simple proc doing very high I/O to return, usually, one row. So basically what I'd say is: this solution is good, but it doesn't scale, so use it with care.
I'm not having a pop at you here, Alex, btw... but this illustrates that you should test with execution plans when developing code solutions, and remember to scale up and out as part of testing.
I hope we don't get into another one of these silly dynamic SQL arguments either!!
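To put numbers on that I/O, a minimal sketch using the standard STATISTICS SET options (run it against a realistically sized copy of the table, not the dev one):

-- Report logical/physical reads and elapsed time for each statement that follows
SET STATISTICS IO ON
SET STATISTICS TIME ON

DECLARE @CustomerID nchar(5)
SET @CustomerID = 'AROUT'

SELECT * FROM Customers WHERE (CustomerID = @CustomerID OR @CustomerID IS NULL)

SET STATISTICS IO OFF
SET STATISTICS TIME OFF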
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
October 9, 2006 at 3:53 am
Alex - were you aware of another of Erland's excellent articles which discusses just this problem (in a good deal of detail)?
http://www.sommarskog.se/dyn-search.html
Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
October 9, 2006 at 7:02 am
Thanks for the tip. I've come at this problem in the past without finding any good solutions. For smaller tables, this is perfect!
October 9, 2006 at 7:18 am
How is this technique all that clever? If anything, using
SELECT *
FROM Customers
WHERE CustomerID = ISNULL(@CustomerID, CustomerID)
is more efficient, easier to read, and takes up less space.
October 9, 2006 at 9:03 am
I found the example difficult to follow. To demonstrate the technique, you could use one table with one WHERE clause, without the XML processing (which is not easy for entry-level developers or even professional-level administrators) and without joins. Also, there is too much stress on NULL as a default value. Do not forget that we are mostly in the described situation when users have to select something, and they do it through the front end. Developers often set the default value for dropdown boxes to something meaningful like "please, select the value from the list" or, in many cases, "All".
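For instance, the same pattern handles an "All" default just as well as NULL; a sketch using Northwind's Country column:

DECLARE @Country nvarchar(15)
SET @Country = 'All' -- value passed straight through from the dropdown

SELECT *
FROM Customers
-- 'All' plays the role that NULL plays in the article's examples
WHERE (Country = @Country OR @Country = 'All')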
Regards, Yelena Varsha
October 9, 2006 at 10:32 am
I do not believe the author says the solution is clever; it is just offered as a solution to a common problem. Personally, I have used your technique as it works well for our environment. In some circumstances we have used dynamic SQL (prior to learning this technique).
Overall, the article's examples should be simplified for beginners to follow. The technique is valid, however.
October 9, 2006 at 6:14 pm
Be careful with this one.
We have tables with 6 million rows, and doing
WHERE (CustomerName = @CustomerName OR @CustomerName IS NULL)
killed performance; the query took more than 2 minutes.
Use dynamic SQL with EXEC or sp_executesql.
Watch out for single quotes with EXEC, both to prevent SQL injection and to deal with certain last names:
O'Malley in SQL is 'O''Malley'
IF @CustomerName IS NOT NULL
BEGIN
    -- Double up embedded single quotes so the string stays valid SQL and can't be broken out of
    SET @sql = @sql + N' AND CustomerName = ''' + REPLACE(@CustomerName,'''','''''') + ''''
END
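Better yet, with sp_executesql the name can stay a genuine parameter, which skips the quote-doubling altogether. A sketch of the same filter (the nvarchar(40) column size is an assumption, not from the original post):

DECLARE @sql nvarchar(4000)
DECLARE @CustomerName nvarchar(40)
SET @CustomerName = N'O''Malley'
SET @sql = N'SELECT * FROM Customers WHERE 1 = 1'

IF @CustomerName IS NOT NULL
BEGIN
    -- @name is a real parameter inside the string, so no quoting games are needed
    SET @sql = @sql + N' AND CustomerName = @name'
END

EXEC sp_executesql @sql, N'@name nvarchar(40)', @name = @CustomerName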
October 9, 2006 at 7:39 pm
I've found that that is NOT true... not with INT IDs, anyway. In my testing of the various solutions posted against this article, I get index scans, not table scans, using WHERE (CustID = @CustID OR @CustID IS NULL).
In fact, the method advertised is the second-fastest of those posted and some not posted (direct and dynamic SQL tied for fastest). Considering the flexibility of the code, and the fact that you don't have to juggle the dynamic SQL code to make it "injection proof", it's not a bad method.
Still, I do like the dynamic SQL because of the blazing speed, even if I do have to add a little code to keep Gumby and his brothers out...
More efficient? Like a previous respondent stated, you may want to test the code before making that type of statement. The "efficient" method you speak of forces a table scan, resulting in a 43-second return time. The method Alex wrote about (colname = @variable OR @variable IS NULL) took only 453 MILLI-seconds. That's about 94 times faster, give or take a clock cycle.
Here's the code I used on the lowly 1.4-million-row Customer table on the test server at work... some of the PRINT statements also serve as documentation of the performance I found on a quiet server...
DECLARE @StartTime DATETIME
DECLARE @CustID INT
SET @CustID = 1650
PRINT 'BASELINE... simple direct lookup (0.016 seconds - Index Seek)'
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET @StartTime = GETDATE()
SELECT *
FROM dbo.Customer WITH (NOLOCK)
WHERE CustID = @CustID
PRINT DATEDIFF (ms,@StartTime,GETDATE())
PRINT REPLICATE('-',78)
PRINT 'Using the technique from the article (0.453 seconds - Index Scan)'
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET @StartTime = GETDATE()
SELECT *
FROM dbo.Customer WITH (NOLOCK)
WHERE (CustID = @CustID OR @CustID IS NULL)
PRINT DATEDIFF (ms,@StartTime,GETDATE())
PRINT REPLICATE('-',78)
PRINT 'Using dynamic sql (extendable) (0.016 seconds - Index Seek)'
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
DECLARE @sql VARCHAR(8000)
SET @StartTime = GETDATE()
SET @sql = '
SELECT *
FROM dbo.Customer WITH (NOLOCK)
WHERE 1=1' + ISNULL(' AND CustID = '+STR(@CUSTID),'')
EXEC (@SQL)
PRINT DATEDIFF (ms,@StartTime,GETDATE())
PRINT REPLICATE('-',78)
PRINT 'Using ISNULL (one of the alternatives offered) (43.110 seconds - Table Scan)'
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET @StartTime = GETDATE()
SELECT *
FROM dbo.Customer WITH (NOLOCK)
WHERE CustID = ISNULL(@CustID,CustID) --REAL SLOW!!! TABLE SCAN!!!
PRINT DATEDIFF (ms,@StartTime,GETDATE())
PRINT REPLICATE('-',78)
PRINT 'Using semi-static method (42.703 seconds - Table Scan)'
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
DECLARE @IntMin INT
DECLARE @IntMax INT
SET @IntMin = -2147483648
SET @IntMax = 2147483647
SET @StartTime = GETDATE()
SELECT *
FROM dbo.Customer WITH (NOLOCK)
WHERE CustID BETWEEN ISNULL(@CustID,@IntMin) AND ISNULL(@CustID,@IntMax)
PRINT DATEDIFF (ms,@StartTime,GETDATE())
PRINT REPLICATE('-',78)
PRINT 'Using CASE method (42.610 seconds - Table Scan)'
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET @StartTime = GETDATE()
SELECT *
FROM dbo.Customer WITH (NOLOCK)
WHERE CustID = CASE WHEN @CustID IS NULL THEN CustID ELSE @CustID END
PRINT DATEDIFF (ms,@StartTime,GETDATE())
PRINT REPLICATE('-',78)
--Jeff Moden
Change is inevitable... Change for the better is not.
October 9, 2006 at 7:49 pm
Alex,
Nice article, especially for the proverbial "newbie". I do agree that the examples were a bit overdone, but how are ya going to learn if you don't try?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 9, 2006 at 8:33 pm
Note that it has nothing to do with the fact that it's an integer ID, btw. Because we're looking for CustomerID, which is the clustered index key, and every non-clustered index contains the clustered index column(s) as its last column(s), SQL Server will scan the smallest index, in this case the City index. If CustomerID were a varchar, the execution plan would be just the same.
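To make that concrete, a hypothetical pair of index definitions of the Northwind kind (Northwind actually builds the clustered index through its primary key constraint and names the city index "City", but the effect is the same):

-- The clustered key (CustomerID) rides along in every non-clustered index,
-- so scanning the narrow City index is enough to evaluate a CustomerID filter
CREATE CLUSTERED INDEX PK_Customers ON Customers (CustomerID)
CREATE NONCLUSTERED INDEX City ON Customers (City)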