Self-Eliminated Parameters

  • Comments posted here are about the content posted at temp

  • This technique, although interesting, should be used sparingly because it affects performance: the optimizer will perform full scans on every table that has a smart filter on it. I therefore believe this technique is only acceptable when you expect your query to need a full table scan anyway. Switch on execution-plan viewing and inspect the plans for the SQL below. The SELECT with the smart filter will always read the entire Customers table, whereas the SELECT you would normally arrive at the conventional way (building up your SQL from the filter values that were not NULL or not the default value) will use a clustered index seek.

    DECLARE @CustomerID nchar(5)

    SET @CustomerID = 'AROUT'

    SELECT *

    FROM Customers

    WHERE (CustomerID = @CustomerID OR @CustomerID IS NULL)

    SELECT *

    FROM Customers

    WHERE CustomerID = @CustomerID

  • Another way I commonly use is an extension of the above:

     

    WHERE

    ((@CustomerID IS NOT NULL AND CustomerID = @CustomerID) OR @CustomerID IS NULL)

  • If NULL is the default value used, the following construct can also be used:

    WHERE (Column_Name = ISNULL(@Param_Name, Column_Name))

  • Or the use of the SQL standard COALESCE. Example:

    WHERE Column_Name=COALESCE(@Param_Name, Column_Name)

  • I experimented with this many years ago. If the tables are small the benefits are good, but on anything else the performance is appalling in terms of I/O and data-cache flushing, and the more optional parameters there are, the worse it gets. However, if the proc doesn't get called too often (e.g. it's not in an OLTP system), you need to avoid dynamic SQL because of the security issues, and views don't do much, then I agree it's a good solution.

    From my view in performance tuning, developers usually find this and use it without testing against higher numbers of rows (I won't say large tables, as it's all relative), so it works fine in test. In production, tables get into double-figure thousands of rows (or worse) and suddenly you're looking at a simple proc doing very high I/O to return (usually) one row. So basically what I'd say is: this solution is good, but it doesn't scale, so use it with care.

    I'm not having a pop here, Alex, btw... but this illustrates that you should test with execution plans when developing code solutions, and remember to scale up and out as part of testing.

    I hope we don't get into another one of these silly dynamic SQL arguments either!!

    The GrumpyOldDBA
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Alex - were you aware of another of Erland's excellent articles which discusses just this problem (in a good deal of detail)?

    http://www.sommarskog.se/dyn-search.html

     

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Thanks for the tip, I've come at this problem in the past without any good solutions. For smaller tables, this is perfect!

  • How is this technique all that clever? If anything, using

    select *

    from Customers

    where customerid=isnull(@customerid,customerId)

     

    is more efficient, easier to read and takes up less space.

  • I found the example difficult to follow. To demonstrate the technique you could use one table with one WHERE clause, without joins and without XML processing, which is not easy for entry-level developers or professional-level administrators. Also, there is too much stress on NULL as a default value. Don't forget that we are mostly in the described situation when users have to select something, and they do it through the front end. Developers often set the default value of dropdown boxes to something meaningful, like "Please select a value from the list" or, in many cases, "All".

    Regards, Yelena Varsha

  • I do not believe the author says the solution is clever; it is just offered as a solution to a common problem. Personally, I have used your technique as it works well in our environment. In some circumstances, we have used dynamic SQL (prior to learning this technique).

     

    Overall, the article's examples should be simplified for beginners to follow. The technique is valid, however.

  • Be careful with this one.

    We have tables with 6 million rows, and doing

    WHERE (CustomerName = @CustomerName OR @CustomerName IS NULL)

    killed performance: the query took more than 2 minutes.

    Use dynamic SQL with EXEC or sp_executesql.

    Watch out for single quotes with EXEC, to prevent SQL injection and to deal with certain last names:

    O'Malley in SQL is 'O''Malley'

    IF @CustomerName IS NOT NULL

    BEGIN

        SET @sql = @sql + N' AND CustomerName = ''' + REPLACE(@CustomerName,'''','''''') + ''''

    END
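
    With sp_executesql you can skip the quote-doubling entirely by passing the value as a parameter instead of splicing it into the string. A minimal sketch, assuming the same @sql and @CustomerName variables as above and an nvarchar(40) column (the type is an assumption, not from the post):

    ```sql
    DECLARE @sql NVARCHAR(4000)

    SET @sql = N'SELECT * FROM Customers WHERE 1 = 1'

    -- Append the predicate only when the parameter is supplied;
    -- the value itself never enters the string.
    IF @CustomerName IS NOT NULL
        SET @sql = @sql + N' AND CustomerName = @CustomerName'

    -- The second argument declares the parameter's type; O'Malley arrives
    -- intact as a parameter value, so no REPLACE() escaping is needed.
    EXEC sp_executesql @sql,
         N'@CustomerName nvarchar(40)',
         @CustomerName = @CustomerName
    ```

    As a bonus, parameterized plans compiled via sp_executesql can be reused across calls, which plain EXEC of a concatenated string does not get.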

  • I've found that that is NOT true... not with INT IDs, anyway. In my testing of the various solutions posted against this article, I get index scans, not table scans, using WHERE (CustID = @CustID OR @CustID IS NULL).

    In fact, the method advertised is the second fastest of those posted and some not posted (direct and dynamic SQL tied for fastest). Considering the flexibility of the code and the fact that you don't have to juggle the dynamic SQL code to make it "injection proof", it's not a bad method.

    Still, I do like the dynamic SQL because of the blazing speed, even if I do have to add a little code to keep Gumby and his brothers out...

    Quote:

    how is this technique all that clever? If anything, using
    select *
    from Customers
    where customerid=isnull(@customerid,customerId)
    is more efficient, easier to read and takes up less space.

    More efficient? Like a previous respondent stated, you may want to test the code before making those kinds of statements. The "efficient" method you spoke of forces a table scan, resulting in a 43-second time of return. The method Alex wrote about (colname = @variable OR @variable IS NULL) only took 453 milliseconds. That's about 94 times faster, give or take a clock cycle.

    Here's the code I used on the lowly 1.4-million-row Customer table on the test server at work... some of the PRINT statements also serve as documentation of the performance I found on a quiet server...

    DECLARE @StartTime DATETIME

    DECLARE @CustID INT

    SET @CustID = 1650

      PRINT 'BASELINE... simple direct lookup (0.016 seconds - Index Seek)'

       DBCC DROPCLEANBUFFERS

       DBCC FREEPROCCACHE

        SET @StartTime = GETDATE()

     SELECT *

       FROM dbo.Customer WITH (NOLOCK)

      WHERE CustID = @CustID

      PRINT DATEDIFF (ms,@StartTime,GETDATE())

      PRINT REPLICATE('-',78)

      PRINT 'Using the technique from the article (0.453 seconds - Index Scan)'

       DBCC DROPCLEANBUFFERS

       DBCC FREEPROCCACHE

        SET @StartTime = GETDATE()

     SELECT *

       FROM dbo.Customer WITH (NOLOCK)

      WHERE (CustID = @CustID OR @CustID IS NULL)

      PRINT DATEDIFF (ms,@StartTime,GETDATE())

      PRINT REPLICATE('-',78)

      PRINT 'Using dynamic sql (extendable) (0.016 seconds - Index Seek)'

       DBCC DROPCLEANBUFFERS

       DBCC FREEPROCCACHE

    DECLARE @sql VARCHAR(8000)

        SET @StartTime = GETDATE()

        SET @sql = '

                    SELECT *

                    FROM dbo.Customer WITH (NOLOCK)

                    WHERE 1=1' + ISNULL(' AND CustID = '+STR(@CUSTID),'')

       EXEC (@SQL)

      PRINT DATEDIFF (ms,@StartTime,GETDATE())

      PRINT REPLICATE('-',78)

      PRINT 'Using ISNULL (one of the alternatives offered) (43.110 seconds - Table Scan)'

       DBCC DROPCLEANBUFFERS

       DBCC FREEPROCCACHE

        SET @StartTime = GETDATE()

     SELECT *

       FROM dbo.Customer WITH (NOLOCK)

      WHERE CustID = ISNULL(@CustID,CustID) --REAL SLOW!!! TABLE SCAN!!!

      PRINT DATEDIFF (ms,@StartTime,GETDATE())

      PRINT REPLICATE('-',78)

      PRINT 'Using semi-static method (42.703 seconds - Table Scan)'

       DBCC DROPCLEANBUFFERS

       DBCC FREEPROCCACHE

    DECLARE @IntMin INT

    DECLARE @IntMax INT

        SET @IntMin = -2147483648

        SET @IntMax =  2147483647

        SET @StartTime = GETDATE()

     SELECT *

       FROM dbo.Customer WITH (NOLOCK)

      WHERE CustID BETWEEN ISNULL(@CustID,@IntMin) AND ISNULL(@CustID,@IntMax)

      PRINT DATEDIFF (ms,@StartTime,GETDATE())

      PRINT REPLICATE('-',78)

      PRINT 'Using CASE method (42.610 seconds - Table Scan)'

       DBCC DROPCLEANBUFFERS

       DBCC FREEPROCCACHE

        SET @StartTime = GETDATE()

     SELECT *

       FROM dbo.Customer WITH (NOLOCK)

      WHERE CustID = CASE WHEN @CustID IS NULL THEN CustID ELSE @CustID END

      PRINT DATEDIFF (ms,@StartTime,GETDATE())

      PRINT REPLICATE('-',78)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Alex,

    Nice article especially for the proverbial "newbie".  I do agree that the examples were a bit overdone but how are ya going to learn if you don't try?

    --Jeff Moden


  • Note that this has nothing to do with the fact that it's an integer ID, by the way. Because we're looking up CustomerID, which is the clustered index key, and every nonclustered index contains the clustered index column(s) as its last column(s), SQL Server will scan the smallest index, in this case the City index. If CustomerID were a varchar, the execution plan would be just the same.
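
    To picture why the scan lands on the City index, here is a minimal sketch of a hypothetical Northwind-style table (the column names and sizes are assumptions for illustration, not taken from the article):

    ```sql
    CREATE TABLE dbo.Customers
    (
        CustomerID  nchar(5)     NOT NULL PRIMARY KEY CLUSTERED,  -- clustering key
        CompanyName nvarchar(40) NOT NULL,
        City        nvarchar(15) NULL
    );

    -- Although declared on City alone, this index physically stores
    -- (City, CustomerID): the clustering key is appended as the row locator.
    CREATE NONCLUSTERED INDEX IX_Customers_City ON dbo.Customers (City);

    -- So a catch-all predicate on the clustering key can be answered by
    -- scanning this narrow index instead of the whole clustered index:
    -- WHERE (CustomerID = @CustomerID OR @CustomerID IS NULL)
    ```

    The optimizer simply picks the cheapest structure that contains CustomerID, and the narrowest nonclustered index wins regardless of the key's data type.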
