August 16, 2007 at 6:40 am
Hi I have a web based reporting page that lets admin select from a multitude of different options to search against web traffic data. They can select different combinations of options and date ranges and partial/full textual searches to look for etc.
I have two tables with the same structure that hold the data TRAFFIC_LOG holds just the current days and TRAFFIC_LOG_HISTORY that holds all other days.
At the moment I have a stored proc that is trying to handle all these different combinations by using IF statements and case statements in the WHERE clause eg
column = CASE @val WHEN 0 THEN column ELSE @val END
Now someone told me that this is a bad way of doing it as it means the correct indexes won't be used and I did recently improve a report query the other day by rewriting it to not use that type of branching. They also said having IF statements within a procedure will negate the benefits of cached query plans.
I currently have my cluster on the Stamp and various non-clustered indexes on SitePk and adminPK,candPk,clientPk.
Currently when I look at the execution plans when running the proc it is using clustered index seeks
and index seeks on a non clustered (Session)
So here is the current proc
SET NOCOUNT ON;
SET DATEFORMAT YMD
DECLARE @IsToday bit,
@StartDate datetime,
@EndDate datetime
IF Datediff(day,@SearchStamp,getdate())=0
SET @IsToday = 1
ELSE
SET @IsToday = 0
--set up date params so that we use the clustered index on predicates
SET @StartDate = cast(datepart(year,@SearchStamp)as varchar(4)) + '-' + cast(datepart(month,@SearchStamp)as varchar(2)) + '-' + cast(datepart(day,@SearchStamp)as varchar(2)) + ' 00:00:00.000'
SET @EndDate = cast(datepart(year,@SearchStamp)as varchar(4)) + '-' + cast(datepart(month,@SearchStamp)as varchar(2)) + '-' + cast(datepart(day,@SearchStamp)as varchar(2)) + ' 23:59:59.997'
IF @IsToday=1
BEGIN
SELECT TOP (@MaxRecs) SessionID, Stamp,URL,Referer, SearchRequest,NoResults,Action,ActionIDs,AppMessages, IsHackAttempt, Checked, IsError, InfoCode,Querystring
FROM TRAFFIC_LOG
WHERE SitePk = @SitePK AND
Stamp >= @StartDate AND
Stamp <= @EndDate AND
CandPK = CASE @CandPK WHEN 0 THEN CandPK ELSE @CandPK END AND
ClientPK = CASE @ClientPK WHEN 0 THEN ClientPK ELSE @ClientPK END AND
AdminPK = CASE @AdminPK WHEN 0 THEN AdminPK ELSE @AdminPK END AND
IsError = CASE @IsError WHEN 0 THEN IsError ELSE 1 END AND
IsHackAttempt = CASE @IsHack WHEN 0 THEN IsHackAttempt ELSE 1 END AND
InAdminMode = CASE @InAdminMode WHEN 0 THEN InAdminMode ELSE 1 END AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'SessionID' THEN 1 WHEN @SearchIN = 'SessionID' AND SessionID LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'Browser' THEN 1 WHEN @SearchIN = 'Browser' AND Browser LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'OS' THEN 1 WHEN @SearchIN = 'OS' AND OS LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'URL' THEN 1 WHEN @SearchIN = 'URL' AND URL LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'Querystring' THEN 1 WHEN @SearchIN = 'Querystring' AND Querystring LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'AppMessages' THEN 1 WHEN @SearchIN = 'AppMessages' AND AppMessages LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'InfoCode' THEN 1 WHEN @SearchIN = 'InfoCode' AND InfoCode LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'ClientIP' THEN 1 WHEN @SearchIN = 'ClientIP' AND ClientIP LIKE @SearchFor THEN 1 ELSE 0 END = 1
ORDER BY SessionID, Stamp
END
ELSE
BEGIN
SELECT TOP (@MaxRecs) SessionID, Stamp,URL,Referer, SearchRequest,NoResults,Action,ActionIDs,AppMessages, IsHackAttempt, Checked, IsError, InfoCode,Querystring
FROM TRAFFIC_LOG_HISTORY
WHERE SitePk = @SitePK AND
Stamp >= @StartDate AND
Stamp <= @EndDate AND
IsError = CASE @IsError WHEN 0 THEN IsError ELSE 1 END AND
IsHackAttempt = CASE @IsHack WHEN 0 THEN IsHackAttempt ELSE 1 END AND
InAdminMode = CASE @InAdminMode WHEN 0 THEN InAdminMode ELSE 1 END AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'SessionID' THEN 1 WHEN @SearchIN = 'SessionID' AND SessionID LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'Browser' THEN 1 WHEN @SearchIN = 'Browser' AND Browser LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'OS' THEN 1 WHEN @SearchIN = 'OS' AND OS LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'URL' THEN 1 WHEN @SearchIN = 'URL' AND URL LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'Querystring' THEN 1 WHEN @SearchIN = 'Querystring' AND Querystring LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'AppMessages' THEN 1 WHEN @SearchIN = 'AppMessages' AND AppMessages LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'InfoCode' THEN 1 WHEN @SearchIN = 'InfoCode' AND InfoCode LIKE @SearchFor THEN 1 ELSE 0 END = 1 AND
CASE WHEN @SearchFor = '' OR @SearchIn <> 'ClientIP' THEN 1 WHEN @SearchIN = 'ClientIP' AND ClientIP LIKE @SearchFor THEN 1 ELSE 0 END = 1
ORDER BY SessionID, Stamp
END
Now before I decide whether to and if so how to rewrite it, can someone tell me what is currently happening regarding cached query plans for this procedure. Do query plans get generated for each distinct select within the procedure or for the proc as a whole?
As each branch of the IF is looking at a different table does the optimiser have 2 query plans (1 for each select) or does it only have one (for the whole proc) and just recompile when it find the plan is irrelevant?
If the answer is that the plan is for a proc as a whole then re-writing the report with multiple IF branches instead of CASEs may help with using the correct indexes on the predicates but is not going to help with re-use of cached plans as it will always get recomplied when run won't it?
Now if I can't use IF branches if I want to re-use query plans that means I will have to use dynamic SQL to build up the SELECTs to use the correct predicates either on the application or on the server using sp_executesql.
Now I don't want a debate about the pros/cons of dynamic sql versus procs and sql injection and paramterised queries etc. I just want to make this report query perform in the best way possible which means using the correct indexes when possible and cacheing the plan.
If a multi branched proc with IF statements and mutliple selects won't accomplish it then will sp_executesql do the job?
Thanks for any advice in advance.
August 18, 2007 at 4:16 am
nested IF statements in a proc where the query is different can cause all sorts of problems. The usual way around this is to write the sql as being dynamic that way it recompiles for each execution of the proc - I personally don't like this approach and would generally use a called proc which itself calls each if in a different proc.
It depends upon you situation, I've broken large IF clause procs into small procs for each IF with stunning performance results, in my case these were oltp databases with high levels of concurrency so performance was critical, the dynamic sql would not have been the way to go due to the high levels of recompiles that would have been generated.
It's difficult to generalise but I always break big procs into small procs for performance. sp_executesql may work, maybe, it's a case of which do you condider to be the easisest to maintain.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
August 20, 2007 at 7:05 am
Thanks for the advice.
If using sp_executesql recomplies on each call would it not be best to just keep the existing mutli if statement proc but add WITH RECOMPILE at the top. Then you have the easy maintenance of it being in one place and no string building.
Also can you explain what happens with cached execution plans and multi branched select statements (using IF)
Does a plan get cached for each select statment within the proc or for the proc as a whole?
August 21, 2007 at 12:23 am
sp_executesql does not recompile on every execution. The execution plans for adhoc SQL are also cached along with procs. The difference is that the seperate pieces of sql executed by sp_executesql get different plans and hence are more likely to have accurate plans than for a multi-branch stored proc. Also, iirc, adhoc SQL is more likely to be removed from the proc cache than stored procedures.
It's a quick-and-dirty way of doing pseudo sub-procedures.
When a multi-branch proc compiles for the first time, all the branched are compiled and optimised based on the parameters passed on that execute. This can lead to plans that are optimal for one branch, but terrible for the other. Take this simple example.
CREATE PROCEDURE ShowMeData
@KeyValue bigint
AS
If @KeyValue is null
SELECT * FROM BigTable
ELSE
SELECT * FROM BigTable WHERE Key between @KeyValue and @keyvalue + 5000
GO
Let's assume that the first execute of that is with a KeyValue of NULL, and that a NC index exists on the column Key.
The first branch will be optimised to do a straght table scan. simple enough, and in this case, it's the branch that will be executed so no worries.
The second branch, the optimiser will 'sniff' the parameter value and come to the conclusion that for a value of null, 0 rows will be returned. Hence it will very likely optimise that branch to do a NC index seek and a lookup to the cluster. That looks fine, for a parameter value of null, but runs terribly when a real parameter value is passed and 5000 rows are returned.
Very contrived example, but you get this point.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 22, 2007 at 6:40 am
I'm not sure how this will affect overall performance, but I believe it will use the same execution plan regardless. You could union the 2 tables and since the clustered indexes on are the stamp column it should be quick. Or, you could have 3 sp's a master usp_traffic_log_master that holds the if and then calls the appropriate sp based on @IsToday, usp_traffic_log or usp_traffic_log_history. This would cache the execution plan for each sp and always use the "right" plan.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
August 22, 2007 at 6:54 am
Hi thanks for the responses.
The problem with having multiple procs per SELECT is that if you also notice from my first post I want to replace all those CASE statements that are basically carrying out futher conditionals in each WHERE clause and as I understand it also prevent the correct index being used.
If I re-wrote the SELECT to only have clauses that are needed eg instead of
AdminPk = CASE @AdminPk WHEN 0 THEN AdminPk ELSE @AdminPK END
AdminPk = @AdminPk (only when adminPk is <> 0)
Then I am going to have lots and lots of these procedures each one that is specifically tailored to the parameters that have values. The top proc is going to take all the various 12 or so params and then look at those and work out which sub-proc to call depending on which params have values and which do not. Therefore to handle every single combination of parameters it means a lot of procedures which I really do not want.
Therefore the choice it seems is either between sp_executesql or building the SQL up in the application as a parameterised string by just appending the appropriate clause if it has a value and then executing it. I understand that doing it this way will also mean a query plan is cached for each statement.
August 23, 2007 at 8:19 am
If you take the SQL within the conditional statements and put them into sub-stored procedures being called by the main procedure, they will get individual cached execution plans.
August 24, 2007 at 4:32 am
I would but I don't really want hundreds of stored procs that basically all do the same thing but with different combinations of the same where clause parameters. Thats why im debating the use of dynamic sql either on the server with sp_executesql or on the application.
August 24, 2007 at 4:41 am
given a choice between those two, go for properly parameterised dynamic SQL using sp_executesql
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply