February 22, 2017 at 12:56 pm
Hi SSC,
I've got a procedure which performs maintenance on a bunch of tables. It used to be about 40 statically typed statements which did virtually the same thing (delete in batches of n until predicate = false). I've provided a pretty verbose bit of sample code, but before getting too wrapped up in that, let me state precisely my questions.
I know EXEC() is probably better in this case, but I have that pesky output variable to deal with. Yeah yeah, I can probably refactor that away, but humor me.
declare
@RID int,
@MaxRID int,
@TotalRowsDeleted int
if object_id('tempdb.dbo.#GeneralizedCleanup') is not null drop table #GeneralizedCleanup
create table #GeneralizedCleanup
(
RID int not null identity(1,1) primary key clustered,
TableName nvarchar(128) not null,
DateColumn nvarchar(128) not null,
UnitsBack int not null check (UnitsBack >= 0),
DeleteBatchSize int not null,
DateString as convert(varchar(50), dateadd(day, -UnitsBack, getdate(), 120))
)
-- Insert tables and parameters which follow a format simple enough they can be deleted with no joins by simple date offset.
insert into #GeneralizedCleanup
(
TableName,
DateColumn,
UnitsBack,
DeleteBatchSize
)
values
('TableA', 'InsertDate', 90, 5000),
('TableB', 'DateStamp', 90, 5000),
('TableC', 'DateStamp', 90, 5000)
select
@RID = 1,
@MaxRID = @@Rowcount
while @RID <= @MaxRID
begin
select
@sql = '
declare @rc int = 1
select @totalRowsDeleted = 0
while @rc > 0
begin
delete top (' + cast(DeleteBatchSize as varchar(30)) + ')
from alerts.dbo.' + quotename(TableName) + '
where ' + quotename(DateColumn) + ' < ' + DateString + '
select
@rc = @@rowcount,
@TotalRowsDeleted += @rc
end
option (recompile)
',
@params = '
@now datetime,
@TotalRowsDeleted int output'
from #GeneralizedCleanup
where RID = @RID
exec sp_executesql
@sql,
@params,
@now,
@TotalRowsDeleted output
/* Do some logging with @TotalRowsDeleted */
select @RID += 1
end
February 23, 2017 at 8:02 am
You're going to want the RECOMPILE hint to be after the WHERE clause as defined in the documentation.
sp_executesql will attempt to reuse execution plans. Most of the time, this is a very good thing because the compile process is costly, so reusing plans from cache is to your advantage.
If you are going to have lots of single use plans, make sure you enable Optimize For Ad Hoc on your database.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 23, 2017 at 8:53 am
Grant Fritchey - Thursday, February 23, 2017 8:02 AMYou're going to want the RECOMPILE hint to be after the WHERE clause as defined in the documentation.sp_executesql will attempt to reuse execution plans. Most of the time, this is a very good thing because the compile process is costly, so reusing plans from cache is to your advantage.
If you are going to have lots of single use plans, make sure you enable Optimize For Ad Hoc on your database.
So to my original questions, here's what I understand your response to be:
p.s again, I'm intentionally focusing on sp_executesql and recompilation here. I ended up using exec() in my actual code, but I'm trying to understand this specific case on a firm conceptual foundation.
February 23, 2017 at 10:00 am
1. If you put OPTION (RECOMPILE) in your sp_executesql command, the statement will be compiled but the plan won't be stored in cache.
2. You only have one statement in each sp_executesql command, don't you? I suppose you could have more, but I've never tried it.
3. No. Your statement isn't stored in cache at all.
John
February 23, 2017 at 10:05 am
John Mitchell-245523 - Thursday, February 23, 2017 10:00 AM1. If you put OPTION (RECOMPILE) in your sp_executesql command, the statement will be compiled but the plan won't be stored in cache.
2. You only have one statement in each sp_executesql command, don't you? I suppose you could have more, but I've never tried it.
3. No. Your statement isn't stored in cache at all.John
You're correct, in my sample code, I didn't supply multiple statements at once, although in my question, I alluded to the possibility of such.
Thanks for your answers, the picture is crystallizing between yours and Grant's responses.
February 23, 2017 at 11:34 am
John has it nailed. If you put in a RECOMPILE hint, nothing gets cached because there's no need since each execution gets a new compilation.
Why would Optimize for Ad Hoc not be applicable? Except for discussion around the RECOMPILE hint & caching, it sounds very applicable. You seem to be building a very ad hoc focused process here.
Also, exec does not allow for parameterized queries which introduces the possibility of SQL Injection. Be sure you're avoiding that issue.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 23, 2017 at 12:10 pm
Grant Fritchey - Thursday, February 23, 2017 11:34 AMWhy would Optimize for Ad Hoc not be applicable? Except for discussion around the RECOMPILE hint & caching, it sounds very applicable. You seem to be building a very ad hoc focused process here.Also, exec does not allow for parameterized queries which introduces the possibility of SQL Injection. Be sure you're avoiding that issue.
Ok, backstory time then. This database does all sorts of stuff. Realtime stuff, web facing stuff, ETL stuff, etc. The procedure I was handed performs cleanup on many of these tables once a day. The procedure has about 50 statements, all of the form
while @@rowcount > 0
delete top ( <rows> )
from <table>
where <date> < <comparisonDate>
(There's slightly more to it than that, but not enough to warrant including above)
Since this pattern is re-used over and over with the only difference being essentially the table name, the column to check agains and the rows to delete per batch, I opted to do this all in a configurable loop rather than have a procedure 2000 lines long.
As to why Optimize for Ad Hoc wouldn't work, admittedly I don't know much about it, but it looks like it changes how plans work at the database level. I'm going out on a limb here, but I'm betting good money our DBAs don't want to turn this trace flag on just for the sake of this cleanup job. Plus, just at an intuitive level, the vast majority of the work being done on the database is NOT ad-hoc.
As to vulnerability to injection, it's a non-issue because the configuration of the DSQL string is statically typed at the top of the procedure. The only way to inject malicious code would be to be able to alter the stored procedure. And if a black-hat can do that, we have bigger things to worry about.
With all this, EXEC() made perfect sense. Targeted, as to not affect anything else on the database, planless, so as to skip the (albeit small) cost of creating then invalidating cache plans, and injection safe because it's configured by the developer.
February 24, 2017 at 1:57 am
Xedni - Thursday, February 23, 2017 12:10 PMAs to why Optimize for Ad Hoc wouldn't work, admittedly I don't know much about it, but it looks like it changes how plans work at the database level. I'm going out on a limb here, but I'm betting good money our DBAs don't want to turn this trace flag on just for the sake of this cleanup job. Plus, just at an intuitive level, the vast majority of the work being done on the database is NOT ad-hoc.
Optimize for ad hoc just means that the plan isn't cached the first time it's executed. Only the query hash is stored, and if the query is run a second time then the plan is compiled again and stored in cache for subsequent executions. The first thing is to ask the DBAs why they don't want to turn this setting on. The second thing is to test whether your intuition about ad hoc work is correct. Run the query below, which I have shamelessly borrowed from Kirmberly Tripp's pages. What you're looking for is the number of single-use plans as a proportion of the overall activity. I'm not going to tell you what I think is a high proportion - you may need to do some testing. Incidentally, optimize for ad hoc isn't a trace flag; it's a configuration setting that you change with sp_configure.SELECT
objtype AS [CacheType]
, COUNT_BIG(*) AS [Total Plans]
, SUM(CAST(size_in_bytes AS decimal(18, 2))) / 1024 / 1024 AS [Total MBs]
, AVG(usecounts) AS [Avg Use Count]
, SUM(CAST((CASE
WHEN usecounts = 1 THEN size_in_bytes
ELSE 0
END) AS decimal(18, 2))) / 1024 / 1024 AS [Total MBs – USE Count 1]
, SUM(CASE
WHEN usecounts = 1 THEN 1
ELSE 0
END) AS [Total Plans – USE Count 1]
FROM sys.dm_exec_cached_plans
GROUP BY objtype
ORDER BY [Total MBs – USE Count 1] DESC;
With all this, EXEC() made perfect sense. Targeted, as to not affect anything else on the database, planless, so as to skip the (albeit small) cost of creating then invalidating cache plans, and injection safe because it's configured by the developer.
No, it's not planless, it's just that the plan isn't saved (and then only if you use OPTION (RECOMPILE) or it's the first execution under execute for ad hoc). All DML statements have an execution plan. You'll pay the cost of creating the plan each time your statement uses OPTION (RECOMPILE) or doesn't already have a plan in cache. If your clustered index is on the date column then the execution plan is likely to be very stable and I would strongly advise you to use sp_executesql with the date as a parameter and without OPTION (RECOMPILE).
John
February 24, 2017 at 2:23 am
Xedni - Thursday, February 23, 2017 12:10 PMWith all this, EXEC() made perfect sense. Targeted, as to not affect anything else on the database, planless, so as to skip the (albeit small) cost of creating then invalidating cache plans, and injection safe because it's configured by the developer.
The only things that are 'planless' are DML statements that can only execute one way (eg CREATE TABLE). All DML has to have a plan generated in order to execute. Since SQL 2005, ad-hoc SQL and dynamic queries are treated identically to stored procedures in terms of their plans being cached.
I'm not sure why you're looking to add recompile to that process. There's nothing in it that requires a fresh plan on every execution of the loop. Generating a plan the first time and reusing it on subsequent trips through the loop looks like the more optimal approach.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 24, 2017 at 9:00 am
GilaMonster - Friday, February 24, 2017 2:23 AMI'm not sure why you're looking to add recompile to that process. There's nothing in it that requires a fresh plan on every execution of the loop. Generating a plan the first time and reusing it on subsequent trips through the loop looks like the more optimal approach.
Thank you, I think that was the nugget of truth I was hoping to find in this thread. Looks like I didn't fully understand how sp_executesql perfomed its caching. I thought it was sort of the entire string whose contents got cached. Now that I understand that a single sp_executesql can contain a save bunch of individual plans to apply based on current statement, it's clear there's no need to force a new plan. It will simply have 50 (ish) unique plans for this sp_executesql statement, and depending on which table is being worked on, it will use the appropriate saved statement plan.
Sound like I'm understanding this right?
February 24, 2017 at 9:04 am
Xedni - Friday, February 24, 2017 9:00 AMGilaMonster - Friday, February 24, 2017 2:23 AMI'm not sure why you're looking to add recompile to that process. There's nothing in it that requires a fresh plan on every execution of the loop. Generating a plan the first time and reusing it on subsequent trips through the loop looks like the more optimal approach.Thank you, I think that was the nugget of truth I was hoping to find in this thread. Looks like I didn't fully understand how sp_executesql perfomed its caching. I thought it was sort of the entire string whose contents got cached. Now that I understand that a single sp_executesql can contain a save bunch of individual plans to apply based on current statement, it's clear there's no need to force a new plan. It will simply have 50 (ish) unique plans for this sp_executesql statement, and depending on which table is being worked on, it will use the appropriate saved statement plan.
Sound like I'm understanding this right?
Yep. Basically.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply