January 11, 2016 at 1:28 pm
Hi SQL Server community, I'm hoping you can shed some light on some unexpected behavior I'm seeing.
I have a pure OLTP system, which I'm trying to further optimize. For this I use a set workload which runs a large number of big transactions (mimicking my production system).
Inside these transactions a LOT of updates and inserts take place.
My idea was to replace all these update (and insert) statements with stored procedures to optimize plan reuse and speed things up even more. I've further optimized these stored procedures to use local variables, join with a static numbers table (for updating multiple rows at a time), etc.
However, when running the exact same load (resetting buffers, updating statistics, same conditions), the code without stored procedures performs more batches/min and thus finishes quicker.
Anything obvious I might be missing? Anyone who has similar experience and could offer some insight into where my thinking is wrong?
As a side note, even with these stored procedures I am still seeing a bloated (ad hoc) plan cache being reported.
Appreciate any feedback or ideas!
January 11, 2016 at 1:36 pm
I have a pure OLTP system, which I'm trying to further optimize. For this I use a set workload which runs a large number of big transactions (mimicking my production system).
Can you further define "big transaction"?
My idea was to replace all these update (and insert) statements with stored procedures to optimize plan reuse and speed things up even more. I've further optimized these stored procedures to use local variables, join with a static numbers table (for updating multiple rows at a time), etc.
Local variables and a join with a static numbers table may not mean better performance.
However, when running the exact same load (resetting buffers, updating statistics, same conditions), the code without stored procedures performs more batches/min and thus finishes quicker.
How exactly are you resetting buffers and updating statistics??
Can you post some of your ad-hoc code, and the corresponding stored procedure that was created? It's kind of tough to make some sort of analysis and recommendation without seeing the actual code.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
January 11, 2016 at 1:52 pm
Hi Sir, talk about quick turnaround. I was actually already editing my post, please allow me to give some more details:
Can you further define "big transaction"?
==> 500 to 1,000 updates plus tens of inserts per transaction. I understand a transaction should be as small as possible; in this case the application code has processed all changes in memory and needs to (at set times) write them all to the database.
Local variables and a join with static numbers table may not mean better performance.
==> The local variables are what I found recommended online (as a best practice) to try and avoid parameter sniffing and plan bloat. But indeed it doesn't seem to be working, as a bloated cache still gets reported.
==> The static numbers table is for cases where multiple rows need to be updated. At the moment this is done by way of:
UPDATE ... SET ... WHERE id = 1 OR id = 2 or ... id = x
How exactly are you resetting buffers and updating statistics??
==> I restart the service and run sp_updatestats.
Can you post some of your ad-hoc code, and the corresponding stored procedure that was created? It's kind of tough to make some sort of analysis and recommendation without seeing the actual code.
==> Sure, all these update SPs look like this (for a single row):
ALTER PROCEDURE [dbo].[sp_ClientNotes_update]
@id uniqueidentifier,
@ordering smallint = NULL,
@title nvarchar(20) = NULL,
@content text = NULL
AS
BEGIN
SET NOCOUNT ON;
DECLARE @local_ordering smallint = null;
DECLARE @local_title nvarchar(20);
DECLARE @local_content text;
SET @local_ordering = @ordering;
SET @local_title = @title;
SET @local_content = @content;
UPDATE tbl_ClientNotes
SET ordering=ISNULL(@local_ordering,ordering),
title=ISNULL(@local_title,title),
content=ISNULL(@local_content, content)
WHERE id=@id
END
Those for multiple rows look similar but JOIN with a split function (working with the statically generated numbers table), which is passed a string of comma-separated IDs and returns a table. I'm not at work at the moment, so I'm typing this example from memory. 🙂
I'm more than willing to change or drop any of these SP designs, I'm merely trying to test and find what works best given my environment. Much appreciate your time and feedback!
January 11, 2016 at 2:35 pm
Stijn Vervaet (1/11/2016)
Hi Sir, talk about quick turnaround. I was actually already editing my post, please allow me to give some more details:
Thank you.
The local variables are what I found recommended online (as a best practice) to try and avoid parameter sniffing and plan bloat. But indeed it doesn't seem to be working, as a bloated cache still gets reported.
In actuality, you want parameter sniffing.
Where this becomes a problem is when the parameters being used change.
As an example, take three parameters.
One time the query is executed with parameter 1 containing a value, parameters 2 and 3 are null. The next time, 1 is null, 2 has a value, and 3 is null. And so forth. The first time the query is run, the execution plan is created. Subsequent runs with different parameters may not use the most optimal plan.
You may have created a "Catch all Query", described in detail by Gail Shaw.
http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/
Take some time and read this article. See if it corresponds to what you are trying to do, and if so, try to apply some of the techniques described in this article.
From the query you provided below, this is exactly the problem.
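For reference, the classic catch-all pattern that article describes looks something like this (table and column names made up, purely to illustrate):
SELECT Column_1, Column_2, Column_3
FROM dbo.Some_Table
WHERE (@Param1 IS NULL OR Column_1 = @Param1)
  AND (@Param2 IS NULL OR Column_2 = @Param2)
  AND (@Param3 IS NULL OR Column_3 = @Param3);
One plan gets built for the first combination of parameters and then reused for every other combination, which is rarely optimal.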
How exactly are you resetting buffers and updating statistics??
==> I restart the service and run sp_updatestats.
Restarting SQL is probably not a good method. You can perform these commands:
DBCC FREEPROCCACHE
and
DBCC DROPCLEANBUFFERS
You also may not want to do this for every execution of an ad-hoc query or a procedure. You probably only want to start from scratch if the query or proc has changed.
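For example, a typical reset between test runs on a dev box (never on production) would look something like this:
CHECKPOINT;             -- flush dirty pages first; DROPCLEANBUFFERS only removes clean buffers
DBCC DROPCLEANBUFFERS;  -- empty the data cache
DBCC FREEPROCCACHE;     -- empty the plan cache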
Can you post some of your ad-hoc code, and the corresponding stored procedure that was created? It's kind of tough to make some sort of analysis and recommendation without seeing the actual code.
==> Sure, all these update SPs look like this (for a single row):
ALTER PROCEDURE [dbo].[sp_ClientNotes_update]
@id uniqueidentifier,
@ordering smallint = NULL,
@title nvarchar(20) = NULL,
@content text = NULL
AS
BEGIN
SET NOCOUNT ON;
DECLARE @local_ordering smallint = null;
DECLARE @local_title nvarchar(20);
DECLARE @local_content text;
SET @local_ordering = @ordering;
SET @local_title = @title;
SET @local_content = @content;
UPDATE tbl_ClientNotes
SET ordering=ISNULL(@local_ordering,ordering),
title=ISNULL(@local_title,title),
content=ISNULL(@local_content, content)
WHERE id=@id
END
Assume you need to update only one column in the table. This syntax will update every column in the table.
Those for multiple rows look similar but JOIN with a split function (working with the statically generated numbers table), which is passed a string of comma-separated IDs and returns a table. I'm not at work at the moment, so I'm typing this example from memory. 🙂
If the splitter function you are using contains a loop, you may be better served to find one that doesn't.
One of the better ones is here, by Jeff Moden:
http://www.sqlservercentral.com/articles/Tally+Table/72993/
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
January 11, 2016 at 3:03 pm
Thank you Michael for your thoughts. As a sole DBA it's sometimes hard to come up with a good solution without somebody to bounce ideas back and forth with.
- On the parameter sniffing 'catch-all' query.
Good point on the catch-all query being generated. I will have to study the article you mentioned and try the dynamic SQL example out tomorrow. I'm running SQL 2012 and, before trying local variables, had experimented with adding the RECOMPILE hint. The result was a huge CPU increase and overall performance degradation. This might have been expected, and maybe a sign I should get "a more powerful server", but I decided to continue searching in different directions. I want to try the Dynamic SQL generation, but something tells me that (especially since some of my tables have a LOT of columns, since this is a pure OLTP database) there will be a considerable overhead in doing so. But I will try! 🙂
- On resetting buffers and updating statistics. Agreed, the DBCC commands are more convenient, but since I am on a dev system (working with a backup from production), I figured it would not matter all that much. Sometimes I perform multiple runs of the workload without resetting the buffers, instead letting the (memory) cache build up, to see how this performs over time. What's important for me is that I can compare the processed workload one-for-one (code with/without stored procedures). But I appreciate you mentioning this.
- On the UPDATE itself:
The SP has a parameter for every column. My initial thinking was to build the update statement in such a way that if a parameter is NULL (i.e. no value is passed), the column is simply updated with its current value. It is now starting to dawn on me that this might be very inefficient, especially since some tables are so large. It seems the only (??) solution for this would be to dynamically generate the SQL. But then wouldn't I be better off sticking with plain ol' update queries (generated by the application code, which generates SQL to only update those fields that have changed)? I guess I will find out.
--> Careful conclusion: there might not be an efficient way to update tables by way of SPs, especially if the table has a lot of columns AND which fields need updating changes on a per-statement basis. Which leads me back to the original issue I saw reported of cache bloat / ad hoc plans taking up a lot of space, which might be unavoidable?
I did experiment with turning on 'optimize for ad hoc workloads' and even ALTERing the DB for forced parameterization, but this only seemed to hurt performance.
- Finally, on the split function. I don't use a loop.
I use the one described here (http://www.sommarskog.se/arrays-in-sql-2005.html#inline_split_me):
CREATE FUNCTION inline_split_me(@param nvarchar(MAX))
RETURNS TABLE AS
RETURN(SELECT ltrim(rtrim(convert(nvarchar(4000),
substring(@param, Number,
charindex(N',' COLLATE Slovenian_BIN2,
@param + convert(nvarchar(MAX), N','),
Number) -
Number)
))) AS Value
FROM Numbers
WHERE Number <= convert(int, len(@param))
AND substring(convert(nvarchar(MAX), N',') + @param, Number, 1) =
N',' COLLATE Slovenian_BIN2)
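The multi-row procs then call it roughly like this (again from memory, so the parameter and column names are approximate):
UPDATE c
SET ordering = ISNULL(@local_ordering, ordering)
FROM tbl_ClientNotes AS c
JOIN dbo.inline_split_me(@id_list) AS s
  ON c.id = CONVERT(uniqueidentifier, s.Value);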
Thank you for your feedback Michael! 🙂
January 11, 2016 at 3:13 pm
First, confirm whether the query plans used by the individual statements in fact differ in a significant way from the plans used when those statements are contained within a stored procedure. As for the root cause of why the stored procedure code is slower, it could boil down to batch operations within separate transactions spending a significant percentage of their runtime stuck in a blocked state.
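For example, you could pull both variants out of the plan cache and compare the plans and stats side by side; a rough sketch, filtering on one of the tables mentioned earlier:
SELECT qs.execution_count,
       qs.total_worker_time,
       qs.total_elapsed_time,
       st.text,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE '%tbl_ClientNotes%';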
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
January 11, 2016 at 3:24 pm
Hi Eric, thank you for weighing in. As I have multiple application servers running similar types of transactions against one database server, there is definitely a lot of concurrency.
I do witness plenty of locking and the occasional blocking, as would be expected. Though it seems that the main bottleneck (I'm using Idera's Diagnostic Manager to help me dig in) is the number of batches vs. the number of compiles (almost as many compiles as batches, few recompiles).
This led me to investigate switching to SPs. The initial thinking was that, even with some blocking, I would at least see similar performance from the SP-equipped code as compared to the code that uses plain SQL. To my surprise, the two were anything but comparable.
January 12, 2016 at 7:56 am
I second Eric's suggestion. Can you post a comparison execution plan?
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
January 12, 2016 at 9:11 am
Hello gentlemen,
I investigated the “SP with Dynamic SQL” approach for a couple of hours this morning and in the end came to the conclusion to move away from this route.
I would have to build a stored procedure generator, which would generate a stored procedure to dynamically build an update query for every possible table and field(type).
These SPs would have to have a parameter for all fields in the table. Defaulting to NULL.
There would be overhead in generating this Dynamic SQL (evaluating field types and acting accordingly, evaluating if parameters are NULL etc.).
Trying to make use of sp_executesql also proved difficult, as I can't figure out how to pass the list of parameter values (the first 2 parameters of sp_executesql are strings, so I can generate those, but the remaining values have to be the actual (comma-separated) values). Passing only the values for the parameters that are not NULL seems not to be doable.
Even if I were to figure out some magic, I doubt that after all this work – and with all this extra code in the SP – this would be faster than a simple straightforward update statement generated by the application code.
So back to square 1 for me.
I’m looking at logically & physically splitting the data and files to avoid contention and see if I can get the database to process more batches quicker.
If you have any ideas on what to do to speed things up "once all else seems OK" and CPU is not maxed out, I'm all ears!
Thanks again for your time and feedback!
January 12, 2016 at 9:56 am
If you are splitting a VARCHAR(MAX) or NVARCHAR(MAX) string, you should look at using a CLR splitter instead of a T-SQL splitter.
Please check out this article for information: http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Also, in one of your earlier posts it looked like you are using the TEXT data type. That, along with NTEXT and IMAGE, has been deprecated. Use the MAX data types instead.
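Applied to the proc you posted earlier, that is just a type change; something along these lines (keeping the logic essentially as-is):
ALTER PROCEDURE [dbo].[sp_ClientNotes_update]
    @id uniqueidentifier,
    @ordering smallint = NULL,
    @title nvarchar(20) = NULL,
    @content nvarchar(max) = NULL   -- was: text
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @local_ordering smallint = @ordering;
    DECLARE @local_title nvarchar(20) = @title;
    DECLARE @local_content nvarchar(max) = @content;   -- was: text
    UPDATE tbl_ClientNotes
    SET ordering = ISNULL(@local_ordering, ordering),
        title = ISNULL(@local_title, title),
        content = ISNULL(@local_content, content)
    WHERE id = @id;
END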
January 12, 2016 at 10:05 am
Stijn Vervaet (1/11/2016)
I want to try the Dynamic SQL generation, but something tells me that (especially since some of my tables have a LOT of columns, since this is a pure OLTP database) there will be a considerable overhead in doing so
Not sure why you would think that.
Maybe you think it will create a plan per execution?
That will happen if you do exec(@SQL) and hard-code scalar values into the query. (Are you using sp_executesql and passing parameters to this proc?)
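A rough illustration of the difference, using the table from the first proc in this thread (the values are made up):
-- Plan per execution: the literal values become part of the statement text
DECLARE @SQL nvarchar(max) =
    N'UPDATE tbl_ClientNotes SET title = N''New title'' WHERE id = ''6F9619FF-8B86-D011-B42D-00C04FC964FF''';
EXEC (@SQL);
-- One reusable parameterized plan: the values are passed separately
EXEC sp_executesql
    N'UPDATE tbl_ClientNotes SET title = @title WHERE id = @id',
    N'@title nvarchar(20), @id uniqueidentifier',
    @title = N'New title',
    @id = '6F9619FF-8B86-D011-B42D-00C04FC964FF';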
Gail's article lays this out completely. (Praise Gail, her mercies endure forever, Amen)
Do you have a test case that proves it gets worse?
Is your query non-sargable both in DSQL and originally? String concatenation is nothing compared to a table or index scan.
Perhaps you now have a sargable query with DSQL, but are now missing indexes?
I have only ever had far superior results using DSQL compared to catch-all queries.
Speed improvements were in the area of 99.9%, enough to get me a reasonable bonus as well as reputational gains in the team.
January 12, 2016 at 10:36 am
Stijn Vervaet (1/12/2016)
Hello gentlemen, I investigated the “SP with Dynamic SQL” approach for a couple of hours this morning and in the end came to the conclusion to move away from this route.
I would have to build a stored procedure generator, which would generate a stored procedure to dynamically build an update query for every possible table and field(type).
These SPs would have to have a parameter for all fields in the table. Defaulting to NULL.
There would be overhead in generating this Dynamic SQL (evaluating field types and acting accordingly, evaluating if parameters are NULL etc.).
Trying to make use of sp_executesql also proved difficult, as I can't figure out how to pass the list of parameter values (the first 2 parameters of sp_executesql are strings, so I can generate those, but the remaining values have to be the actual (comma-separated) values). Passing only the values for the parameters that are not NULL seems not to be doable.
Even if I were to figure out some magic, I doubt that after all this work – and with all this extra code in the SP – this would be faster than a simple straightforward update statement generated by the application code.
So back to square 1 for me.
I’m looking at logically & physically splitting the data and files to avoid contention and see if I can get the database to process more batches quicker.
If you have any ideas on what to do to speed things up "once all else seems OK" and CPU is not maxed out, I'm all ears!
Thanks again for your time and feedback!
Can you show us what you have attempted? What you are describing sounds very inefficient.
Here is a very simple example of dynamic SQL that does not need to be created on the fly.
CREATE TABLE dbo.Test_This
(
Test_This_ID int IDENTITY,
Column_1 varchar(50),
Column_2 int,
Column_3 char(1),
Column_4 varchar(300),
Column_5 varchar(10)
)
GO
CREATE PROCEDURE dbo.Update_Test_This
@Test_This_ID int,
@Column_1 varchar(50) = NULL,
@Column_2 int = NULL,
@Column_3 char(1) = NULL,
@Column_4 varchar(300) = NULL,
@Column_5 varchar(10) = NULL
AS
DECLARE @SQL nvarchar(max)
DECLARE @SQL_Where nvarchar(max) = 'WHERE Test_This_ID = @Test_This_ID'
SET @SQL = 'UPDATE Test_This' + CHAR(10) + 'SET '
IF @Column_1 IS NOT NULL Begin
SET @SQL += 'Column_1 = @Column_1,'
END
IF @Column_2 IS NOT NULL Begin
SET @SQL += 'Column_2 = @Column_2,'
END
IF @Column_3 IS NOT NULL Begin
SET @SQL += 'Column_3 = @Column_3,'
END
IF @Column_4 IS NOT NULL Begin
SET @SQL += 'Column_4 = @Column_4,'
END
IF @Column_5 IS NOT NULL Begin
SET @SQL += 'Column_5 = @Column_5,'
END
SET @SQL = LEFT(@SQL, LEN(@SQL) - 1)
SET @SQL += CHAR(10) + @SQL_Where
PRINT @SQL
EXEC sp_executesql @stmt = @SQL,
@params = N'@Test_This_ID int,
@Column_1 varchar(50),
@Column_2 int,
@Column_3 char(1),
@Column_4 varchar(300),
@Column_5 varchar(10)',
@Test_This_ID = @Test_This_ID,
@Column_1 = @Column_1,
@Column_2 = @Column_2,
@Column_3 = @Column_3,
@Column_4 = @Column_4,
@Column_5 = @Column_5
GO
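Calling it with only some of the parameters builds an UPDATE containing only those columns. For example:
-- Only Column_2 and Column_5 end up in the generated UPDATE statement
EXEC dbo.Update_Test_This
    @Test_This_ID = 1,
    @Column_2 = 42,
    @Column_5 = 'ten chars';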
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
January 12, 2016 at 11:43 am
Thanks to all who contributed.
I did continue altering my SP generator (I don't want to create an SP by hand for each table in the database); it now generates the code shown below, which I think is very similar to what you just posted, Michael. This is one of my smallest tables; I have tables with 50-100 columns (again, why I needed a script to generate these SPs).
I finished my generator and created the stored procedures for the tables used in my test workload.
Unfortunately I can see I am hitting at best 3,000 batches/sec, whereas without SPs I can hit > 3,500.
If you see any improvements I could make to the DSQL (I'll compare it to your example shortly, Michael), I'm all ears.
CREATE PROCEDURE gnr8_upd8_club_sequence_records
@id bigint,
@club_overall_match_stats bigint = NULL,
@type tinyint = NULL,
@amount int = NULL,
@start_date datetime2 = NULL,
@end_date datetime2 = NULL
AS
BEGIN
SET NOCOUNT ON
DECLARE @SQL_Cmd nvarchar(Max);
DECLARE @SQL_Params nvarchar(Max);
SET @SQL_Cmd = N'UPDATE dbo.fixtures WITH(REPEATABLEREAD) SET ';
IF(@club_overall_match_stats IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'club_overall_match_stats = @_club_overall_match_stats,';
IF(@type IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'type = @_type,';
IF(@amount IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'amount = @_amount,';
IF(@start_date IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'start_date = @_start_date,';
IF(@end_date IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'end_date = @_end_date,';
SET @SQL_Cmd = SUBSTRING(@SQL_Cmd,1,LEN(@SQL_Cmd)-1) + N' WHERE id = @_id';
EXEC sp_executesql @SQL_Cmd,
N'@_id bigint,@_club_overall_match_stats bigint,@_type tinyint,@_amount int,@_start_date datetime2,@_end_date datetime2',
@_id = @id,@_club_overall_match_stats = @club_overall_match_stats,@_type = @type,@_amount = @amount,@_start_date = @start_date,@_end_date = @end_date
END
GO
Edit: please see the workload comparison below:
January 12, 2016 at 11:58 am
Again, can we see a comparison of the two actual execution plans? That would help much more than this chart from Idera.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
January 12, 2016 at 12:44 pm
Apologies Michael, I was trying to get my DSQL SP generator to work so I could test it and forgot to provide you with the execution plans. I will provide these tomorrow when I'm back in the office.
Since there are so many different update statements happening in the transaction, does it matter if I pick a large one or a short one?
Appreciate everyone's input thus far and willingness to even read about my issue.