January 13, 2016 at 5:33 pm
Hello Michael
Quick questions. Have you ever partitioned a table? If so, can you provide some actual facts as to what performance gains you realized?
Yes, I did this on one of our biggest tables (millions of rows). By creating a table partition along with filegroups distributed across different disks, the system performed well.
Do you have your servers set to optimize for ad-hoc workloads? If so, why? Do you have any facts to back up that it will actually help the performance of the OP's system?
Do you have MAXDOP set to 1? If so, why? What made you make this change to your system?
Most of my servers are set to optimize for ad-hoc workloads. Regarding MAXDOP, yes, I have changed this value based on the best practice described here:
https://support.microsoft.com/en-us/kb/2806535
and I read the post before I commented on it.
Regards,
Abdullah. Zarour
January 13, 2016 at 6:19 pm
abdullah.zarour (1/13/2016)
Hello Michael. Quick questions. Have you ever partitioned a table? If so, can you provide some actual facts as to what performance gains you realized?
Yes, I did this on one of our biggest tables (millions of rows). By creating a table partition along with filegroups distributed across different disks, the system performed well.
Do you have your servers set to optimize for ad-hoc workloads? If so, why? Do you have any facts to back up that it will actually help the performance of the OP's system?
Do you have MAXDOP set to 1? If so, why? What made you make this change to your system?
Most of my servers are set to optimize for ad-hoc workloads. Regarding MAXDOP, yes, I have changed this value based on the best practice described here:
https://support.microsoft.com/en-us/kb/2806535
and I read the post before I commented on it.
Regards,
Abdullah. Zarour
When you say your system performs well, did you look at statistics when you ran a query that reaches back into the (let's say) archive partition, and then compare this to the same query on the unpartitioned version of the table? Here I assume that indexes and statistics were equivalent and up to date on both sides. I am curious about this performance gain.
I think before we get into hardware configurations, it is advisable to first try to tune the query. As I and others have mentioned, we suspect we can do this through set-based logic. We just need a little more information to go on. Please see this link on asking for help in a way that helps us to help you.
http://www.sqlservercentral.com/articles/Best+Practices/61537/
----------------------------------------------------
January 13, 2016 at 6:29 pm
abdullah.zarour (1/13/2016)
Hello Michael. Quick questions. Have you ever partitioned a table? If so, can you provide some actual facts as to what performance gains you realized?
Yes, I did this on one of our biggest tables (millions of rows). By creating a table partition along with filegroups distributed across different disks, the system performed well.
Do you have your servers set to optimize for ad-hoc workloads? If so, why? Do you have any facts to back up that it will actually help the performance of the OP's system?
Do you have MAXDOP set to 1? If so, why? What made you make this change to your system?
Most of my servers are set to optimize for ad-hoc workloads. Regarding MAXDOP, yes, I have changed this value based on the best practice described here:
https://support.microsoft.com/en-us/kb/2806535
and I read the post before I commented on it.
Regards,
Abdullah. Zarour
Well, I stand corrected. If partitioning did provide a performance boost for you, then great. But partitioning is NOT designed to boost performance. There are so many threads on this site alone, with proofs and actual numbers, that have shown that partitioning can actually inhibit performance.
As for ad-hoc workloads, this indeed can provide a benefit if the metrics and workload merit it.
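For reference, the setting in question is turned on with something like the following (it's an advanced option, so 'show advanced options' has to be enabled first); whether it actually helps depends entirely on your plan cache metrics:
EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;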
Where in this article does it say to set MAXDOP to 1?
In case you haven't figured it out, one of my rants is when recommendations are blindly thrown out as "the way to do things" without any facts or explanations.
Please don't tell me you recommend nolock!!!:-D:-D:-D
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
January 13, 2016 at 10:47 pm
Thanks for the clarification and for sharing the best practice link. About MAXDOP set to 1: I didn't recommend this value to anyone at all; the only exception where I use MAXDOP set to 1 is with a SharePoint database server. Again, thank you.
Regards,
Mr. Zarour
January 14, 2016 at 3:11 am
Hi Sergiy, appreciate you joining the discussion.
Unfortunately your idea doesn't fit my case because:
- if the update runs, it updates all fields of the table (in the case of NULL, it updates the field with the value already there), which causes considerable overhead that I've measured
- as MadAdmin mentioned, in some special cases NULL is an acceptable value, and this SP would not allow me to update a value to NULL
thank you though!
January 14, 2016 at 4:18 am
@ Michael,
A very good observation that I "might not have a problem". This is fortunately true at the moment. I am trying to anticipate our application scaling up in users, by taking what we know is some of our "heaviest workload" at this point and testing it with different scenarios/configurations to see if performance improves. So yes, trial and error would describe it well 🙂
I can't post the code, for which I apologise, but I can make an effort to explain what I am doing.
All code talking to the DB is C++; most of the data manipulation happens in memory on several application servers, and only at set times does data get pushed to the database (which ideally would happen as fast and with as low an impact as possible).
So at set times the DB gets "bombarded" mainly by update queries (as the field values that need to be set were already tracked and calculated in memory on the application server(s)). All UPDATEs are straightforward "UPDATE TABLE X SET FIELD(S)=value(s) WHERE id=Y" (or id=YY OR id=YYY ... in cases where multiple records need updating with the same values).
Since these UPDATE statements are so simple in nature, there's very little to do to optimize them. So I was at least pleasantly surprised that I saw a 10-15% increase by using SPs to perform the updates.
A lot of these updates are very rapid-fire against a couple of identical tables. Having split up the database into a number of files, I can observe from the latency stats, before and after running the workload, which files are (physically) under a lot of pressure. So I thought maybe investigating logical partitioning of the tables in those affected files could speed things up. Another open question would be locking. As locking is normal within a DB, I would be trying to minimize it, and spreading out the data might help with this as well?
As for set-based updates: I might need to research this, but I'm guessing you mean updating the records in a way like:
UPDATE ... SET ... FROM a SELECT or CTE, etc. (a rough guess at what I have in mind is sketched below). At the moment, by way of the design of the code, I don't think this is a valid solution, since all data is first updated in memory (it's already there and doesn't need to be SELECTed from the DB first) and, once done being processed, it gets written to the DB.
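Just to illustrate what I mean (table and column names are invented here, and the values would really come from wherever the new data lives):
UPDATE t
SET    t.field1 = v.new_field1
FROM   dbo.X AS t
JOIN  (VALUES (1001, 1000), (1002, 1002)) AS v(id, new_field1)
       ON v.id = t.id;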
Again, thanks to all who keep this topic alive and have provided valuable input! 🙂
January 14, 2016 at 6:04 am
Stijn977 (1/14/2016)
All code talking to the DB is C++; most of the data manipulation happens in memory on several application servers, and only at set times does data get pushed to the database (which ideally would happen as fast and with as low an impact as possible). So at set times the DB gets "bombarded" mainly by update queries (as the field values that need to be set were already tracked and calculated in memory on the application server(s)). All UPDATEs are straightforward "UPDATE TABLE X SET FIELD(S)=value(s) WHERE id=Y" (or id=YY OR id=YYY ... in cases where multiple records need updating with the same values).
Since these UPDATE statements are so simple in nature, there's very little to do to optimize them. So I was at least pleasantly surprised that I saw a 10-15% increase by using SPs to perform the updates.
Are those updates all separate transactions from the app's point of view, or do they comprise a single business transaction?
Currently it sounds like the DB is just a backup of sorts, rarely read by the application servers. If that's the case, consider having the app do inserts (which should be faster) into some buffer table, and have that buffer table then processed by a SQL procedure that does the real updates, so the application servers aren't blocked. A rough sketch of the idea follows.
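Something like this, purely as a sketch (all object names are invented, and the clean-up of the buffer is simplified):
-- buffer table the application servers insert into
CREATE TABLE dbo.X_buffer
(
    id     INT NOT NULL PRIMARY KEY,
    field1 INT NULL
);
GO
-- procedure that applies the buffered changes in one set-based pass
CREATE PROCEDURE dbo.Apply_X_Buffer
AS
BEGIN
    UPDATE x
    SET    x.field1 = b.field1
    FROM   dbo.X AS x
    JOIN   dbo.X_buffer AS b ON b.id = x.id;

    -- simplified: assumes no new rows arrive between the UPDATE and the TRUNCATE
    TRUNCATE TABLE dbo.X_buffer;
END
GO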
January 14, 2016 at 6:21 am
Hi Serg!
Those groups of INSERTS & UPDATES are one business transaction.
I wouldn't say the DB is a backup facility. When I say stuff gets updated/inserted every now and then, this is from the application's perspective, and it can still be every minute or multiple times a minute (which is slow compared to the many in-memory data changes that happen in between database writes/updates).
But thank you for your idea of a buffer table. It is definitely something I will consider testing with once I get some coder support.
January 14, 2016 at 7:36 am
Actually, it does sound as if this database is only a repository for data.
If I understand the architecture correctly:
1. The application reads data from the database, or someplace.
2. The application manipulates this data in code.
3. The application writes this modified data to the database.
Step 2 is where my brain starts hurting. I'm making a guess (and it certainly is a guess because of the lack of information) that the majority of this manipulation can be replaced with T-SQL code.
If that's true, that is likely the place where the most gains can be made in performance.
Your approach seems to be haphazard at best.
The things that I am most curious about are the structure of the data files (you said you had multiple files), the level of normalization of the database, and the specifics about the data in the tables, what is pulled into the application, and the exact nature of these manipulations.
As serg-52 suggested, assuming that my guesses at the architecture are close, save the manipulated data into a table. Then, a set based update procedure is called to "finalize" the data into the tables that need to be updated.
Does your company have room in the budget for a consultant? Because that's my first recommendation to you. Get someone who can do a complete analysis, of both the architecture and the setup.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
January 14, 2016 at 10:48 am
Hi Michael, appreciate the points you raise! I haven't had time yet to look into the practicality of 'temp table' based updates in my case. Do you mind if I ask you a couple of questions that pop into my head, which you might have a clear idea about?
- If you have multiple application servers (which scale more easily than the database server - just add more), these would all be leveraging tempdb a lot more. Isn't there the potential of creating a bottleneck in tempdb?
- Back to the point where NULL is (in special cases) a valid value, how would you solve the below with a temp table that allows for the following updates:
UPDATE X SET field1=1000, field2='test 1000' WHERE id=1001
UPDATE X SET field1=1002, field3='2012-07-07' WHERE id=1002
UPDATE X SET field4=7, field2=NULL WHERE id=1003
The temp table for this example would then be something like (calling it #X_updates for the example):
CREATE TABLE #X_updates
(
    id     INT,
    field1 INT NULL,
    field2 VARCHAR(20) NULL,
    field3 DATE NULL,
    field4 INT NULL,
    field5 VARCHAR(8) NULL
);
With the 3 update statements replaced by inserts:
INSERT INTO #X_updates (id, field1, field2, field3, field4, field5) VALUES
(1001, 1000, 'test 1000', NULL, NULL, NULL),
(1002, 1002, NULL, '2012-07-07', NULL, NULL),
(1003, NULL, NULL, NULL, 7, NULL);
Now if a field does not need updating, I guess the value in the table would stay NULL.
How do you get around the fact that in some cases (UPDATE X SET field4=7, field2=NULL WHERE id=1003) you DO want to update to NULL, but want to disregard updating the other fields that are NULL? It's similar to the issue you have when using an SP that uses DEFAULT NULL as the default value for optional parameters.
I use NULL in this example, but I guess it could go for any default value (say, if you use a default value of 0, in some cases you might want to update to 0)?
Thanks!
January 14, 2016 at 11:06 am
Now if a field does not need updating, I guess the value in the table would stay NULL.
How do you get around the fact that in some cases (UPDATE X SET field4=7, field2=NULL WHERE id=1003) you DO want to update to NULL, but want to disregard updating the other fields that are NULL?
If you are writing a new record, the default values take over. If you are overwriting a record, it sounds to me like you want to pass a value for every column even if only one of them is being updated, and want to ensure the rest are set to NULL.
There is a lot I can say here. I don't think this is a warehouse situation, and you have an application writing directly to your database, I take it? I also take it there is no history kept in the table when an update occurs? Please have a look at the link I posted earlier to better assist you. Without knowing more, the best guess I can give you is that in your proc you should play it safe and define a value for every column, even if NULL. A rough sketch of what I mean is below.
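Using the #X_updates example table from your post (purely a sketch; this assumes every column value, NULL or not, has been staged and simply overwrites the target row):
UPDATE x
SET    x.field1 = u.field1,
       x.field2 = u.field2,
       x.field3 = u.field3,
       x.field4 = u.field4,
       x.field5 = u.field5
FROM   dbo.X AS x
JOIN   #X_updates AS u ON u.id = x.id;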
----------------------------------------------------
January 14, 2016 at 11:08 am
If this is indeed the case, I am not sure why updating one field means blanking out the others, but as I mentioned I am probably missing a thing or two of important information here.
----------------------------------------------------
January 14, 2016 at 1:02 pm
in this case the application code has processed all changes in memory and needs to (at set times) write all changes to the database.
I'm surprised no one has offered this solution. Create a user-defined table type for each table being updated. Pass those tables as READONLY table-valued parameters into a proc that has MERGE statements for each user-defined type. It'll be screaming fast because there's no per-statement latency between client and server. A rough sketch follows.
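Sketched out (type, procedure and column names are made up for illustration, and only two columns are shown):
CREATE TYPE dbo.X_UpdateType AS TABLE
(
    id     INT NOT NULL PRIMARY KEY,
    field1 INT NULL,
    field2 VARCHAR(20) NULL
);
GO
CREATE PROCEDURE dbo.Merge_X
    @rows dbo.X_UpdateType READONLY
AS
BEGIN
    MERGE dbo.X AS tgt
    USING @rows AS src
        ON tgt.id = src.id
    WHEN MATCHED THEN
        UPDATE SET tgt.field1 = src.field1,
                   tgt.field2 = src.field2
    WHEN NOT MATCHED THEN
        INSERT (id, field1, field2)
        VALUES (src.id, src.field1, src.field2);
END
GO
The client then passes all of the changed rows in a single call, so the whole batch travels over the wire once.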
January 14, 2016 at 9:04 pm
Stijn977 (1/14/2016)
Hi Sergiy, appreciate you joining the discussion. Unfortunately your idea doesn't fit my case because:
- if the update runs, it updates all fields of the table (in the case of NULL, it updates the field with the value already there), which causes considerable overhead that I've measured
- as MadAdmin mentioned, in some special cases NULL is an acceptable value, and this SP would not allow me to update a value to NULL
thank you though!
The code example you posted as "finalised" excludes all NULL variables from the UPDATE altogether:
SET @SQL_Cmd = N'UPDATE dbo.fixtures WITH(REPEATABLEREAD) SET ';
IF(@club_overall_match_stats IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'club_overall_match_stats = @_club_overall_match_stats,';
IF(@type IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'type = @_type,';
IF(@amount IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'amount = @_amount,';
IF(@start_date IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'start_date = @_start_date,';
IF(@end_date IS NOT NULL) SET @SQL_Cmd = @SQL_cmd + N'end_date = @_end_date,';
SET @SQL_Cmd = SUBSTRING(@SQL_Cmd,1,LEN(@SQL_Cmd)-1) + N' WHERE id = @_id';
EXEC sp_executesql @SQL_Cmd,
N'@_id bigint,@_club_overall_match_stats bigint,@_type tinyint,@_amount int,@_start_date datetime2,@_end_date datetime2',
@_id = @id,@_club_overall_match_stats = @club_overall_match_stats,@_type = @type,@_amount = @amount,@_start_date = @start_date,@_end_date = @end_date
I just showed that what you've been doing does not require dynamic SQL.
Special conditions can easily be implemented with a CASE expression instead of ISNULL; see the sketch below.
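For example, the same update written statically, keeping the behaviour of your posted dynamic version (a parameter that is NULL leaves the column as it is); a deliberate "set this column to NULL" would need its own signal, such as a separate flag parameter, which is not shown here:
UPDATE dbo.fixtures WITH (REPEATABLEREAD)
SET club_overall_match_stats = CASE WHEN @club_overall_match_stats IS NULL THEN club_overall_match_stats ELSE @club_overall_match_stats END,
    type       = CASE WHEN @type IS NULL THEN type ELSE @type END,
    amount     = CASE WHEN @amount IS NULL THEN amount ELSE @amount END,
    start_date = CASE WHEN @start_date IS NULL THEN start_date ELSE @start_date END,
    end_date   = CASE WHEN @end_date IS NULL THEN end_date ELSE @end_date END
WHERE id = @id;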
About updating all fields.
SQL Server updates pages.
Every row must fit on a page. Every field of it.
There could be several rows residing in one page, but a single row cannot be spread across several pages.
That's where the limitation of 8060 bytes per row comes from.
So, no matter how many fields you include in a single-row UPDATE (WHERE id = @Id), it will overwrite one page. No more, no less.
I don't see any overhead here.
_____________
Code for TallyGenerator
January 14, 2016 at 9:07 pm
Apart from that.
How do you deal with concurrent updates?
If 2 or more users edit the same record in their independent instances of the C++ application, which version of the record takes precedence?
_____________
Code for TallyGenerator