March 9, 2015 at 1:36 pm
We have 10 SQL Servers in production. Data is sharded horizontally across these servers using custom code. (The system was architected almost 10 years ago and was built on SQL Server Standard Edition, so we couldn't use built-in partitioning.) On all these servers we have a lot of monitoring to alert us on waits, long-running queries, replication delays, backup delays, long-running jobs, and things like that. We rarely see hiccups on these servers.
Now I am trying to figure out how much free capacity each server has, how long it has been free, and what resources are available. The big idea is: can I go from 10 servers to 8?
I have heard that HammerDB can be used to stress-test a server. Do you know of other ways to monitor how much idle time a server has, and whether it has been consistently idle over a period of time?
This is almost the reverse of the usual performance tuning problem.
Thanks for any help. I understand the other things we need to consider when reducing the number of servers (like modifying the apps that use them and so on).
Thanks,
Nachi
March 9, 2015 at 7:02 pm
NachiM (3/9/2015)
We have 10 SQL Servers in production. Data is sharded horizontally across these servers using custom code. (The system was architected almost 10 years ago and was built on SQL Server Standard Edition, so we couldn't use built-in partitioning.) On all these servers we have a lot of monitoring to alert us on waits, long-running queries, replication delays, backup delays, long-running jobs, and things like that. We rarely see hiccups on these servers.
Not quite right. Partitioning is absolutely possible in Standard Edition, and I almost prefer the Standard Edition method (Partitioned Views) to the Enterprise Edition method (Partitioned Tables).
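Just to sketch what I mean (the table, view, and column names below are made-up examples, not anything from your system), a local partitioned view in Standard Edition looks something like this:

-- Member tables, one per "partition"; the CHECK constraint on the
-- partitioning column is what lets the optimizer skip member tables
-- that can't contain the requested rows.
CREATE TABLE dbo.Survey2014
(
    SurveyID   INT      NOT NULL,
    SurveyYear SMALLINT NOT NULL CHECK (SurveyYear = 2014),
    Payload    XML      NULL,
    CONSTRAINT PK_Survey2014 PRIMARY KEY (SurveyID, SurveyYear)
);

CREATE TABLE dbo.Survey2015
(
    SurveyID   INT      NOT NULL,
    SurveyYear SMALLINT NOT NULL CHECK (SurveyYear = 2015),
    Payload    XML      NULL,
    CONSTRAINT PK_Survey2015 PRIMARY KEY (SurveyID, SurveyYear)
);
GO

-- The partitioned view is just a UNION ALL of the member tables.
CREATE VIEW dbo.Survey
AS
SELECT SurveyID, SurveyYear, Payload FROM dbo.Survey2014
UNION ALL
SELECT SurveyID, SurveyYear, Payload FROM dbo.Survey2015;
GO

-- A query that filters on the partitioning column only touches the one
-- member table that can hold those rows.
SELECT SurveyID, Payload
FROM   dbo.Survey
WHERE  SurveyYear = 2015;

And if the member tables live on different filegroups, you also pick up some of the maintenance benefits without needing Enterprise Edition.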
Now I am trying to figure out how much free capacity each server has, how long it has been free, and what resources are available. The big idea is: can I go from 10 servers to 8?
I have heard that HammerDB can be used to stress-test a server. Do you know of other ways to monitor how much idle time a server has, and whether it has been consistently idle over a period of time?
This is almost the reverse of the usual performance tuning problem.
Thanks for any help. I understand the other things we need to consider when reducing the number of servers (like modifying the apps that use them and so on).
To be honest, if it's the 64-bit version of SQL Server, then (IMHO) you should be able to combine all those resources into just one server fairly easily, even if it is "just" Standard Edition, and without the headaches of all the custom workarounds and monitoring that you had to do 10 years ago.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 10, 2015 at 10:11 am
Thanks Jeff for the reply.
I agree I could use Partitioned Views. Here is some more background on why I can't do it right now.
First, I want to clear something up. I just inherited this system (a week ago) when I started at a new place, so I'm trying to get things done gradually.
Now for the history: the company used to have data sharded across 50 servers (around the year 2000), and someone managed to bring it down to 10 servers in 2008. All data is stored only as key/value pairs (it's a survey system) or as large chunks of XML, so most tables are long and never more than a few columns wide. But tons of data keeps coming in day in and day out; 60% of the data is never updated once written, and the other 40% is updated frequently.
All business rules, data sharding rules, replication strategy, and pretty much everything else are handled only in the API layer. During the original build, SQL Server clustering was not used, so data was sent from the API layer to two separate production environments (10 more servers) to handle database failures :). The database is used purely as an I/O system for OLTP purposes. On top of that, the data has lately been growing by 15GB-20GB every day (although we archive data frequently as well). Right now the entire database (excluding reporting) is at 890GB.
On top of that, the data is replicated to 4 different areas (Finance, canned reporting, ad hoc reporting, legal needs) plus 2 DR environments (one log shipped and one using backup/restore of TLogs). Tons of processes run against these to flatten the data for easier querying, report processing, and third-party feeds.
Modifying all of this is going to be a huge effort (it may even involve using non-SQL technology). So to start with, I first want to reduce by two servers; then, if that goes fine, repeat it to reduce by two more, and slowly two more after that (more like putting different data on different servers than sharding the data horizontally).
To do that, I want to understand the stress level on the servers as we go along. This will also help me plan for getting different hardware.
Hope that explains it a little better.
Thanks,
Nachu
March 11, 2015 at 8:00 pm
One of the things I have running is a PerfMon collection that takes a sample every 15 minutes. I use PerfMon because I'm still stuck in the world of 2005. It does work well, though, and it's a pretty easy import for analysis.
In 2008, there are some "new" system views that pertain to the OS and the hardware. You could actually sample those once every 1 to 15 minutes (or whatever you feel would be good) and then plot them out later.
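For example (just a rough sketch; the logging table name is made up, and you'd pick whichever counters matter to you), something like this in a SQL Agent job every few minutes gives you a history you can graph later:

-- One-time setup: a place to keep the samples.
CREATE TABLE dbo.ServerSample
(
    SampleTime   DATETIME      NOT NULL DEFAULT GETDATE(),
    CounterName  NVARCHAR(128) NOT NULL,
    CounterValue BIGINT        NOT NULL
);
GO

-- Scheduled step: grab a handful of counters from
-- sys.dm_os_performance_counters. Note that the "/sec" counters are
-- cumulative since the last restart, so you have to diff consecutive
-- samples to get the rate for each interval.
INSERT INTO dbo.ServerSample (CounterName, CounterValue)
SELECT RTRIM(counter_name), cntr_value
FROM   sys.dm_os_performance_counters
WHERE  counter_name IN (N'Batch Requests/sec',
                        N'Page life expectancy',
                        N'User Connections');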
--Jeff Moden
Change is inevitable... Change for the better is not.
March 13, 2015 at 3:10 am
Jeff Moden (3/11/2015)
One of the things I have running is a PerfMon collection that takes a sample every 15 minutes. I use PerfMon because I'm still stuck in the world of 2005. It does work well, though, and it's a pretty easy import for analysis. In 2008, there are some "new" system views that pertain to the OS and the hardware. You could actually sample those once every 1 to 15 minutes (or whatever you feel would be good) and then plot them out later.
Ditto on PerfMon for monitoring... Xperf is another one.
--------------------------------------------------
...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀
March 17, 2015 at 2:17 pm
Thanks Jeff and SQL Surfer.
I am looking at the DMVs in SQL 2008 to figure out the waits instead of using PerfMon, and I am trying to sample them over time as well.
Those DMVs tell you which resource was most utilized and what the bottleneck was. I am trying to see how to do the reverse of that: if I/O was the bottleneck from 10-11pm, then when was everything free, and how much headroom is available?
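What I'm thinking of trying (just a sketch; the snapshot table is my own made-up name) is snapshotting sys.dm_os_wait_stats on a schedule and diffing consecutive snapshots, so a quiet hour shows up as an interval where almost no wait time was accumulated:

-- One-time setup: periodic snapshots of the cumulative wait stats.
CREATE TABLE dbo.WaitStatsSnapshot
(
    SampleTime          DATETIME     NOT NULL DEFAULT GETDATE(),
    wait_type           NVARCHAR(60) NOT NULL,
    wait_time_ms        BIGINT       NOT NULL,
    waiting_tasks_count BIGINT       NOT NULL
);
GO

-- Scheduled step (SQL Agent job, e.g. every 15 minutes). For a real
-- analysis you'd also filter out the benign background waits
-- (LAZYWRITER_SLEEP, SLEEP_TASK, and friends).
INSERT INTO dbo.WaitStatsSnapshot (wait_type, wait_time_ms, waiting_tasks_count)
SELECT wait_type, wait_time_ms, waiting_tasks_count
FROM   sys.dm_os_wait_stats
WHERE  wait_time_ms > 0;

-- Analysis: the difference between consecutive snapshots is how much
-- waiting actually happened in each interval; intervals with tiny
-- deltas are the quiet windows. (The counters reset on a restart, so
-- negative deltas mark a restart and should be discarded.)
WITH Numbered AS
(
    SELECT  wait_type, SampleTime, wait_time_ms,
            ROW_NUMBER() OVER (PARTITION BY wait_type ORDER BY SampleTime) AS rn
    FROM    dbo.WaitStatsSnapshot
)
SELECT   cur.SampleTime,
         SUM(cur.wait_time_ms - prev.wait_time_ms) AS total_wait_ms_in_interval
FROM     Numbered AS cur
JOIN     Numbered AS prev
           ON  prev.wait_type = cur.wait_type
           AND prev.rn = cur.rn - 1
GROUP BY cur.SampleTime
ORDER BY cur.SampleTime;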
I guess I need to figure out how to do that analysis. If I find or learn anything else, I will post it back here.
Thanks,
N