Memory Usage data capture

  • Can we capture memory usage continuously?

  • Are you asking about per-query or overall memory? And what do you mean by "continuously" - do you want real-time, constant monitoring of overall memory usage? If so, the easiest way to do that is with WMI or perfmon. If it is per query, that's a hard thing to capture "continuously", as queries tend to start and stop; once they complete, some data remains in memory, but that specific session may no longer be active. If you mean overall, Task Manager will provide that information to you. Once SQL gets memory, it is not going to give it up unless something forces it to.

    In general, the queries you run against SQL Server are all for point-in-time information. If you need to continuously monitor something, you would need to use 3rd party tools. SSMS does have an "Activity Monitor", but that thing is a performance hog, and if you watch it long term it can get sluggish and sometimes even stop refreshing the data.
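
    If a point-in-time snapshot is enough, something like this will do it. This is just a minimal sketch against the documented DMVs sys.dm_os_process_memory and sys.dm_os_sys_memory; it shows how much memory the SQL Server process is holding versus what the OS reports overall:

    -- Point-in-time snapshot: SQL Server process memory vs. overall server memory
    SELECT pm.physical_memory_in_use_kb / 1024    AS sql_physical_memory_mb,
           pm.memory_utilization_percentage,
           sm.total_physical_memory_kb / 1024     AS server_total_memory_mb,
           sm.available_physical_memory_kb / 1024 AS server_available_memory_mb,
           sm.system_memory_state_desc
    FROM sys.dm_os_process_memory AS pm
    CROSS JOIN sys.dm_os_sys_memory AS sm;

    You would still need something external (an Agent job, perfmon, or a monitoring tool) to run it on a schedule if you want anything close to "continuous".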

    The question I'd have for you though is why? Why do you need this data to be captured continuously? What will that information help you with?

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • On one of our servers, memory usage is now 98%, which is above the error threshold. We keep getting this message frequently.

  • Have a look at these: https://www.webfx.com/blog/web-design/10-free-server-network-monitoring-tools-that-kick-ass/

    My company is now using https://www.zabbix.com/ for some of its servers.

  • It sounds like you have a tool that monitors memory usage on the server. If you want a per-process view, perfmon or Task Manager will monitor your memory usage, and both are pre-installed on the machine. Something to note though - running any tool ON the server will use up some resources (CPU, memory), so I would recommend running perfmon on a secondary machine to capture metrics from the server remotely. It's a free tool that comes with Windows, and as long as the proper ports are open and the proper permissions are applied to the user account, you can pull all sorts of metrics out of it.

    Since it sounds like the alert is at the OS level, not the application level (do correct me if I am wrong on this), it MAY or MAY NOT be SQL that is using all the memory. If it is SQL, I would recommend looking at the max memory value you have assigned, and if it is still the default (2,147,483,647 MB, roughly 2 PB), you may want to decrease it to something reasonable.
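
    If it does turn out to be SQL and max memory is still at the default, lowering it is a standard sp_configure change. A rough sketch - the 460800 value below is only a placeholder, size it for your own server:

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    -- 460800 MB (450 GB) is an example only; leave headroom for the OS and other apps
    EXEC sp_configure 'max server memory (MB)', 460800;
    RECONFIGURE;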

    To add to the above though, having "98%" used memory may not be a bad thing. If you have 256 GB of memory in the system, 2% is still about 5 GB of unused memory. If you only have 64 GB in the system, that's roughly 1.3 GB free, and I'd be concerned the OS may not have enough. My preference when setting up alerts is to set them at a GB level, not a percentage, similar to how I set up autogrow. I like my OS to have 4 GB free so I can start a new session (i.e. RDP in) if needed (which is rare) and for the OS to use, and then the rest is allocated to SQL and other applications.

    When I am provisioning a new system, I get the bare OS running, then check how much memory is in use. It varies depending on what IT installs after the OS, so that gives me a baseline of how much the OS needs. Then I add 4 GB so that I have enough for an RDP session, and the remainder goes to SQL Server. I do work with IT on it as well, though, as I may have missed some scheduled process such as antivirus, and I don't want their tools to crash or cause high CPU usage due to limited memory.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • I would like to know the memory consumption distribution (SQL & non-SQL) at the time the alert occurs. If it is SQL, then what within SQL is causing it? Something like this...

  • My opinion, you are going to need a tool to do that. Offhand, I am not sure what tool can handle it, but anything that supports WMI should be able to query that data. I mean, you COULD script something up in PowerShell, but that is beyond my skills. And I am apparently showing my age - the WMI cmdlets were superseded by the CIM cmdlets back in PowerShell 3.0. So, if you want to go that route, I'd do some research on PowerShell and CIM.

    Alternately, you could run perfmon with a 1,000-second window (the largest it allows) and then hope that when the alert comes in, it is within the window for you to review what happened. Perfmon is not the most configurable tool, but it does allow you to monitor memory utilization and, if memory serves, SQL-specific memory counters as well. It can just be a pain to figure out exactly which metrics you need to watch. The nice thing is that once it is set up how you like, you can save the config so it is easy to reload in the future.

    That being said, SQL Server, by design, will request memory whenever it needs it up to the max server memory. If your alert tool needs there to be a certain percentage free, then your options (in my opinion) are to make sure that SQL is leaving enough room for the OS and other processes to keep everything below the threshold, or to adjust the alerts. Once SQL claims the memory, it won't give it back to the system unless you force it to do so. So when the alert system sees spikes, if you aren't doing anything in SQL to free the memory back up, and the problem resolves itself, then it isn't very likely that SQL is triggering the alerts.

    My opinion though - unused CPU and unused memory are a sign of wasted resources. If the CPU and memory are not running at or around 90% the majority of the time, your server is underutilized. I know my server CPU is underutilized - we rarely break 80%, let alone 90%, for CPU, but I have the memory tuned pretty well, so we are often at or around 95% used. The exception is our SSIS and SSRS boxes, as those two need wiggle room for SSIS and SSRS along with the SQL instances (what a silly licensing model...).

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • mtz676 wrote:

    I would like to know the memory consumption distribution (SQL & non-SQL) at the time the alert occurs. If it is SQL, then what within SQL is causing it? Something like this...

    Hmmm... how much memory does the system have and what have you set the "Max Memory" setting in SQL Server to?
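
    (For reference, both numbers can be pulled with a quick query against sys.dm_os_sys_memory and sys.configurations - a minimal sketch:)

    SELECT (SELECT total_physical_memory_kb / 1024 FROM sys.dm_os_sys_memory) AS server_memory_mb,
           (SELECT CONVERT(int, value_in_use)
            FROM sys.configurations
            WHERE name = 'max server memory (MB)')                            AS max_server_memory_mb;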

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • SQL - 471 GB

    System - 512 GB

    We also sometimes get the message "I/O requests taking longer than 15 seconds". Does this indicate hardware issues?

  • I think in the later builds of SQL Server 2019 ... we are able to capture CPU usage before it gets recycled from the DMV every 4 hours. Can we do something similar for memory? Thanks

  • I would not use some of these monitoring tools. They slow things down.

    Perfmon gives you good performance graphs for KPIs. Please check out Idera.

     

    DBASupport

  • If your server's total memory is 512 GB, I would give 1/4 to the OS and apps; the rest goes to SQL.

    If you have VMs, I would set the min server memory to 32 GB to give them a boost.

    Note: As always, please check these settings in a test platform to make sure they work well.

     

    DBASupport

  • Cyrusbaratt, I am a bit confused why you say that monitoring tools slow things down and then suggest Idera, which is a 3rd party monitoring tool (plus a ton of other SQL tools... I do like their stuff)! I run RedGate SQL Monitor and it is a good tool for monitoring and reporting. Maybe not as feature-rich as some other tools out there, but you can't beat the price point (I do not work for RedGate, I just like their tools).

    I am also confused how "perfmon" gives you "KPIs"? It gives you real-time data and a small window of historical data. It can be used by KPI tools (such as Splunk), but I've never heard of anyone using it as a KPI tool. The window for looking at the data is too small and requires someone to actively monitor it. I am not sure how you would use it as a KPI tool...

    As for giving the OS 1/4 of the memory, how did you get that number? Do you KNOW for sure that the OS and other apps need 1/4 of the memory? And if it is in a VM, I would want the min memory of the VM to be at LEAST the same as the min memory for SQL Server so I don't run into a case where the VM is memory starved due to SQL requesting more (up to its max) and not being able to get it. If the VM host doesn't have enough memory to give to the VM when it is requested, you are going to run into problems with SQL paging to disk, which is an incredibly slow operation.

    Now, as for what I'd set the min memory for SQL Server to, it really depends on the SQL instance. If it is frequently used and frequently near the max memory value, then my min memory would be near the max, as I want SQL to be able to run full tilt without slowdown caused by not having enough memory. If the SQL instance has "spikes" where memory usage is near max but it usually runs at about 25% of max, then my min memory is set to 25% of max memory. On initial setup, I give it a min memory of 1 GB and a max memory of whatever I have available based on some calculations I do on the system, then let it run for a week and monitor it. Then I tweak min memory, leave max memory alone, and let it run for a month to see if the first week's trend continues. Once I have a baseline for how the system runs, I tweak the max memory. Then, time permitting, I revisit it quarterly to see if the values still make sense. For example, if I decommission an instance running on Server A, I may have freed up 50 GB of memory. Do I have an instance that could benefit from an extra 50 GB, or should I save that memory for an upcoming project that needs a new SQL instance?
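
    For the "let it run and monitor it" part, a simple way to see how much memory SQL has actually grabbed versus how much it wants is to compare the Total and Target Server Memory counters. A sketch against sys.dm_os_performance_counters:

    -- Total = memory currently committed by SQL Server; Target = what it would like to have
    SELECT RTRIM(counter_name) AS counter_name,
           cntr_value / 1024   AS value_mb
    FROM sys.dm_os_performance_counters
    WHERE counter_name IN ('Total Server Memory (KB)', 'Target Server Memory (KB)');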

    I know some experts (such as Brent Ozar) recommend 4 GB for the OS OR 10% of the total memory. I have found that with my systems, if I ONLY have the OS and SQL Server installed, that 4 GB is good, but I like wiggle room in case I need to RDP in, so I double that to 8 GB for the OS and me. Then I work with IT to determine how much their tools need and whatever is left over I give to my SQL instances (yes, instances... plural... not the best setup, but was a requirement due to versions). I personally have never heard the recommendation to leave 25% of the memory for the OS. How much do you recommend to leave for an RDP session and other tools (like backup and antivirus)?

    My approach would be to remotely look at the memory utilization when things start spiking and alerts are going out. Find out what processes are running and how much memory they are using. Then diagnose and debug what is causing the spikes. Since they are not constant, SQL is likely not the culprit. If the spikes are predictable (i.e. they happen on a schedule, even if SOMETIMES the expected spike doesn't happen), then it is likely a scheduled process (antivirus, chkdsk, Windows updates, etc.) that is triggering them.

    As for capturing things from DMVs, that is only going to show you SQL-specific data or server-level data (such as total memory on the server), not individual apps on the system, if I remember right. So you won't see spikes caused by non-SQL processes. BUT if you really want to capture this in SQL, you could set up a scheduled job that runs every 5 minutes (or whatever interval makes sense to you) and calls out to xp_cmdshell to run tasklist, storing the results in a table, a file on disk, or whatever makes sense in your scenario. Also, if SQL is configured with a max memory of 471 GB, then the most it will ever use is 471 GB; and if it ever does get up to 471 GB, it will keep using 471 GB until the services are restarted (not recommended), you flush the memory manually (not recommended), or, in the case of a VM, ballooning kicks in and the total memory on the server drops so SQL has to release some.
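
    If you did want to go the xp_cmdshell route, here is a rough sketch - the table name is made up for illustration, and xp_cmdshell has to be enabled on the instance, which is its own security discussion:

    -- Hypothetical capture table; run the INSERT from an Agent job on whatever schedule makes sense
    CREATE TABLE dbo.ProcessSnapshot
    (
        capture_time datetime2      NOT NULL DEFAULT SYSDATETIME(),
        output_line  nvarchar(4000) NULL
    );

    INSERT INTO dbo.ProcessSnapshot (output_line)
    EXEC xp_cmdshell 'tasklist /FO CSV';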

    Now, that being said, there are DMVs for memory - sys.dm_os_memory_allocations and sys.dm_os_sys_memory come to mind, but they MAY not tell you what you need to know - which is what is using the memory when things spike. If you want to see all of the system objects related to memory (as the 2 I listed may not suit your needs), you can run this:

    -- List every system view (including the DMVs) with "memory" in the name
    SELECT name
    FROM sys.system_objects
    WHERE name LIKE '%memory%'
      AND type = 'V';

    and pick whichever DMV suits your needs. If you aren't sure, just select from them and see what data they return. They are just views, so you won't change or break any data by selecting from them.
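
    Of the memory DMVs, sys.dm_os_memory_clerks is probably the most useful one for the "what within SQL is using it" question. A sketch (pages_kb exists in SQL 2012 and later):

    -- Top memory consumers inside the instance
    SELECT TOP (10)
           type,
           name,
           SUM(pages_kb) / 1024 AS memory_mb
    FROM sys.dm_os_memory_clerks
    GROUP BY type, name
    ORDER BY memory_mb DESC;

    On most servers the buffer pool clerk (MEMORYCLERK_SQLBUFFERPOOL) will be at the top, which is normal and expected.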

    I do agree with cyrusbaratt that you should always test on a test system, but configuring memory is not something that needs "testing" if done correctly. You have real-world data and can make informed decisions on the changes. Plus, a test system generally has fewer resources than production AND less traffic hitting it, so if you do memory analytics on a test system you could conclude that your SQL instance can get by with 10 GB of memory when your production instance needs 471 GB or more. You have the data (or your server admin has the data and can provide it to you) - review it and make informed decisions on it. It's like setting the data file size - there is no need for a TEST prior to going live with a data file size increase. As long as you have free disk space, increasing the data file will succeed and won't cause issues. There can be blocking and performance hits during the grow, but if IFI (instant file initialization) is on, grows happen very fast, at the risk of the space not being zeroed out before it is allocated.

    Now, as for I/O requests taking longer than 15 seconds, that to me indicates that too much is happening on the disk. The most common reason I've seen for it is the antivirus hitting the SQL data file, but it isn't the only cause. It can happen if there are too many other processes writing to the same disk - for example, if the OS, the SQL data files, and the SQL log files are all on the same physical disk, you can get slowdowns. Also, if you are running out of memory and SQL needs to page to disk, you can get that error too, if I remember right. Basically, your query is requesting data from disk and something is preventing it from getting that data. It could be a defrag running, or a chkdsk, or a backup process, or an AV scan, or anything that puts your request for data into a queue to wait for disk access.

    If possible, having the OS, the page file, the SQL data files, and the SQL log files all on separate SSDs will give you the best performance, but this is not always possible. If not, then it is best to test the configurations at your disposal to find what is "best" for your situation. Having all four on separate HDDs is still better than having any of them on a shared disk, but if I had to use shared disks, the OS and page file would share one disk and the SQL files would share the other.
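
    If you want to see which files those slow I/O requests are hitting, per-file stall numbers are available from sys.dm_io_virtual_file_stats. A sketch - keep in mind the numbers are cumulative since the last restart, so look at trends rather than absolutes:

    -- Cumulative I/O stalls per database file since the last SQL Server restart
    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.physical_name,
           vfs.num_of_reads,
           vfs.num_of_writes,
           vfs.io_stall_read_ms,
           vfs.io_stall_write_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id = vfs.file_id
    ORDER BY vfs.io_stall_read_ms + vfs.io_stall_write_ms DESC;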

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • A few comments here:

    Some monitoring tools are good, and some are not so good, so you need to benchmark these tools before using them in a live platform.

    SQL Server min/max memory may be adjusted for the best performance.

    Perfmon tells you, for example, whether the SQL buffer cache hit ratio is at a good level. Perhaps KPI was not the proper wording here.
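
    If you would rather pull that number from SQL itself instead of perfmon, the ratio can be computed from sys.dm_os_performance_counters. A sketch - the raw counter has to be divided by its base counter:

    -- Buffer cache hit ratio as a percentage
    SELECT 100.0 * a.cntr_value / NULLIF(b.cntr_value, 0) AS buffer_cache_hit_ratio_pct
    FROM sys.dm_os_performance_counters AS a
    JOIN sys.dm_os_performance_counters AS b
        ON b.object_name = a.object_name
    WHERE a.counter_name = 'Buffer cache hit ratio'
      AND b.counter_name = 'Buffer cache hit ratio base';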

    Note: As always, please check these settings in a test platform to make sure they work well. Proper benchmarking is everything.

    DBASupport
