January 29, 2018 at 9:05 pm
Comments posted to this topic are about the item Lowering the Noise
January 30, 2018 at 6:44 am
I develop and support a web application. To monitor it, I wrote a Windows service that runs on a schedule and checks that it can retrieve the login page from the web server and that it can connect to the database and run a query. For problems related to the web server, it notifies the server hosting team. For problems related to the database, it notifies the DBA team. Once an error has been detected, it switches to a longer polling interval to give the team a chance to respond to and fix the problem. After hours and on weekends and holidays, the polling interval is longer than it is during working hours. If the database connection fails with a "host not found" error, it notifies the server hosting team as well as the DBA team, because the VM or the server hosting the VM may be down.
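The polling and routing rules described above could be sketched roughly as follows. This is only an illustration in Python, not the poster's Windows service: the interval values, the working-hours window, the team names, and the function names are all assumptions, and holidays are left out.

```python
from datetime import datetime, time

# All intervals and team names below are illustrative assumptions.
NORMAL_INTERVAL_MIN = 5        # working hours, no known problem
OFF_HOURS_INTERVAL_MIN = 15    # evenings and weekends (holidays not modelled)
AFTER_ERROR_INTERVAL_MIN = 30  # back off once a problem has been reported


def is_working_hours(now):
    """Monday to Friday, 08:00 to 18:00 (a hypothetical schedule)."""
    return now.weekday() < 5 and time(8, 0) <= now.time() < time(18, 0)


def next_interval_minutes(now, error_active):
    """Pick the delay before the next check."""
    if error_active:
        # Give the notified team time to respond before checking again.
        return AFTER_ERROR_INTERVAL_MIN
    if is_working_hours(now):
        return NORMAL_INTERVAL_MIN
    return OFF_HOURS_INTERVAL_MIN


def teams_to_notify(check, error_text):
    """Route a failed check ('web' or 'database') to the right team(s)."""
    if check == "web":
        return ["server-hosting"]
    if check == "database":
        teams = ["dba"]
        # A name-resolution failure suggests the VM or its host may be down,
        # so the hosting team is told as well.
        if "host not found" in error_text.lower():
            teams.insert(0, "server-hosting")
        return teams
    raise ValueError("unknown check: " + check)
```

Keeping the interval and routing decisions in small pure functions makes them easy to test without touching a real web server or database.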
I've used the same monitoring application as a shell to monitor two other systems that were extremely fragile; both applications have been retired. One of those systems used a PostgreSQL database. The other system was prone to database corruption, so I wrote a query and analysed the result to find out if a non-numeric character got into a column that was supposed to contain integers (the column is a character field). That database was a bizarre design, using character columns to store dates instead of using the date data type. Good riddance to that system!
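A check like the one for non-numeric characters might look something like this. It is a sketch under assumptions: an in-memory SQLite table stands in for the real database, and the table name, column names, and the function name are made up for illustration.

```python
import re
import sqlite3

# Digits only: no sign, spaces, or decimal point.
INTEGER_TEXT = re.compile(r"[0-9]+")


def find_non_integer_rows(conn, table, key_col, value_col):
    """Return (key, value) pairs whose value is not a plain run of digits.

    Table and column names are interpolated into the SQL, so they must
    come from trusted configuration, never from user input.
    """
    sql = "SELECT {k}, {v} FROM {t} ORDER BY {k}".format(
        k=key_col, v=value_col, t=table)
    return [(key, value) for key, value in conn.execute(sql)
            if value is None or not INTEGER_TEXT.fullmatch(value)]


# Demo: a character column that is supposed to hold integers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, qty TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "12"), (2, "1O"), (3, ""), (4, "007")])
bad_rows = find_non_integer_rows(conn, "orders", "id", "qty")
```

Here the letter "O" in row 2 and the empty string in row 3 are flagged, while "12" and "007" pass. Whether NULL or empty values count as corruption is a judgment call; this sketch flags both.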
January 30, 2018 at 8:11 am
Where I work, we use both SentryOne and SolarWinds. Honestly, I could spend more time familiarizing myself with the tools. Getting an alert every time an event falls outside a set of predefined threshold parameters can result in a LOT of alerts, especially when monitoring dozens of transactional servers. When it comes to monitoring, what I've found most useful instead is baseline analysis. Don't tell me when something potentially bad is happening; instead, tell me when something out of the ordinary happens.
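To make the difference between a fixed threshold and a baseline concrete, here is a minimal sketch. It does not reflect how SentryOne or SolarWinds work internally; the z-score approach, the limit of 3 standard deviations, and the sample readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_out_of_ordinary(history, current, z_limit=3.0):
    """True when `current` is more than `z_limit` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > z_limit


# Hypothetical CPU readings (%) for the same hour on previous days.
busy_server = [82, 85, 88, 84, 86, 83, 87]
quiet_server = [10, 12, 11, 9, 10, 11, 12]
```

With these numbers, a fixed "alert above 80%" rule would fire constantly on the busy server and never on the quiet one. The baseline check stays silent when the busy server reads 89 and flags the quiet server when it suddenly reads 45.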
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
January 30, 2018 at 8:45 am
What I would give to see all client environments have just disk monitoring to prevent them from filling up and stopping the show...
January 30, 2018 at 9:00 am
jmlakar 69347 - Tuesday, January 30, 2018 8:45 AM wrote: What I would give to see all client environments have just disk monitoring to prevent them from filling up and stopping the show...
There are fairly simple scripts that can help here. Scan for xx space, send an email / page / net send. The other thing to do is create placeholders that give you time to respond: https://voiceofthedba.com/2014/12/01/creating-placeholder-files/
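As a rough idea of what such a scan could look like, here is a sketch in Python using the standard library's `shutil.disk_usage`. The 10% threshold, the function name, and the sample paths are assumptions, and the email or page step is deliberately left out.

```python
import shutil
from collections import namedtuple


def low_space_paths(paths, min_free_fraction=0.10, usage=shutil.disk_usage):
    """Return (path, fraction_free) for each path below the threshold.

    `usage` is injectable so the logic can be exercised without real disks;
    it must return an object with `total` and `free` attributes.
    """
    low = []
    for path in paths:
        info = usage(path)
        fraction = info.free / info.total
        if fraction < min_free_fraction:
            low.append((path, fraction))
    return low


# Demo with made-up numbers instead of real volumes.
Usage = namedtuple("Usage", "total used free")
fake_disks = {
    "/data": Usage(total=100, used=95, free=5),
    "/logs": Usage(total=100, used=50, free=50),
}
demo_low = low_space_paths(["/data", "/logs"], usage=fake_disks.__getitem__)
```

In real use you would call `low_space_paths` with actual mount points or drive letters and hand the result to whatever notification mechanism you already have.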
January 30, 2018 at 9:54 am
Steve Jones - SSC Editor - Tuesday, January 30, 2018 9:00 AM wrote: There are fairly simple scripts that can help here. Scan for xx space, send an email / page / net send. The other thing to do is create placeholders that give you time to respond: https://voiceofthedba.com/2014/12/01/creating-placeholder-files/
Thanks Steve! I also love the SysInternals suite. Just wishing these same issues which have been solved for decades were not so common.
January 30, 2018 at 10:12 am
Generally, what I want is a system that will tell me when there's an issue. What I do not want is someone sticking a "This job succeeded" notification onto the end of every job in existence. I have a rule in Outlook that sends anything in that category directly to a folder called daily junk. On the other hand, I also have rules that send alerts from production systems, or other alerts worthy of immediate attention, to a specific high-priority folder, so I know that if I see something there, it needs to be looked at.
January 30, 2018 at 4:51 pm
jmlakar 69347 - Tuesday, January 30, 2018 9:54 AM wrote: Thanks Steve! I also love the SysInternals suite. Just wishing these same issues which have been solved for decades were not so common.
I've typically held onto another volume that I can expand my primary into if needed. I also had a script, which I should have saved, that would mount the other volume and symlink back to the original. I can't tell you how many times it saved me to be able to just resize on the fly to whatever the admin would allow. This of course was all because I was afforded a secondary volume that was the same size as my primary volume.
January 30, 2018 at 4:53 pm
Steve Jones - SSC Editor - Tuesday, January 30, 2018 9:00 AM wrote: There are fairly simple scripts that can help here. Scan for xx space, send an email / page / net send. The other thing to do is create placeholders that give you time to respond: https://voiceofthedba.com/2014/12/01/creating-placeholder-files/
I love this. I use this tactic frequently in Linux environments to reserve space. Also wrote custom login scripts such that if there was low disk space, restarting the server would clear out a file or two to get some space back to work on the system. Not a frequent issue any longer though.
January 31, 2018 at 7:48 am
bpwilso - Tuesday, January 30, 2018 4:53 PM wrote: I love this. I use this tactic frequently in Linux environments to reserve space. Also wrote custom login scripts such that if there was low disk space, restarting the server would clear out a file or two to get some space back to work on the system. Not a frequent issue any longer though.
To reserve disk space for your database files to grow, why not just initialize or re-size the files to the appropriate size?
Creating placeholder files might be confusing to another DBA who is trying to resolve a disk space issue, but certainly make the file name descriptive in a way that someone else knows they can safely delete it if needed.
January 31, 2018 at 8:09 am
jmlakar 69347 - Tuesday, January 30, 2018 8:45 AM wrote: What I would give to see all client environments have just disk monitoring to prevent them from filling up and stopping the show...
This can actually be done through SQL Server. Scheduled as a nightly job, it can keep track of disk space to predict when a disk might run out, sometimes weeks or even months in advance, and drop a report into your email every morning that identifies, by system, the disks that are starting to run short or are already short on space. It can also provide a "Removable Media Finder" to locate perhaps-misplaced thumb drives, installation CDs, etc.
The question is, are you, as a DBA, allowed to use xp_CmdShell? And, to be sure, I use this method on my production boxes and they don't have to be SQL Servers to be checked. It'll check any server that the SQL Server login can see in the domain.
--Jeff Moden
Change is inevitable... Change for the better is not.
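The prediction step in a report like the one Jeff describes could be approximated with a simple straight-line fit over recent free-space samples. The sketch below is in Python rather than T-SQL and is not his implementation: the least-squares approach, the function name, and the sample data are assumptions.

```python
def days_until_full(samples):
    """Estimate how many days remain until free space reaches zero.

    `samples` is a list of (day_number, free_gb) pairs, oldest first.
    Returns None when there is too little data or when free space is
    flat or growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_day = sum(day for day, _ in samples) / n
    mean_free = sum(free for _, free in samples) / n
    sxx = sum((day - mean_day) ** 2 for day, _ in samples)
    if sxx == 0:
        return None  # all samples taken on the same day
    # Least-squares slope: change in free GB per day.
    slope = sum((day - mean_day) * (free - mean_free)
                for day, free in samples) / sxx
    if slope >= 0:
        return None
    intercept = mean_free - slope * mean_day
    zero_day = -intercept / slope
    last_day = samples[-1][0]
    return max(0.0, zero_day - last_day)


# Hypothetical nightly readings: losing 2 GB per day from 100 GB.
nightly = [(0, 100), (1, 98), (2, 96), (3, 94)]
estimate = days_until_full(nightly)
```

A straight line is a crude model; growth that comes in bursts (month-end loads, index rebuilds) will need more history or a more careful method before the estimate is trustworthy.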
January 31, 2018 at 9:15 am
Eric M Russell - Wednesday, January 31, 2018 7:48 AM wrote: To reserve disk space for your database files to grow, why not just initialize or re-size the files to the appropriate size? Creating placeholder files might be confusing to another DBA who is trying to resolve a disk space issue, but certainly make the file name descriptive in a way that someone else knows they can safely delete it if needed.
The reason is that unexpected things happen. If you've set db files to a size, and something happens too fast, or isn't caught by monitoring, you're down. With placeholders, you buy time to fix things in an emergency. It's a quick fix that gets a system going while you work. Otherwise you have to solve the problem, find space, etc. before the system is running.
I do this on my laptop as well. At times I won't realize it's getting full and being able to pull 10GB out of thin air is nice.
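For anyone who has not seen the placeholder idea in practice, it can be as small as the sketch below. The function names, the file name, and the demo size are assumptions, and on volumes with compression or deduplication a file of zeros may not actually reserve the space, so check how your storage behaves.

```python
import os
import tempfile


def create_placeholder(path, size_bytes, chunk=1024 * 1024):
    """Create a file of `size_bytes` zero bytes at `path`.

    The bytes are written out rather than skipped with seek(), to avoid
    creating a sparse file that occupies no real space.
    """
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            count = min(chunk, remaining)
            handle.write(b"\0" * count)
            remaining -= count
    return path


def release_placeholder(path):
    """Delete the placeholder in an emergency; return the bytes freed."""
    size = os.path.getsize(path)
    os.remove(path)
    return size


# Demo in a temporary directory with a small file (about 3 MB).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "SAFE_TO_DELETE_IN_EMERGENCY.placeholder")
create_placeholder(demo_path, 3 * 1024 * 1024 + 5)
freed = release_placeholder(demo_path)
```

The descriptive file name addresses the concern raised earlier in the thread: another DBA who finds the file during a disk-space emergency can tell at a glance that deleting it is safe.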
January 31, 2018 at 9:23 am
Jeff Moden - Wednesday, January 31, 2018 8:09 AM wrote: This can actually be done through SQL Server, scheduled as a nightly job, keep track of the disk space to predict when a disk might run out of space sometimes weeks or even months in the future, and drop a report into your email every morning that identifies disks by system that are starting to or are short on disk space as well as providing a "Removable Media Finder" to locate perhaps-misplaced thumb drives, installation CDs, etc. The question is, are you, as a DBA, allowed to use xp_CmdShell? And, to be sure, I use this method on my production boxes and they don't have to be SQL Servers to be checked. It'll check any server that the SQL Server login can see in the domain.
Hey Jeff - right and thanks. What I meant in my original message was how surprised I am to see so many shops still have trouble with the basics of monitoring disk space so their SQL Server doesn't come to a halt.
January 31, 2018 at 1:40 pm
One of the best ways that I've found to lower the noise is the one that many shops avoid for some reason, and that is to identify the code that's causing the problem. There are occasional exceptions, but it's usually the code that's causing the problem.
--Jeff Moden
February 1, 2018 at 7:42 am
Identify jobs and other processes that are obsolete and tombstone them. That's especially common in large enterprise IT shops with scores of production database servers and hundreds or thousands of jobs. I hate it when some complicated legacy process that I didn't even know existed suddenly starts raising errors or shows up at the top of a block chain. However, I love it when I get confirmation from the business that it's actually obsolete and can be disabled. Not only does it cut down on alerts, but it reduces wasted IOPS and CPU as well.