Scaling Up Monitoring

Monitoring databases is important when those systems are in production. Operations departments know that catching issues early, being proactive, and having data to troubleshoot issues make their job easier. Not having these things makes their job much more stressful.

Most of us work with data in some way, and the availability of that data is important. Certainly, security, integrity, and performance matter as well, but availability is key. Many organizations don't have any monitoring systems set up. Instead, they troubleshoot problems when someone files a ticket or calls. I'm amazed at this, though I know building and managing a monitoring system is hard and purchasing third-party products can be outside of your budget. Still, having something in place makes everyone's job easier.

If you decide to build a system, then you can do it in many ways. I saw a description of how Amazon built a monitoring system for their Prime Video service. This is a more complex system than many of us deal with for databases, but I did find it interesting that they chose a distributed architecture that used multiple components. That design didn't scale, so they started to move from small functions, almost like microservices, to a somewhat more monolithic structure.

I am not saying that microservices or functions or serverless are bad choices. They meet certain needs, and they can work very well. Azure SQL Database Serverless can work well in some situations. However, I do think that this was a case of engineers trying to be too clever and making assumptions about production loads from PoC-type experimentation.

I would say that far too many software engineers think that their solution will scale without actually testing it. Too often their view is that if it works here, it will work there, but the history of software has shown that working on my machine doesn't mean working on another. That's why we use Continuous Integration: for independent validation and verification. This is also a problem when databases are involved, as the volume of data used for development and testing doesn't do a good enough job of predicting how the system works under load. We need better test data management, which is becoming a whole new category of software practices and tools.
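As a rough illustration of why small development data sets can mislead, here is a minimal sketch that times the same unindexed lookup at several table sizes. It uses Python's built-in sqlite3 module purely as a stand-in; the orders table, the customer_id column, and the time_query helper are hypothetical names made up for this example, and the row counts are arbitrary.

```python
import sqlite3
import time


def time_query(row_count):
    """Build an in-memory table with row_count rows and time an unindexed lookup."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
    # Hypothetical test data: 1,000 distinct customers spread evenly over the rows.
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        ((i, i % 1000) for i in range(row_count)),
    )
    start = time.perf_counter()
    matches = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (42,)
    ).fetchone()[0]
    elapsed = time.perf_counter() - start
    conn.close()
    return matches, elapsed


if __name__ == "__main__":
    # Run the identical query against progressively larger data volumes.
    for rows in (1_000, 10_000, 100_000):
        matches, elapsed = time_query(rows)
        print(f"{rows:>7} rows: {matches} matches in {elapsed * 1000:.2f} ms")
```

The timings will vary by machine, and a real test would use your actual engine and something close to production volumes. The point is only that the query should be measured at more than one data size before anyone assumes it will scale.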

We should ensure we include good instrumentation in our software for monitoring purposes. We should also start monitoring and evaluating how our system will perform in test and development environments, as that's the idea of shift-left. Lastly, I think monitoring in production is important, but I wouldn't build another system. I admit I'm biased, as I work for a company selling monitoring software. However, I also think the build vs. buy debate doesn't make sense here unless your staff has a lot of spare time to spend maintaining a homegrown system. I'd like to think most of them have better things to do.
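To make "good instrumentation" a little more concrete, here is a minimal sketch of timing an operation and keeping the results where a monitoring tool could pick them up. Everything in it is an assumption for illustration: the durations store, the timed context manager, the summarize helper, and the "load_orders" label are hypothetical, and a real application would send these numbers to whatever monitoring system it uses rather than hold them in memory.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-memory metric store: operation name -> list of durations in seconds.
durations = defaultdict(list)


@contextmanager
def timed(operation):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        durations[operation].append(time.perf_counter() - start)


def summarize(operation):
    """Return call count, mean, and worst duration for one operation."""
    samples = durations[operation]
    if not samples:
        return {"count": 0, "mean": 0.0, "max": 0.0}
    return {
        "count": len(samples),
        "mean": sum(samples) / len(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    with timed("load_orders"):
        sum(range(100_000))  # stand-in for a real database call
    print(summarize("load_orders"))
```

Because the same wrapper runs in development, test, and production, the numbers it collects in the earlier environments can be compared with what shows up later under real load.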

The Complexity of Metrics

by Steve Jones

SQLServerCentral

We need to monitor our servers, but individual metrics have more complexity than just setting simple limits for their readings.

2023-02-06 (first published: 2023-01-30)

Monitoring Azure SQL Databases

by Johan Åhlén

SQLServerCentral

There are many reasons you should monitor your databases, including avoiding performance problems or running out of disk space. Ideally, you want a scalable monitoring solution where you can monitor all your SQL databases in one single place. This article will describe two options that are available: Azure SQL Analytics and Azure SQL Insights. Both […]