Monitoring is crucial in modern IT environments. There are many different infrastructure and application layers to monitor. This makes effective monitoring more complicated. While the general idea of monitoring is the same as in the past, the actual implementation will be slightly different, depending on your infrastructure. In modern environments, in addition to physical and virtual servers and serverless infrastructure, you see broad deployment of containers. In many cases, you’ll see Docker and Kubernetes. Docker is one of the most popular container runtimes, and Kubernetes is a widely used container orchestrator.
There are other container runtimes and orchestrators, but you’ll have to solve container monitoring challenges no matter what you choose here. In this post, you’ll learn what these challenges are and how to monitor containers in modern IT environments effectively.
Before we dive into the specifics of container monitoring, let’s answer one simple question. Why can’t we monitor containers the same way we monitor virtual machines? After all, some people call containers “lightweight VMs.” Well, only in theory are containers similar to virtual machines. In practice, they work in a very different way. Of course, if you try to apply your typical monitoring solution to containers, it will probably work, but it will be far from perfect.
If you had only a few containers, then you probably wouldn’t need to worry too much about specific container monitoring tips. Monitoring containers like VMs wouldn’t be much different; after all, it’s about making sure you have enough capacity to run your applications. However, containers are implemented together with microservices architecture in most cases. And when containers are implemented with microservices, your monitoring approach needs to be different.
Why is this? To answer that, you need to understand what microservices are. Traditionally, a monolith application is deployed on a server as one single piece, one block of code with, for the most part, the entire application. Therefore, monitoring is straightforward. For example, to check if the application is running or how much CPU and RAM it’s using, you only need to check metrics for one binary.
With microservices, however, your application splits into many small pieces or modules. Each piece is packaged and could be configured to run on one container. Therefore, to monitor your application’s health, you no longer have one binary to check but many containers. Almost every aspect of monitoring becomes distributed. Do you see the issue yet? If you monitored your containers like virtual machines, you’d lose the big picture. You wouldn’t be able to answer simple questions. Is my application up and running? How much memory does my application need?
When it comes to container monitoring, you need to shift your focus from monitoring individual containers to monitoring the whole container cluster or at least a group of containers. It’s still beneficial to keep an eye on metrics from all individual containers. At the end of the day, each microservice acts as a mini-application. Monitoring individual containers can help you find the bottlenecks and weak spots. But individual container monitoring should be a secondary focus.
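To make the shift from containers to groups concrete, here is a minimal sketch in Python. It assumes you already collect one sample per container and that each sample carries the name of the microservice it belongs to. The `ContainerSample` shape and the `summarize_by_service` helper are hypothetical, made up for illustration, and not part of any particular monitoring tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ContainerSample:
    """One metrics sample for a single container (hypothetical shape)."""
    service: str        # microservice the container belongs to
    container_id: str
    running: bool
    memory_mb: float


def summarize_by_service(samples):
    """Roll per-container samples up into one summary per service."""
    summary = defaultdict(lambda: {"running": 0, "total": 0, "memory_mb": 0.0})
    for sample in samples:
        entry = summary[sample.service]
        entry["total"] += 1
        if sample.running:
            entry["running"] += 1
            entry["memory_mb"] += sample.memory_mb
    return dict(summary)


# Illustrative data only.
samples = [
    ContainerSample("checkout", "c1", True, 120.0),
    ContainerSample("checkout", "c2", True, 130.0),
    ContainerSample("checkout", "c3", False, 0.0),
    ContainerSample("search", "c4", True, 300.0),
]

for service, stats in summarize_by_service(samples).items():
    print(service, stats)
```

With a roll-up like this, questions such as "Is my application up?" and "How much memory does it need?" are answered per service, while the per-container samples remain available when you need to drill down.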
Managing a few containers is doable, but in a typical environment, you’ll see dozens if not hundreds of containers. Managing this amount isn’t something you want to do manually. Unless you’re doing some simple local tests, you’ll probably use container orchestration tools like Kubernetes. On one side, this creates more challenges for monitoring, but on the other, it makes some aspects easier. But before we dive into this, you need to understand what a container orchestrator does.
Tools like Kubernetes manage containers for you: they create and destroy containers based on your input, restart failed containers, and provide networking, service discovery, storage functionality, and more. In the context of monitoring, part of an orchestrator’s job is to distribute the load evenly across all the nodes in the cluster and kill the containers that overuse resources or misbehave in some way. Traditional monitoring would be very inefficient here.
When a virtual machine restarts, you probably want to be alerted. Container restarts, however, are normal and expected. Resource usage is also something to redefine. When a virtual machine uses, for example, 90% of CPU and 100% of RAM, you know exactly how much this is because these percentage values are directly related to the resources assigned to the machine.
With containers, it’s not so simple. By default, containers can use all the resources from the underlying host machine. But when you have more than one container running on a machine, they share resources dynamically between them. A simple percentage value won’t tell you much. When an orchestrator manages containers, they’ll probably have resource limits applied. In such cases, the percentage value will be referring to a limit, so you’ll need to know the limit to understand the percentage value. Many alerts created for virtual machines can be misleading when applied to containers.
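A small Python sketch shows why the baseline matters. The function name `memory_usage_percent` and the numbers are assumptions chosen for illustration; the point is only that the same absolute usage produces very different percentages depending on whether it is measured against the host or against a limit.

```python
from typing import Optional, Tuple


def memory_usage_percent(
    used_mb: float, host_mb: float, limit_mb: Optional[float] = None
) -> Tuple[float, str]:
    """Return (percent, baseline) for a container's memory usage.

    If a limit is set, the percentage is relative to that limit;
    otherwise it is relative to the whole host.
    """
    if limit_mb:
        return round(100.0 * used_mb / limit_mb, 1), "limit"
    return round(100.0 * used_mb / host_mb, 1), "host"


# The same 450 MB reads very differently depending on the baseline.
print(memory_usage_percent(450, host_mb=16384))                # relative to host
print(memory_usage_percent(450, host_mb=16384, limit_mb=512))  # relative to limit
```

Against a 16 GB host, 450 MB is under 3%. Against a 512 MB limit, it is close to 88%, which is why an alert threshold copied from a virtual machine rarely means the same thing for a container.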
Traditional monitoring assumes you have one or two infrastructure layers. You either monitor physical servers directly or physical servers and virtual machines running on them. And while you can run containers directly on physical servers, in most cases, infrastructure that includes containers will have many more layers.
Cloud, container orchestrator, containers, application: this is where traditional monitoring tools struggle. Sure, most of them can monitor all the layers, but separately. It’s up to you to understand the relations between the different metrics and, for example, figure out whether degraded performance comes from the application itself, container configuration, orchestrator overload, or underlying machine performance.
Modern monitoring tools are doing the opposite. They monitor all the different layers of your infrastructure as one. They understand the relations between them and can automatically correlate data between them.
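As a rough illustration of what correlating layers means, the Python sketch below joins three hypothetical snapshots (nodes, containers, services) through the identifiers they share. The data structures, field names, and the `explain_service` helper are all assumptions for this example; real tools do this with far richer metadata.

```python
# Hypothetical snapshots of three layers, linked by shared identifiers.
nodes = {
    "node-a": {"cpu_percent": 35.0},
    "node-b": {"cpu_percent": 97.0},
}
containers = {
    "c1": {"service": "checkout", "node": "node-a", "throttled": False},
    "c2": {"service": "checkout", "node": "node-b", "throttled": True},
}
services = {"checkout": {"p95_latency_ms": 1800.0}}


def explain_service(service):
    """Collect every layer's view of one service into a single record."""
    related = []
    for container_id, info in sorted(containers.items()):
        if info["service"] == service:
            related.append({
                "container": container_id,
                "throttled": info["throttled"],
                "node": info["node"],
                "node_cpu_percent": nodes[info["node"]]["cpu_percent"],
            })
    return {"service": service, **services[service], "containers": related}


print(explain_service("checkout"))
```

In this made-up data, the slow service has one throttled container sitting on a nearly saturated node, a connection that is easy to miss when each layer lives in a separate dashboard.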
Another good practice for monitoring modern, container-based IT environments is to treat logs as part of monitoring, not a separate thing. Leveraging log management tools as part of your monitoring completes the picture and provides insights you can’t get from only looking at metrics or traces. It comes back to the correlation abilities of modern monitoring tools. Your monitoring solution can determine that, for example, before your customers started experiencing a lot of delays or errors, specific error messages were logged. Or that the number of HTTP 500 codes is related to specific transactions you can see in the logs. You can read more here about the benefits of combining logging with monitoring.
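Here is a minimal Python sketch of that kind of correlation, assuming the metrics side has already flagged an incident time and that logs are available as `(timestamp, level, message)` tuples. The `errors_before` helper, the five-minute window, and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def errors_before(logs, incident_time, window=timedelta(minutes=5)):
    """Count ERROR messages logged in the window just before an incident."""
    start = incident_time - window
    return Counter(
        message
        for timestamp, level, message in logs
        if level == "ERROR" and start <= timestamp < incident_time
    )


# Hypothetical log entries: (timestamp, level, message)
logs = [
    (datetime(2024, 1, 1, 11, 50, 0), "ERROR", "cache miss storm"),
    (datetime(2024, 1, 1, 12, 0, 10), "ERROR", "db connection refused"),
    (datetime(2024, 1, 1, 12, 3, 0), "ERROR", "db connection refused"),
    (datetime(2024, 1, 1, 12, 4, 0), "INFO", "request served"),
    (datetime(2024, 1, 1, 12, 4, 30), "ERROR", "payment timeout"),
]

# Assume the metrics side flagged a latency spike at 12:05.
incident = datetime(2024, 1, 1, 12, 5, 0)
print(errors_before(logs, incident).most_common())
```

Ranking the messages that appeared right before the spike gives you a short list of suspects instead of a wall of log lines.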
When it comes to containers, logging is a challenge on its own. Your whole application is distributed into many containers, and each of these containers produces its own logs. You need to somehow get all these different logs from many places and make sense of them. In a containerized world, you need a centralized log solution. You need to ship logs from all containers into one place and let a log analytics tool do the rest of the job. You can learn more about container logging techniques here.
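The core idea of centralizing logs can be sketched in a few lines of Python. This example assumes each container’s log stream is already sorted by time and uses ISO 8601 timestamps in the same time zone, so plain string comparison orders them correctly. The `merge_logs` helper and the container names are made up for illustration; a real log shipper handles transport, parsing, and retention as well.

```python
import heapq


def merge_logs(streams):
    """Merge per-container log streams into one time-ordered timeline.

    `streams` maps a container name to a list of (timestamp, message)
    pairs, each list already sorted by timestamp.
    """
    tagged = (
        [(timestamp, name, message) for timestamp, message in entries]
        for name, entries in streams.items()
    )
    return list(heapq.merge(*tagged))


# Hypothetical log streams from two containers.
streams = {
    "web-1": [
        ("2024-01-01T12:00:01Z", "GET /cart"),
        ("2024-01-01T12:00:04Z", "500 on /pay"),
    ],
    "pay-1": [
        ("2024-01-01T12:00:03Z", "timeout calling bank"),
    ],
}

for timestamp, container, message in merge_logs(streams):
    print(timestamp, container, message)
```

Once the lines sit in one timeline and are tagged with their source container, the timeout in one container and the HTTP 500 in another show up next to each other.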
Finally, visualizing how your containers talk to each other is important to retain a general overview of your infrastructure. With traditional, monolithic applications, it’s relatively easy to understand the network flow. Requests come in from the front end to the back end, and the back end connects to the database and possibly to some other services to process requests and send responses back to the client. With microservices, networking is very complicated. Network traffic flows constantly between all the microservices, and it isn’t easy to distinguish what kind of traffic it is because most microservices talk to each other over HTTP REST APIs. Modern monitoring tools can create a visual representation of all this traffic. For us human beings, it’s the most efficient way to understand the relationships between microservices.
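Behind such a visualization sits a simple data structure: a graph of who calls whom. The Python sketch below builds one from observed requests, assuming you can record each call as a `(source, destination)` pair. The `build_service_map` helper and the service names are hypothetical.

```python
from collections import Counter, defaultdict


def build_service_map(requests):
    """Turn observed (source, destination) calls into a weighted graph."""
    edges = Counter(requests)
    graph = defaultdict(dict)
    for (source, destination), count in edges.items():
        graph[source][destination] = count
    return dict(graph)


# Hypothetical observed calls between microservices.
requests = [
    ("frontend", "checkout"),
    ("frontend", "search"),
    ("frontend", "checkout"),
    ("checkout", "payments"),
    ("checkout", "inventory"),
]

for source, destinations in build_service_map(requests).items():
    for destination, count in destinations.items():
        print(f"{source} -> {destination} ({count} calls)")
```

A monitoring tool would render these edges as a diagram, but even the text form makes the dependencies and their relative traffic visible.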
Though some say containers are “lightweight virtual machines,” container monitoring differs from virtual machine monitoring. Don’t be fooled by old-school solutions that can monitor containers but do so without any context from the container orchestration or cloud layer. Modern monitoring tools understand container workflows much better and are much more efficient in providing you useful information without bombarding you with counterproductive alerts.
Tools like SolarWinds® AppOptics™ or SolarWinds Loggly® have many built-in integrations. Creating comprehensive monitoring solutions for complex, multilayer, microservices, and container-based infrastructures doesn’t need to be complicated and time-consuming. You can sign up here for a free trial and see for yourself how easy it is.
This post was written by Dawid Ziolkowski. Dawid has ten years of experience, starting as a Network/System Engineer, then working in DevOps, and most recently as a Cloud-Native Engineer. He’s worked for an IT outsourcing company, a research institute, a telco, a hosting company, and a consultancy company, so he’s gathered a lot of knowledge from different perspectives. Nowadays, he’s helping companies move to the cloud or redesign their infrastructure for a more Cloud-Native approach.