Top 5 Container Metrics
1. Memory Consumption
Memory is the most critical metric for Docker containers because a container that runs out of memory stops working. Java applications are a common example: they are often terminated when memory isn’t configured correctly. When a container hits its memory limit, the kernel’s out-of-memory (OOM) killer terminates a process inside the container, which usually stops the container. This protects the host, so a runaway container can’t affect other containers running on the same host.
What’s good about containers is that they can be started again in a matter of seconds. Even so, the application will have some downtime, even if only for a short period. If you’re using Docker Compose, you can configure a restart policy so the container starts again when it stops. If you’re using Kubernetes, you can declare how many containers you want running at all times, and Kubernetes will continuously reconcile the actual state against that desired state by recreating containers.
Memory matters not only for avoiding unexpected application failures, but also for deciding when to scale out. The memory metric provides the threshold for autoscaling rules. A rule might say, “If memory consumption is higher than 80 percent, then start two more containers.” Scaling out based on memory consumption helps you avoid unresponsive applications. Knowing the memory consumption is good, but you also need to take preventive action, and container orchestrators like Kubernetes enable you to do that.
2. CPU Usage
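As a rough illustration of the memory rule quoted in the Memory Consumption section above (“If memory consumption is higher than 80 percent, then start two more containers”), here is a minimal Python sketch. The function name, the `max_replicas` cap, and the sample numbers are assumptions made for illustration; they are not part of Docker or Kubernetes, and real autoscalers apply their own, more elaborate algorithms.

```python
def desired_replicas(current: int, memory_percent: float,
                     threshold: float = 80.0, step: int = 2,
                     max_replicas: int = 10) -> int:
    """Apply the rule: above `threshold` percent memory, add `step` containers.

    `max_replicas` is an assumed safety cap so the rule cannot grow forever.
    The result never drops below `current`; this sketch only scales out.
    """
    if memory_percent > threshold:
        return max(current, min(current + step, max_replicas))
    return current


print(desired_replicas(current=3, memory_percent=85.0))  # 5
print(desired_replicas(current=3, memory_percent=60.0))  # 3
```

The rule is deliberately one-directional. Scaling back in would need its own threshold and some delay so that the container count does not oscillate.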
Another critical metric for a container is CPU usage. By default, Docker containers have unrestricted access to the host’s CPU resources. To avoid affecting other containers on the host, you can limit how much CPU Docker allows a container to use. For example, a container with a CPU limit of 0.5 can use at most 50 percent of one CPU core on the host. Docker will not stop a container that reaches its CPU limit, but the container is throttled, so the application’s performance suffers.
With the CPU metric, you can quickly tell whether a container needs more CPU resources. Containers help you optimize resources, but only if you know where to tune, for example when you’ve allocated far more CPU than the container actually uses.
As with memory, the CPU metric works as a threshold for autoscaling rules. In Kubernetes or any other container orchestrator, you can define rules that add more containers when a container approaches its CPU limit. Application performance then stays steady because the orchestrator provides as many containers as are needed.
3. Disk Operations
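The CPU limit example from the CPU Usage section above (a limit of 0.5 meaning at most half of one core) can be turned into a small check that tells you whether a container is close to its limit or has far more CPU than it uses. This is only a sketch: it assumes the measured CPU percentage is expressed relative to one core (100 means one full core), and the function names and the 90/20 percent cut-offs are hypothetical choices, not Docker defaults.

```python
def percent_of_cpu_limit(cpu_percent: float, cpu_limit: float) -> float:
    """Return CPU usage as a percentage of the configured limit.

    cpu_percent: measured usage, where 100.0 means one full core (assumption).
    cpu_limit:   configured limit in cores, e.g. 0.5 for half a core.
    """
    if cpu_limit <= 0:
        raise ValueError("cpu_limit must be positive")
    return cpu_percent / cpu_limit


def classify_cpu(cpu_percent: float, cpu_limit: float,
                 high: float = 90.0, low: float = 20.0) -> str:
    """Label a container using assumed cut-offs `high` and `low`."""
    used = percent_of_cpu_limit(cpu_percent, cpu_limit)
    if used >= high:
        return "near limit"        # likely throttled; consider more CPU
    if used < low:
        return "over-provisioned"  # limit is much larger than actual usage
    return "ok"


print(classify_cpu(cpu_percent=45.0, cpu_limit=0.5))  # near limit
print(classify_cpu(cpu_percent=5.0, cpu_limit=2.0))   # over-provisioned
```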
Disk metrics such as I/O operations and disk usage can affect more than a single container’s performance, because the disk is a resource that both the containers and the host depend on. You can get I/O metrics with the “docker stats” command, and disk usage metrics with “docker system df”; add the “-v” flag for more detail.
Controlling disk usage on the host is crucial to avoid problems when pulling new container images. Images can be huge, so you might need a clean-up process that removes old images. Logs written to disk are another reason disk usage grows, so log rotation is a practice you still need with containers.
Containers are stateless by default, but some applications, like databases, are stateful. For stateful applications, you’ll use volumes. A volume maps a folder or drive inside the container to a path or drive on the host. You can create volumes on the host before assigning them to a container. The disk thus becomes a shared resource that can affect the performance of both the containers and the host.
4. Network Traffic
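To illustrate the point from the Disk Operations section above about keeping host disk usage under control before pulling new images, here is a minimal Python sketch that reads the usage of the filesystem behind a path and decides whether a clean-up is due. It uses only the standard library’s `shutil.disk_usage`; the function names, the path, and the 80 percent threshold are assumptions for illustration, and the sketch does not talk to Docker at all.

```python
import shutil


def used_fraction(path: str = ".") -> float:
    """Return the used share (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def should_clean_up(fraction_used: float, threshold: float = 0.8) -> bool:
    """True when usage has reached the (assumed) clean-up threshold."""
    return fraction_used >= threshold


fraction = used_fraction(".")
print(f"disk used: {fraction:.0%}, clean up: {should_clean_up(fraction)}")
```

On a real host you would point `path` at the directory where the container runtime stores images and logs, since that filesystem is the one that fills up.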
As the previous metrics show, the metrics that matter on a virtual or physical server are still crucial for containers, so network traffic belongs on any list of top container metrics. The “docker stats” command shows how much data a container has sent and received.
Knowing how much traffic a container handles can indicate whether its host is appropriate. Maybe the container needs to run on a host with better network performance, or perhaps you need to scale out the containers.
I’ve seen cases where an application’s memory and CPU looked fine but its performance was still poor. The networking metrics then revealed some interesting information. Sometimes network traffic was low because of a misconfiguration in the load balancer. Other times, the application in the container wasn’t able to use enough network bandwidth. My point is that there were problems (caused, for example, by the host type or a bug in the application) that I couldn’t spot with memory or CPU metrics but could spot with network metrics.
5. Number of Containers Running
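If you want to track the sent and received figures mentioned in the Network Traffic section above over time, you first need them as numbers. The sketch below parses a value of the form `1.2MB / 340kB` into byte counts. It assumes the column is formatted as “received / sent” with a unit suffix, which is how “docker stats” typically prints its network column; the sample string, function names, and the unit table are illustrative assumptions, so check them against the output of your Docker version.

```python
import re
from typing import Tuple

# Assumed unit table: decimal and binary suffixes, mapped to bytes.
_UNITS = {
    "B": 1,
    "kB": 10**3, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40,
}
_SIZE_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*$")


def parse_size(text: str) -> int:
    """Convert a string such as '340kB' into a whole number of bytes."""
    match = _SIZE_RE.match(text)
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {unit!r}")
    return round(float(number) * _UNITS[unit])


def parse_net_io(column: str) -> Tuple[int, int]:
    """Split 'received / sent' into (received_bytes, sent_bytes)."""
    received, sent = column.split("/")
    return parse_size(received), parse_size(sent)


print(parse_net_io("1.2MB / 340kB"))  # (1200000, 340000)
```

Comparing two samples taken a known interval apart gives an approximate throughput, which is often more useful than the cumulative totals.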
To run containers at scale, you need a container orchestrator like Kubernetes or Swarm. There will be times when the orchestrator can’t schedule more containers because no more resources are available. To fix this, you can use the number of running containers as a signal to scale out the hosts.
I noticed the importance of this metric when a friend used it to demonstrate that certain containers were having problems. Because of a misconfiguration, a container kept running only for a short period before stopping. The orchestrator was spinning up a replacement each time, but the application was unstable.
You can get this metric by running the “docker container ls” command or by asking the orchestrator how many containers are running for an application. In Kubernetes, you can list the running pods or check the state of a deployment object. The number of containers running is not a metric of a single container per se, but sometimes you need to zoom out to get a different perspective on a problem.
Don’t Rely Only Upon Top Metrics
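As a sketch of the container-count metric described in the Number of Containers Running section above, the Python below counts container IDs from the output of “docker container ls --quiet” (one ID per line) and compares the result with the number you expect. The helper names, the sample IDs, and the desired count of 3 are hypothetical; `running_container_count` is defined but not called here because it needs a host with the Docker CLI installed.

```python
import subprocess


def count_container_ids(ls_quiet_output: str) -> int:
    """Count non-empty lines, i.e. one container ID per line."""
    return sum(1 for line in ls_quiet_output.splitlines() if line.strip())


def running_container_count() -> int:
    """Ask the local Docker CLI for running containers (requires Docker)."""
    result = subprocess.run(
        ["docker", "container", "ls", "--quiet"],
        capture_output=True, text=True, check=True,
    )
    return count_container_ids(result.stdout)


def shortfall(desired: int, running: int) -> int:
    """How many containers are missing compared with the desired count."""
    return max(desired - running, 0)


# Made-up sample output standing in for a real call to the CLI.
sample = "3f2a9c1b7d4e\n9b8c7d6e5f4a\n"
running = count_container_ids(sample)
print(running, shortfall(desired=3, running=running))  # 2 1
```

A shortfall that keeps reappearing across samples is the kind of signal described in the anecdote: containers that start, run briefly, and stop again.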
I didn’t go into much detail on how to collect these metrics. The reason is that there are now many tools for that job, both paid services and free ones. Some examples of free tools are cAdvisor, InfluxDB, Grafana, and Prometheus. These tools give you other metrics too, not just for the containers but also for the host. So don’t rely only on the top metrics I listed here. There will be times when these metrics look good while the host is running out of IOPS or swapping heavily, and that is when other metrics come in handy.
Knowing the top metrics for containers is just the beginning of improving lead time. For example, you can also use these metrics when drawing a value stream map.