Why Cloud “Server” Tagging Strategies Are Important



by Christian Meléndez


A key part of Enterprise IT Intelligence is understanding you Cloud Resources. And where better to start than using the “elegant” concept of Tagging. A concept used across the various cloud and infrastructure management platforms. Follows is an article from Christian Meléndez.



A tag is a key/value label to identify a resource in a cloud environment. You might want to know why a server exists and what purpose it serves. AWS has made sure that a simple concept like a tag can become a necessary tool for all of your resources. Tags are free, and they can do more than just give a name to a server. Having a solid tagging strategy will allow you to support the management of your IT delivery life-cycle and your IT solutions.

My first encounter with the cloud happened when working in a previous position. An initial project was to reduce the cloud spending. At the time, this company didn’t even have a naming convention for servers, so trying to identify which server was still in use was challenging. Who created the server? Why was it created? Is it still needed? Consumption metrics weren’t a good indicator of a server’s usefulness because it may have been created in advance for an upcoming project. Or maybe its only consumption was the load balancer doing health checks.

But is identifying resources the only purpose of tags? No. So why else would you use a tag, you might wonder? Well, that’s a good question, and it also happens to be the topic of this post! Let’s find out in more depth.

To Identify Cost

When I was trying to reduce costs in the company I mentioned earlier, the first thing we needed to do was to tag the resources appropriately. We started with the most expensive resources by setting a “Cost Center” tag. We were working with a partner to get cost reports with daily updates. AWS didn’t have cost reports at that time, so we needed to find a way to confirm the bill was going down.

After we determined that having a cost center tag was useful, we started using different types of tags. We used tags like the username, environment, project, and system; these allowed us to know which departments were spending more and to decide if the department’s costs were proportional to its revenue. Every time we detected a change in costs, it was easy for us to explain why it happened: a new project was coming, someone ran load tests, or an environment was terminated.

A good tagging strategy will help you to identify who’s wasting resources and why. Even though we were able to tag all existing resources, there were times that a resource lacked a proper tag. Make sure you solve this problem from the root to have a proactive strategy, not a reactive one.

To Automate Operability

Another reason tagging is important is that it helps you automate your operability. Automation in a cloud environment is never up for debate. Without automation, you’re losing essential benefits of the cloud.

The moment I knew tags were notable for more than just allocating cost was when I started using Ansible. Ansible is a configuration management tool that we used to configure servers after every deployment. The way Ansible knows which servers to configure is by having an inventory with the list of IP addresses.

But wait—having a list of IP addresses in an environment where IPs come and go isn’t sustainable! Here’s where tags come into play again. Working with dynamic environments is one of the best value propositions from Ansible. You only need to create the inventory based on the tags you want to use to identify a server. Then Ansible will build the environment dynamically by querying the AWS API to get the list of IPs.

Another great use case for tags is when you need to run clean-up tasks like deleting old volume snapshots. You could also make use of tags to have instances only during business hours. Or you can create an expiration date tag for the resource.

Tags change the way you manage the infrastructure by introducing automation for in your IT delivery life-cycle.

To Set Proper Monitoring and Alerting Policies

Tagging is also going to help you pay attention to the things you should.

When it comes to monitoring and alerting, one of the things you want is to reduce noise. When there’s too much noise, you start ignoring the alerts. An alert’s purpose is to trigger an action from someone or something. If it doesn’t do that, why bother creating the alarm? You could collect all the metrics you think you’ll need from your resources, but you need a way to identify events. One way to organize and reduce noise is by using tags.

A fundamental principle in continuous delivery is to work with production-like environments, and this includes feedback. You might be surprised by the number of companies I’ve seen that decide to only monitor production resources. While this will save you some money,  you won’t receive feedback until much later in the delivery pipeline. Another reason behind less feedback is that IT Ops folks don’t want to deal with alerts from non-production environments

In my experience, what works is to have different retention and alert policies for metrics data per environment. You don’t lose visibility, and by using tags, you can set policies like, “Don’t alert the on-call engineer for the non-production environment.” And this policy could also be accompanied by another one, like, “For the non-production environments, alert the developers while they’re in the office.”

Using tags for monitoring and alerting will help you to reduce the noise of monitoring and alerting.

To Discover Wasters

We talked a little bit about using tags to identify where you’re spending the money, but you could also spot where you’re overprovisioning resources in the cloud. When you’re moving to the cloud, it’s essential you know what you currently have. You need to discover and inventory your resources so that you know which type of and how many instances you’ll need. You don’t want to be creating support tickets to increase limits while you run the migration.

But it’s pretty common that we initially overprovision resources because we find that one CPU in the cloud is not the same as one CPU on-prem. Or we find that the cloud provider only offers you pre-defined configurations for CPU and memory—excluding Google.

By using tags, you’re able to know for which workloads you need to have enhanced networking or memory optimized instances. When you choose optimized instances for your needs, you might end up needing fewer servers.

To Apply Restriction Controls

After you’ve seen the benefits of using tags, you might want to make sure you don’t have untagged resources anymore. Well, you can enforce tags and their value every time a new resource is created. It’s always better to receive an error than to have to find out later who created the resource and why. Enforcing tags is not a well-supported feature in all cloud providers, meaning that you might not receive descriptive errors. For that reason, consider having a checklist document, templates, or pre-built catalogs.

Another use case around restriction is that for some resources you can grant permissions based on tags. For example, you could use a tag to identify which project the resource belongs to. Then, you create a policy permission to restrict access based on tags. When a user has these types of policies attached, it’s going to be impossible to have access to resources other than what the project requires. A developer for the consumer site won’t have access to backend services.

Using tags for restrictions is an easy way to create generic policy permissions.

Tags Are Your Infrastructure Metadata

Tags are a straightforward concept: a key/value pair that you use to identify a resource. But as I described in today’s post, tags can become your best ally when managing your cloud resources. Tags don’t get coupled with the applications you’re running in a server; you can always change their value to any existing infrastructure. You could still change your mind about your tag strategy, but that shouldn’t mean you need to reprovision or restart services.

In the past, a good naming convention was important. You knew a lot about a resource just by reading its name. It’s still essential to have a good naming convention, but sometimes just a name isn’t enough; you need to have more context and to reduce manual labor. Tags are a perfect complement when you want to leverage work on automation.

Don’t underestimate tags because of the simplicity of the concept; they’re a powerful tool.

Christian Meléndez

This post was written by Christian Meléndez. A regular poster for Enov8, Christian is a technologist that started as a software developer and has more recently become a cloud architect focused on implementing continuous delivery pipelines with applications in several flavors, including .NET, Node.js, and Java, often using Docker containers.

Relevant Articles

Top 5 Cloud Metrics

21 MARCH, 2019 by Mark Henke Preamble These days, deploying our services to the cloud just makes sense. Deploying to the cloud means you’re letting someone else handle low-level infrastructure costs, which gives us incredible flexibility. And with such flexibility...

Bimodal Explained – A Release Management View

15 March, 2019 by Vlad Georgescu Bimodal Explained *From a Release Management Perspective “The Hare and the Tortoise” is Aesop’s most famous fable. In it, we find two characters: one fast (the hare), and one slow (the tortoise). They run a race against each other,...

Self-Healing Applications

11 March, 2019 by Sylvia Froncza An IT and Test Environment Perspective Traditionally, test environments have been difficult to manage. For one, data exists in unpredictable or unknown states. Additionally, various applications and services contain unknown versions or...

Serverless Explained

    04   March, 2019 by Eric Goebelbecker Preamble “IT & Test Environments are a changing!” The days off Physical Servers has transitioned to Virtual, Which is transitioning to Containers, and now we have "Serverless". It is all getting very complicated … Or is it?  ...

IT Environments – Top 5 Deployment Metrics

03 February, 2019 By Christian Meléndez. Preamble: Following on from our previous blog on Enterprise Release Management bench-marking, Christian talks about the “sister disciple” of deployment management and more specifically the top 5 metrics one could use to better...

IT & Test Environment Management Anti Patterns #2 – Containers

29 JANUARY, 2019 By Sylvia Fronczak.   Anti-patterns often arise when developers find creative or outside-the-box uses for a technology - or when they find workarounds for limitations that were specifically put in place by the creators of the technology to prevent...