Why Cloud “Server” Tagging Strategies Are Important

15 October, 2018 by Christian Meléndez

Preamble

A key part of Enterprise IT Intelligence is understanding your cloud resources. And where better to start than with the “elegant” concept of tagging, a concept used across the various cloud and infrastructure management platforms? What follows is an article from Christian Meléndez.

Introduction

A tag is a key/value label that identifies a resource in a cloud environment. You might want to know why a server exists and what purpose it serves. AWS has made sure that a simple concept like a tag can become a necessary tool for all of your resources. Tags are free, and they can do more than just give a name to a server. Having a solid tagging strategy will allow you to support the management of your IT delivery life-cycle and your IT solutions.

My first encounter with the cloud happened when working in a previous position. An initial project was to reduce cloud spending. At the time, this company didn’t even have a naming convention for servers, so identifying which servers were still in use was challenging. Who created the server? Why was it created? Is it still needed? Consumption metrics weren’t a good indicator of a server’s usefulness because it may have been created in advance for an upcoming project. Or maybe its only consumption was the load balancer doing health checks.

But is identifying resources the only purpose of tags? No. So why else would you use a tag? That’s the topic of this post. Let’s look at it in more depth.

To Identify Cost

When I was trying to reduce costs in the company I mentioned earlier, the first thing we needed to do was to tag the resources appropriately. We started with the most expensive resources by setting a “Cost Center” tag. We were working with a partner to get cost reports with daily updates. AWS didn’t have cost reports at that time, so we needed to find a way to confirm the bill was going down.

After we determined that having a cost center tag was useful, we started using different types of tags. We used tags like the username, environment, project, and system; these allowed us to know which departments were spending more and to decide if a department’s costs were proportional to its revenue. Every time we detected a change in costs, it was easy for us to explain why it happened: a new project was coming, someone ran load tests, or an environment was terminated.

A good tagging strategy will help you to identify who’s wasting resources and why. Even though we were able to tag all existing resources, there were times when a new resource lacked a proper tag. Solve this problem at the root, by making sure resources are tagged when they are created, so that your strategy is proactive and not reactive.
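To make the cost breakdown described above concrete, here is a minimal Python sketch that sums daily cost per value of a tag and puts untagged resources in their own bucket. The record layout, the `CostCenter` key spelling, and the figures are all hypothetical; in practice the records would come from whatever billing export or cost report you have.

```python
from collections import defaultdict

# Label used for resources that lack the tag (an assumption for this sketch).
UNTAGGED = "(untagged)"


def cost_by_tag(resources, tag_key):
    """Sum daily cost per value of tag_key; resources without the tag go to UNTAGGED."""
    totals = defaultdict(float)
    for resource in resources:
        value = resource.get("tags", {}).get(tag_key, UNTAGGED)
        totals[value] += resource["daily_cost"]
    return dict(totals)


# Hypothetical records standing in for a real billing export.
resources = [
    {"id": "i-001", "daily_cost": 12.5, "tags": {"CostCenter": "marketing", "Environment": "prod"}},
    {"id": "i-002", "daily_cost": 4.0, "tags": {"CostCenter": "marketing", "Environment": "dev"}},
    {"id": "i-003", "daily_cost": 30.0, "tags": {"CostCenter": "platform", "Environment": "prod"}},
    {"id": "i-004", "daily_cost": 7.25, "tags": {}},
]

for center, total in sorted(cost_by_tag(resources, "CostCenter").items()):
    print(f"{center}: {total:.2f}")
```

Calling the same function with `"Environment"` or a project tag gives the other breakdowns, and the untagged bucket shows how much spend still has no owner.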

To Automate Operability

Another reason tagging is important is that it helps you automate your operability. Automation in a cloud environment is never up for debate. Without automation, you’re losing essential benefits of the cloud.

The moment I knew tags were useful for more than just allocating cost was when I started using Ansible. Ansible is a configuration management tool that we used to configure servers after every deployment. Ansible knows which servers to configure from an inventory that lists their IP addresses. But a static list of IP addresses isn’t sustainable in an environment where IPs come and go. Here’s where tags come into play again. Working with dynamic environments is one of the best value propositions of Ansible. You only need to define the inventory in terms of the tags that identify a server. Then Ansible builds the inventory dynamically by querying the AWS API to get the list of IPs.

Another great use case for tags is running clean-up tasks like deleting old volume snapshots. You could also use tags to keep instances running only during business hours. Or you can create an expiration date tag for a resource. Tags change the way you manage the infrastructure by introducing automation into your IT delivery life-cycle.
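The tag-based inventory idea can be sketched in a few lines of Python. This is not Ansible's implementation, only an illustration of the grouping it performs: the instance records are hard-coded and hypothetical, and the `Environment` and `Role` tag keys and the `key_value` group naming are assumptions made for the example.

```python
from collections import defaultdict


def build_inventory(instances, group_tags=("Environment", "Role")):
    """Group the IPs of running instances under names such as 'Environment_prod'."""
    inventory = defaultdict(list)
    for instance in instances:
        # Stopped or terminated instances should not be configured.
        if instance.get("state") != "running":
            continue
        for key in group_tags:
            value = instance.get("tags", {}).get(key)
            if value is not None:
                inventory[f"{key}_{value}"].append(instance["private_ip"])
    return {group: sorted(ips) for group, ips in inventory.items()}


# Hypothetical instance records; a real inventory would query the cloud API.
instances = [
    {"private_ip": "10.0.1.11", "state": "running", "tags": {"Environment": "prod", "Role": "web"}},
    {"private_ip": "10.0.1.10", "state": "running", "tags": {"Environment": "prod", "Role": "db"}},
    {"private_ip": "10.0.2.20", "state": "running", "tags": {"Environment": "dev", "Role": "web"}},
    {"private_ip": "10.0.2.21", "state": "stopped", "tags": {"Environment": "dev", "Role": "web"}},
]

for group, ips in sorted(build_inventory(instances).items()):
    print(group, ips)
```

Because the groups are derived from tags each time, a server that is replaced and gets a new IP lands in the right group without anyone editing a list.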

To Set Proper Monitoring and Alerting Policies

Tagging is also going to help you pay attention to the things you should. When it comes to monitoring and alerting, one of the things you want is to reduce noise. When there’s too much noise, you start ignoring the alerts. An alert’s purpose is to trigger an action from someone or something. If it doesn’t do that, why bother creating the alarm? You could collect all the metrics you think you’ll need from your resources, but you need a way to identify events. One way to organize and reduce noise is by using tags.

A fundamental principle in continuous delivery is to work with production-like environments, and this includes feedback. You might be surprised by the number of companies I’ve seen that decide to only monitor production resources. While this will save you some money, you won’t receive feedback until much later in the delivery pipeline. Another reason behind less feedback is that IT Ops folks don’t want to deal with alerts from non-production environments.

In my experience, what works is to have different retention and alert policies for metrics data per environment. You don’t lose visibility, and by using tags, you can set policies like, “Don’t alert the on-call engineer for the non-production environment.” This policy could be accompanied by another one, like, “For the non-production environments, alert the developers while they’re in the office.” Routing alerts by tag keeps the signal and cuts the noise.
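The two policies quoted above can be expressed as a small routing function. This is a sketch under stated assumptions: the `Environment` tag key, the `prod` value, the 09:00 to 18:00 weekday office hours, and the recipient names are all hypothetical, and a real monitoring tool would express the same rule in its own configuration.

```python
from datetime import datetime, time

# Assumed office hours for the development team.
OFFICE_START = time(9, 0)
OFFICE_END = time(18, 0)


def route_alert(tags, fired_at):
    """Return who to notify for an alert on a resource with these tags, or None."""
    # A missing Environment tag is treated as production so it is never silently dropped.
    environment = tags.get("Environment", "prod")
    if environment == "prod":
        return "on-call"
    is_weekday = fired_at.weekday() < 5  # Monday is 0, Sunday is 6
    in_office_hours = OFFICE_START <= fired_at.time() < OFFICE_END
    if is_weekday and in_office_hours:
        return "developers"
    return None  # Keep the metric, but notify nobody.


print(route_alert({"Environment": "dev"}, datetime(2018, 10, 15, 10, 0)))
print(route_alert({"Environment": "prod"}, datetime(2018, 10, 13, 3, 0)))
```

Treating an untagged resource as production is a deliberately cautious default; the opposite choice would hide alerts for anything someone forgot to tag.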

To Discover Wasters

We talked a little bit about using tags to identify where you’re spending the money, but you could also spot where you’re overprovisioning resources in the cloud. When you’re moving to the cloud, it’s essential that you know what you currently have. You need to discover and inventory your resources so that you know which types of instances you’ll need and how many. You don’t want to be creating support tickets to increase limits while you run the migration. But it’s pretty common to overprovision resources initially, because one CPU in the cloud is not the same as one CPU on-prem. Or we find that the cloud provider only offers pre-defined configurations for CPU and memory, with Google as the exception. By using tags, you’re able to know which workloads need enhanced networking or memory-optimized instances. When you choose instances optimized for your needs, you might end up needing fewer servers.

To Apply Restriction Controls

After you’ve seen the benefits of using tags, you might want to make sure you don’t have untagged resources anymore. Well, you can enforce tags and their values every time a new resource is created. It’s always better to receive an error than to have to find out later who created the resource and why. Enforcing tags is not a well-supported feature in all cloud providers, meaning that you might not receive descriptive errors. For that reason, consider having a checklist document, templates, or pre-built catalogs.

Another use case around restriction is that for some resources you can grant permissions based on tags. For example, you could use a tag to identify which project the resource belongs to. Then, you create a permission policy that restricts access based on that tag. A user with this type of policy attached can’t access resources other than what the project requires. A developer for the consumer site won’t have access to backend services. Using tags for restrictions is an easy way to create generic permission policies.
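Both ideas in this section, rejecting badly tagged resources up front and granting access by project tag, can be sketched in plain Python. This is an illustration only, not any cloud provider's policy engine: the required tag keys, the allowed environment values, and the project names are hypothetical.

```python
# Required tags for this sketch: None means any non-empty value is accepted.
REQUIRED_TAGS = {
    "Project": None,
    "Environment": {"dev", "staging", "prod"},
    "Owner": None,
}


def tag_problems(tags, required=REQUIRED_TAGS):
    """Return descriptive problems with the tags; an empty list means they pass."""
    problems = []
    for key, allowed in required.items():
        value = tags.get(key)
        if not value:
            problems.append(f"missing required tag '{key}'")
        elif allowed is not None and value not in allowed:
            problems.append(f"tag '{key}' has value '{value}', expected one of {sorted(allowed)}")
    return problems


def can_access(user_projects, resource_tags):
    """Allow access only when the resource's Project tag is one of the user's projects."""
    return resource_tags.get("Project") in user_projects


print(tag_problems({"Project": "consumer-site", "Environment": "qa"}))
print(can_access({"consumer-site"}, {"Project": "backend"}))
```

Running a check like `tag_problems` in a template or provisioning script gives the descriptive error that the provider itself may not, and `can_access` shows why one generic rule can serve every project: only the tag values differ.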

Tags Are Your Infrastructure Metadata

Tags are a straightforward concept: a key/value pair that you use to identify a resource. But as I described in today’s post, tags can become your best ally when managing your cloud resources. Tags aren’t coupled to the applications you’re running on a server; you can change their values on any existing infrastructure at any time. You can also change your mind about your tagging strategy without having to reprovision or restart services. In the past, a good naming convention was important. You knew a lot about a resource just by reading its name. It’s still essential to have a good naming convention, but sometimes just a name isn’t enough; you need more context, and you need to reduce manual labor. Tags are a perfect complement when you want to invest in automation. Don’t underestimate tags because of the simplicity of the concept; they’re a powerful tool.

This post was written by Christian Meléndez. A regular poster for Enov8, Christian is a technologist who started as a software developer and has more recently become a cloud architect focused on implementing continuous delivery pipelines with applications in several flavors, including .NET, Node.js, and Java, often using Docker containers.
