DevOps and TEM Go Hand in Glove
by Mark Henke
DevOps is overall a healthy practice for most development teams, but it doesn’t come for free. Enterprises are eager to adopt the practice but their tools often lag behind DevOps practices. This is a bit like walking out into the winter cold with bare hands and being overwhelmed by Mother Nature’s bitter force.
Like hands and gloves, DevOps and test environment management (TEM) are a practical pair. In this post, we’ll summarize what DevOps and TEM are, and with what specific practices they complement each other.
DevOps: The Hands
DevOps is about marrying operations and development to achieve full software ownership. Dev teams are taking responsibility for code after it deploys into all of their environments. This gives us great power and autonomy.
However, in an enterprise, we often have many dependencies—other software systems we rely on. And managing all of these environments gets increasingly difficult. In the current landscape, managing these dependencies with DevOps is a bit like walking into cold, wintery weather with no gloves on. Without proper tools, our poor, operations-soaked hands will get frostbite. This frostbite comes in the form of time we must spend on production support and operational activities. And this time steals away our capacity for new development and numbs us to the joy of writing code.
TEM: The Gloves
Test environment management (TEM) is the set of gloves we can put on to protect us from the bitter costs of supporting our software. This post is a good primer on why TEM is so important for DevOps. Here, we are going to cover a few specific cases where it’s valuable.
The Chilling Wind of Black-Box Testing
Black-box testing is a way to test systems without deep knowledge of how they work. It’s described as a black box because your dependencies are opaque to you. You know what goes in and what should come out, but you don’t care about what happens in between. Types of black-box tests include contract testing, end-to-end system testing, and smoke testing.
Unlike unit tests, these verifications can’t run in isolation. It can be hard to set up certain data the way we need it or ensure a clean slate every time we run the test suite. Data from one team can collide with another team’s data.
The environments we’re testing can also be unstable or unreliable, changing on a cadence that constantly disrupts the teams that use it. These sorts of tests are like a chilling wind that blows through your environments, causing flaky, inconsistent failures and false positives. Without TEM, the cost of these tests can outweigh their benefits.
When we treat test environments as first-class citizens of our software suite, we can mitigate these problems. When we can manage test environments explicitly, we can cohesively change them in sync so that teams know what to expect. We can script these environments so that our consumers can customize them or manage them directly. And we can partition our environments properly so that teams won’t collide with each other.
The Snow Drift of Performance Testing
Another area that can pile up on us without TEM is performance testing. It’s prudent to ensure that we can meet our service-level objectives, but we don’t control the full stack.
It’s traditionally difficult to control or even understand the total performance you can provide. Your system is just one of many, each trying to meet customers’ latency needs. Not only is this tricky, but performance testing puts a lot of stress on systems. You can unintentionally stop another team from running their own suite. Or they may be deploying, causing your metrics to act abnormally.
Test environment management ensures performance tests are cohesive. You can manage the timing and cadence of these test runs together and across systems. Or, you can self-service your dependencies and spin up the environments yourself for testing. We’ll dive into this later. TEM is a snow shovel you can use to clear your way to smooth metrics on your performance.
The Icy Road of Tracing and Logging
When practicing DevOps, the ability to quickly track down and fix weird, subtle problems is paramount to strong product support. This is a hard task with a flood of immature products out there that don’t quite enable us to do this.
There are two main tools we have to combat this: logging and tracing. Tracing helps us pinpoint problem spots and logging helps us diagnose why those spots cause problems. But in a complex, distributed system there are often hurdles to effectively using these tools.
Loggers may be inconsistent across applications. Or they may be stored in disparate places. Imagine getting a critical problem at 2 AM and having to click through and download half a dozen log files from four different places! Or imagine trying to walk through a request across seven different systems of which you only have a few admin dashboards to guide your way.
These types of problems result in multi-hour phone calls with a dozen different people, all trying to stitch together an elaborate puzzle. The unknowable types of incidents that plague us on production support are like an icy road. We travel down it trying to reach our home, but keep sliding out of control and often end up in a ditch.
The first step is for us to have consistent, aggregated logging across all applications. The second is to implement some distributed monitoring tooling.
But how do we ensure these tools are doing a good job? Enter, once more, TEM. TEM will be the salt on our road, letting us see across systems how we can get these tools into place. We can also use our environment management to consistently bolt these things on. Then each application team need not worry about how to bring these tools to life.
The Inner Lining of Self-Service
In all these different areas where TEM warms our hands, a common theme emerges. There’s an inner lining in our gloves. Treating test environments as first-class concepts lets us move away from team hand-offs into the warmth of self-service.
Traditionally, teams were very restrictive of their code. Over time they’ve become less restrictive, but many teams still hold onto their tools, their data, and their deployments. To get anything done, you would often need to schedule things in advance with your downstream teams.
On the other side, you would need to hand off completed work for someone else to actually bring it to customers. Each hand-off increases the lead time for a feature and becomes a crack for potential miscommunication. These miscommunications develop into defects.
With ideas like TEM, we realize that our environments are often as important as our code! We thus learn to package and automate it as much as we automate our application. Doing this opens up the door for us to easily share this with others. We can publish our test environments. With strong tooling, we can maintain these explicitly. Then developers are able to self-service the test environments they need for performance testing, black-box testing, and the like.
Now developers who depend on our team don’t need to wait for us. They don’t need to hand something off to ensure it works end to end. By managing and publishing test environments, we can make features faster than ever with the highest quality.
Braving the DevOps Arctic With TEM
While DevOps gives us the autonomy to own our software, it can be a cold landscape. Many of the old tools are no longer sufficient. We need to warm ourselves with gloves that let us treat our test environments as first-class citizens. We’ll find that handling complex testing and monitoring scenarios with our dependencies will warm our entry into the cold.
Sometimes, though, even gloves don’t provide enough warmth. Sometimes we need a space heater for our TEM. That’s where a tool like Enov8 can come in.
This post was written by Mark Henke. Mark has spent over 10 years architecting systems that talk to other systems, doing DevOps before it was cool, and matching software to its business function. Every developer is a leader of something on their team, and he wants to help them see that.
We often get asked by people “What is TEM (Test Environment Management), well for those of you looking for a quick overview of Test Environment Management, here is Use Case we developed as a way…
19 MARCH, 2020 by Michiel Mulders SRE vs DevOps: Friends or Foes? Nowadays, there’s a lack of clarity about the difference between site reliability engineering (SRE) and development and operations (DevOps). There’s definitely an overlap between the roles, even though...
06 MARCH, 2020 by Arnab Roy Chowdhury Top 10 SRE Practices Do you know what the key to a successful website is? Well, you’re probably going to say that it’s quality coding. However, today, there’s one more aspect that we should consider. That’s reliability. There are...
20 FEBRUARY, 2020 by Arnab Row Chowdhury Technically, the world today has advanced to a level we never could’ve imagined a few years ago. What do you think made it possible? We now understand complexities. And how do you think that became possible? Literacy! Since...
14 FEBRUARY, 2020 by Michiel Mulders A site reliability engineer loves optimizing inefficient processes but also needs coding skills. He or she must have a deep understanding of the software to optimize processes. Therefore, we can say an SRE contributes directly to...
07 February, 2020 by Arnab Roy Chowdhury Do you remember what Uncle Ben said to young Peter Parker? “With great power comes great responsibility.” The same applies to companies. At present, businesses hold a huge amount of data—not only the data of a company but also...