Enterprise Release Management Best Practices
by Kathrin Paschen
Managing releases at scale is daunting. It involves juggling dependencies, timelines, and requirements. The stakes can be pretty high, too. Not all failures are as spectacular as crashing a lunar lander or losing $440 million. Even so, investing in enterprise release management makes a lot of sense.
Enterprise release management orchestrates the release processes of several projects and enables you to launch or update large software systems. When it’s done well, it looks easy (though you can be sure people are working hard to make it so). When it goes bad, it feels like a nightmare.
This post provides practical tips for successfully managing releases at enterprise scale.
What Does Success Look Like?
We’ve all seen releases that went poorly, and we can recognize some of the red flags that a release management system is failing. But what are your criteria for a good release?
Usually, the answer to this is “a good release is one where we launch on time with no big problems.” But there’s often a trade-off between speed and quality. You may also want to include “the new feature makes our customers happy” and “nobody had to work a 100-hour week” in your criteria.
There’s a difference between a release and a feature launch. A feature launch is when a new feature becomes available to customers, but a release just means you’ve set some new code live. You can (and probably should) release regularly and often, no matter the size of your company. Small, frequent releases are great for quick feedback and innovation, and they make large feature launches easier.
Different companies choose different trade-offs:
- Startups often want to launch quickly to maintain momentum. Therefore, they may be willing to accept the risk of releasing code with a few known bugs.
- Big companies, on the other hand, often require new functionality to integrate well into their existing offerings. They may have legacy systems. Also, they tend to be reluctant to expose customers to bugs or downtime.
The main difference between startups and larger companies is in the amount of coordination—releases at larger companies typically require more of it.
Below, I describe best practices that help in different kinds of environments, from startup to big company.
Best Practice 1: Plan and Communicate
When coordinating project release schedules toward a feature launch, you often need to plan backward from a target date and align the work so everything is ready on time. Even when you’re not working toward a feature launch, you’ll have project dependencies.
This means you need an overview of…
- the projects that contribute to the release
- the features that they need to deliver
- how those depend on each other
For example, let’s say one project provides a feature that others need. In that case, the first project needs to deliver that feature early enough for integration and testing. Similarly, if two projects access the same database tables, then schema changes have to be coordinated between them.
A diagram of these dependencies is essential; software that helps you track them is even better.
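As a minimal sketch of what such tracking software does at its core, the snippet below records which project depends on which and derives an order in which the projects have to deliver. The project names are hypothetical, and the standard-library `graphlib` module stands in for a real tracking tool.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical projects; each maps to the set of projects it depends on.
dependencies = {
    "checkout-ui": {"payments-api", "catalog-api"},
    "payments-api": {"shared-db-schema"},
    "catalog-api": {"shared-db-schema"},
    "shared-db-schema": set(),
}


def delivery_order(deps):
    """Return the projects ordered so each one comes after everything it depends on."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        # args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"circular dependency: {err.args[1]}") from err


order = delivery_order(dependencies)
print(order)
```

A circular dependency raises an error instead of producing an order, which is usually a sign that two teams need to talk.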
The next step is to define a timeline (a release calendar is useful here) that says which project needs to deliver which feature by which time. Include time for testing and for fixing bugs, and leave some slack because you always need slack.
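Planning backward from a target date can be sketched in a few lines. The phase names, durations, and launch date below are made-up assumptions, and the calculation uses calendar days rather than working days, so treat it as an illustration of the arithmetic and not as a scheduling tool.

```python
from datetime import date, timedelta


def plan_backward(launch, phases, slack_days):
    """Walk backward from the launch date.

    `phases` lists (name, duration_in_days) in the order the work happens.
    Returns (name, start, end) tuples in chronological order.
    """
    end = launch - timedelta(days=slack_days)  # reserve the slack first
    schedule = []
    for name, days in reversed(phases):
        start = end - timedelta(days=days)
        schedule.append((name, start, end))
        end = start
    return list(reversed(schedule))


# Hypothetical phases and durations.
phases = [("feature delivery", 20), ("integration testing", 10), ("bug fixing", 5)]
schedule = plan_backward(date(2021, 9, 1), phases, slack_days=5)
for name, start, end in schedule:
    print(f"{name}: {start} to {end}")
```

If the first start date that comes out is already in the past, the timeline doesn't fit and something has to give: scope, date, or slack.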
Almost as important as having a timeline is making sure every project team knows about the release calendar and has agreed to it. If a project team pushes back, be sure to listen and address their concerns.
Best Practice 2: Manage Dependencies
The more dependencies there are between the projects in your release, the higher the risk of delays and late-breaking bugs. Some dependencies are unavoidable; you track them and reserve time for integration testing.
Whenever possible, encourage projects to reduce their dependencies. Ideally, project teams create self-contained systems that communicate with each other through clear application programming interfaces. This decouples project teams and allows them to develop and test against mocks or fakes. It also reduces long lead times and blocked user stories—two of the red flags mentioned above.
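To make the idea of developing against fakes concrete, here is a small sketch. The `InventoryService` interface, the `FakeInventory` class, and the `can_order` function are all hypothetical names invented for this example; the point is only that the ordering logic depends on the interface, not on the other team's real system.

```python
from typing import Protocol


class InventoryService(Protocol):
    """Interface that a (hypothetical) inventory team publishes."""

    def units_in_stock(self, sku: str) -> int: ...


class FakeInventory:
    """In-memory stand-in that other teams can use in their tests."""

    def __init__(self, stock):
        self._stock = dict(stock)

    def units_in_stock(self, sku):
        return self._stock.get(sku, 0)


def can_order(inventory: InventoryService, sku: str, quantity: int) -> bool:
    """Order-team logic that relies only on the published interface."""
    return 0 < quantity <= inventory.units_in_stock(sku)


fake = FakeInventory({"widget": 3})
print(can_order(fake, "widget", 2))
print(can_order(fake, "widget", 4))
```

The order team can write and test `can_order` long before the real inventory service is ready, as long as both teams agree on the interface.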
I find it useful when each project team creates tested and versioned packages of their software that other projects can use. Why bother doing this?
- It helps the other projects test against a versioned package. Most package repositories also allow you to label packages. For example, you might use the labels “Ready for testing” or “Use this only if you want to reproduce bug #123.”
- It helps testers reproduce bugs. If bug reports contain package versions, then the testers can be reasonably sure that they’re testing the right code.
- It helps you track progress. Most task or issue tracking systems allow you to record which software versions include a feature. You can then track whether a package with a feature exists, whether someone has tested it, how many bugs they found, and so on.
Now you can do a fair amount of integration testing early, avoiding the risks around “big bang” integration. That type of integration happens when people on individual projects write their software in isolation and then put everything together late in the process.
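The versioned, labeled packages described above can be modeled roughly as follows. This is a sketch with invented project, label, and feature names; a real package repository or issue tracker would hold this information for you.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    project: str
    version: tuple  # (major, minor, patch), compared element by element
    labels: set = field(default_factory=set)
    features: set = field(default_factory=set)


# Hypothetical package history for one project.
PACKAGES = [
    Package("payments-api", (1, 4, 0), {"ready-for-testing"}, {"refunds"}),
    Package("payments-api", (1, 5, 0), {"ready-for-testing"}, {"refunds", "invoices"}),
    Package("payments-api", (1, 6, 0), {"repro-bug-123"}, {"refunds", "invoices"}),
]


def latest_with_label(packages, project, label):
    """Newest package of a project that carries the given label, or None."""
    candidates = [p for p in packages if p.project == project and label in p.labels]
    return max(candidates, key=lambda p: p.version, default=None)


def first_version_with(packages, project, feature):
    """Oldest package of a project that includes the given feature, or None."""
    candidates = [p for p in packages if p.project == project and feature in p.features]
    return min(candidates, key=lambda p: p.version, default=None)


print(latest_with_label(PACKAGES, "payments-api", "ready-for-testing").version)
print(first_version_with(PACKAGES, "payments-api", "invoices").version)
```

With this kind of record, a dependent project can pick the newest package marked as ready for testing, and a release manager can see from which version onward a feature is available.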
Best Practice 3: Remember That Tools and Automation Are Your Friends
Wherever possible, automate steps such as building, testing, and deployment. This saves you time and reduces the opportunity for errors. Excellent tools exist for each of these steps.
What if a step is too hard to automate? Document it. People who aren’t on the project team have to be able to build, test, and package a new version. This is essential if you ever need to deploy a security update quickly, for example.
If at all possible, deploy regularly. A few projects can’t do this—for example, those whose software gets flashed onto ROMs. However, even if you can’t deploy to your end users regularly, deploy to a test environment. Benefits include:
- You exercise your deployment tools regularly. They become familiar, and you can fix issues with them.
- You find issues that don’t show up on developer machines. Maybe the code makes assumptions about directory structure? About installed libraries? Maybe the code assumes the database runs on the same host?
- You need to run performance tests at some point, and it makes sense to run them on a test environment similar to your production environment. Can the system achieve the necessary throughput? How does it behave under high load? Does it degrade gracefully, or does it go into cascading failure?
For big launches, I recommend hiding the new code behind a feature flag. This cuts down on branch maintenance. Feature flags also provide you with a built-in OFF switch in case you need to back out of the launch.
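A feature flag can be as simple as a conditional around the new code path. The sketch below assumes the flag lives in an environment variable named `FEATURE_NEW_CHECKOUT`; that name and the two checkout functions are hypothetical, and many teams use a dedicated flag service instead.

```python
import os


def legacy_checkout(cart):
    return {"flow": "legacy", "items": len(cart)}


def new_checkout(cart):
    return {"flow": "new", "items": len(cart)}


def flag_enabled(name, env=os.environ):
    """A flag is on when the variable FEATURE_<NAME> is set to "1"."""
    return env.get(f"FEATURE_{name.upper()}", "0") == "1"


def checkout(cart, env=os.environ):
    # The new code ships with every release but stays dark until the flag is on.
    if flag_enabled("new_checkout", env):
        return new_checkout(cart)
    return legacy_checkout(cart)


print(checkout(["book", "pen"], env={}))
print(checkout(["book", "pen"], env={"FEATURE_NEW_CHECKOUT": "1"}))
```

Turning the flag off again sends all traffic back to the old path without a new deployment, which is the OFF switch mentioned above.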
In addition to automating your releases, I recommend automating policy and compliance checks, even security tests. This is no replacement for audits, but it helps.
Best Practice 4: Track Project Status
If you’ve ever been involved in an enterprise release, you will have attended status meetings. They tend to be dreadful. If you have a way to track at least some aspects of the various projects’ progress without making people sit in a meeting, you can free up a lot of time and avoid people getting grumpy. Instead, you can focus on important points:
- Which projects aren’t on track?
- What’s holding them back?
- Who can help and how?
Identifying risks early provides more time for mitigation. Moreover, you can now replace the large, time-consuming status meeting with smaller meetings focused on solving problems.
You need to have a way for projects to publish their status. How much of the work is complete? How are they doing on testing? There should be a way for projects to update their status, either manually or automatically. Also, there should be a way to see project status on a dashboard. Best practices 1, 2, and 3 provide you with tools that make this easier.
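One possible shape for a published status record is sketched below. The fields, project names, and the "expected percent" threshold are assumptions made for illustration; in practice the numbers would come from your issue tracker and test system.

```python
from dataclasses import dataclass


@dataclass
class ProjectStatus:
    name: str
    tasks_done: int
    tasks_total: int
    tests_passing: int
    tests_total: int
    blocker: str = ""  # empty string means nothing is blocking the project

    @property
    def percent_complete(self):
        if not self.tasks_total:
            return 0
        return round(100 * self.tasks_done / self.tasks_total)


def needs_attention(statuses, expected_percent):
    """Projects that are blocked or behind the expected completion level."""
    return [s for s in statuses if s.blocker or s.percent_complete < expected_percent]


# Hypothetical status reports from three projects.
statuses = [
    ProjectStatus("catalog-api", 9, 10, 120, 120),
    ProjectStatus("payments-api", 4, 10, 80, 95),
    ProjectStatus("checkout-ui", 8, 10, 60, 60, blocker="waiting on schema change"),
]

for status in needs_attention(statuses, expected_percent=60):
    print(status.name, status.percent_complete, status.blocker or "behind schedule")
```

A dashboard built on records like these answers "which projects aren't on track?" before anyone walks into a meeting.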
Best Practice 5: Observe and Improve
This topic overlaps, not surprisingly, with site reliability engineering best practices. You need to notice when something isn’t working well and then do something about it.
Deployments and infrastructure changes are a leading cause of IT system failures. It makes sense to observe your systems, both in your test environment and in production. Use monitoring tools.
Usually, when an issue shows up shortly after a release, people will blame the release. They’ll often be correct, but it’s good to make sure because the mitigation strategy depends on it. If you have good monitoring data and a record of which software changes went into which version, you stand a much better chance of pinpointing the cause of a regression.
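If you keep a record of which changes went into which version, narrowing down a regression becomes a lookup. The sketch below uses an invented release history; given the last version known to be good and the first version known to be bad, it lists the changes that could be responsible.

```python
# Hypothetical release history, oldest first: (version, changes included).
RELEASES = [
    ("2.3.0", ["add CSV export"]),
    ("2.3.1", ["upgrade HTTP client library"]),
    ("2.4.0", ["new search index", "change cache TTL"]),
    ("2.4.1", ["fix typo on login page"]),
]


def suspect_changes(releases, last_good, first_bad):
    """Changes released after `last_good`, up to and including `first_bad`."""
    versions = [version for version, _ in releases]
    good = versions.index(last_good)
    bad = versions.index(first_bad)
    if bad <= good:
        raise ValueError("first_bad must be released after last_good")
    return [change for _, changes in releases[good + 1 : bad + 1] for change in changes]


suspects = suspect_changes(RELEASES, last_good="2.3.0", first_bad="2.4.0")
print(suspects)
# prints ['upgrade HTTP client library', 'new search index', 'change cache TTL']
```

The monitoring data tells you which versions were good and bad; the change record tells you where to look.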
What if you decide the release is to blame? You have three options:
- Roll back the release.
- Fix the issue in production.
- Leave things as they are.
Depending on your systems, not all of these options may be workable. With systems that do frequent small releases, rolling back is often possible. Ideally, your projects are sufficiently independent of each other that you can roll back just some of them.
Sometimes, rollback isn’t an option. Maybe you’ve got database schema changes that can’t be undone. Maybe the release is a highly visible feature launch. The risk evaluation that goes into deciding between “fix in production” and “leave it as it is” is beyond the scope of this post. At that point, an enterprise release manager can only advise on how to test and roll out a fix quickly but responsibly.
Monitoring can also tell you whether and how customers are using your features. This is valuable feedback and addresses another release management red flag, as mentioned earlier in the post.
Summing Up: Experts Can Help You Implement Best Practices
Enterprise release management spans projects and disciplines. It requires skills that range from project management to engineering. Also, it includes a lot of cross-team communication.
All of this may seem daunting! But with the right set of tools and best practices, it’s entirely feasible. Enov8 offers an enterprise release management tool to help you implement best practices. Moreover, its specialists are experienced at guiding enterprises through the adoption of release management best practices. Here’s how you can learn more.
This post was written by Kathrin Paschen. Kathrin is a freelance SRE interested in capacity planning, cost estimation, and monitoring. After a long time at Google, she now has her own small company focused on helping clients use the cloud. She likes figuring out scaling bottlenecks and resource models for cloud architectures.