DevOps vs. DevOps at Scale
by Carlos “Kami” Maldonado
“DevOps at scale” is what we call the process of implementing DevOps culture at big, structured companies. Although the DevOps term was coined almost 10 years ago, even in 2018 most organizations still haven’t completely implemented the practice.
DevOps used to be most popular at start-ups. After all, it’s easy to introduce processes that break the rules when you have no rules to begin with. But this left large corporations and bureaucratic organizations watching DevOps from the sidelines.
Now, things have finally changed. If we check the numbers from the 2018 edition of DORA’s Accelerate: State of DevOps Report, it’s evident that big players in the industry are actively implementing DevOps practices.
Could you do this at your organization? When applying DevOps at scale, you’ll face a collection of hurdles. Let’s look at how to overcome some of them.
Make the Bottom Line Your Top Priority
Large organizations may be leery of implementing DevOps culture out of fear for their bottom line. A start-up’s value often depends on creating a new niche, maximizing its user base, and then selling to a bigger entity. Older organizations are usually more focused on long-term goals, so they won’t take risks that could jeopardize revenue. As a result, the pace of change within their information technology (IT) department is usually slower than at a nimble start-up, which can adapt quickly.
Even determining how an IT platform change has affected the bottom line can be tricky. Non-IT departments such as sales and marketing might use different metrics and reporting frequencies, so you might need to agree on a shared reporting cadence with other business units. For example, if IT conducts reporting and planning every two weeks, and the sales department does this every month, it’ll be hard to pinpoint which IT changes influenced which business metrics.
You also need to know how each key performance indicator (KPI) affects indicators in other business units. This will let you make correlations between IT performance and your organization’s core activities. Follow the money and get relevant data, so your DevOps implementation helps your organization become more successful and profitable.
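As a minimal sketch of that idea, the Python below rolls an IT metric reported every two weeks up into calendar months so it can be compared with a monthly business metric, then computes a Pearson correlation between the two. The function names and the shape of the records are assumptions made for illustration, and the result only shows whether the two series move together, not that one causes the other.

```python
from collections import defaultdict
from math import sqrt


def monthly_totals(records):
    """Sum (date, value) records into a {(year, month): total} mapping."""
    totals = defaultdict(float)
    for day, value in records:
        totals[(day.year, day.month)] += value
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlate_by_month(it_records, business_by_month):
    """Correlate an IT metric with a business metric over the months both cover.

    it_records: iterable of (date, value) pairs at any cadence (hypothetical shape).
    business_by_month: {(year, month): value} mapping (hypothetical shape).
    """
    it_by_month = monthly_totals(it_records)
    shared = sorted(set(it_by_month) & set(business_by_month))
    return pearson(
        [it_by_month[month] for month in shared],
        [business_by_month[month] for month in shared],
    )
```

Here `it_records` could hold deployments per sprint and `business_by_month` could hold monthly sales figures. Only months present in both series are compared.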
Stay Consistent Across Environments
A built-in advantage of start-ups is their reduced head count. There’s usually only a single team handling every major infrastructure, technology standard, and naming convention. To communicate new rules or policies, you can simply send one group e-mail and then get back to working on code.
But when large numbers of people work in isolated groups, some anti-patterns tend to develop. For example, teams may use different words for common elements in a corporation, KPIs will start to change across teams because of differing goals, and server hardware may vary from one environment to another.
Inconsistencies introduce uncertainty in each phase of your code’s life cycle. For instance, “lead time” might mean different things for developers, operators, and marketers. Standardizing terms and technologies will reduce mistakes and the need to rework. It’ll also make things more transparent when capturing, analyzing, and learning from data. And when data and procedures are consistent, they’re useful for everyone.
Communicate Among Teams
Again, because start-ups are small, their teams can usually all fit into the same small office. Some employees might even work in coffee shops or from home and get together once a week for brainstorming sessions. Because of this intimate and casual work environment, staff can usually discuss, and agree on, an idea within minutes.
At large corporations, it’s harder to spread a message effectively. An idea needs to be aligned with shareholder interest and approved by a manager in the C-suite. In addition, clear, careful communication is essential to avoid confusion and opposition.
To enhance organizational culture, it’s essential to have transparency, trust, clear reporting, and blame-free post-mortems. Supporting these values from top to bottom helps everyone at the company feel confident and safe that their contributions will be respected and that peers won’t perceive them as threats.
Leaders need to help their teams communicate more substance and less noise. Managers must push a message clear enough to keep teams aligned with core business goals, instead of getting distracted by every new technology available.
Implementing DevOps at scale with geographically distributed teams introduces an additional layer of complexity. Time zones, varying degrees of language proficiency, and cultural differences all create barriers to collaboration. In addition, trust and influence are harder to build when people aren’t meeting face to face. That’s why it’s so important to invest the resources in getting members of distributed teams together a few days a year. These meetings will help them build personal bonds that will improve their trust in each other and increase their efficiency as teams.
Reduce the Development Cycle
Using shorter sprints in your software development cycle helps developers get feedback more quickly from users. Most start-ups are born with this practice in their DNA. However, developers at bigger companies will shudder at this prospect and assume they’ll have to go through the dreaded code integration phase more often.
Reducing your development cycle times also means code conflicts and regression errors might happen more often. That’s why modifying your sprint lengths isn’t enough when implementing DevOps at scale. Application code should be decoupled from its dependencies throughout its life cycle, so when there’s a breaking change in some related code, it doesn’t block other people from moving forward.
Implementing smaller changes in code and using feature branches and topic branches are popular solutions you might want to consider. Otherwise, you might end up with many small waterfall phases instead of a continuous deployment pipeline.
Automatically Enforce Requirements
There’s usually a “Doctor No” in every big organization—often one or more engineers who serve as production gatekeepers and frequently veto suggestions. ITIL literature calls them the change advisory board (CAB). They make sure deployments are compliant with requirements. CAB processes usually involve meetings and lots of time from developers.
As the software release process evolves, Doctor No’s role will experience deep changes. You can facilitate this by helping CAB members convert their requirements into reproducible automatic tests. Then, you can use these tests as a resource in your continuous deployment platform. Put all this on trial with small projects that don’t affect revenue.
Automating releases will make deployments safer and remove bottlenecks. As a bonus, you’ll see less friction between people when a system enforces code compliance. If you still require a manual step, it could be reduced to people checking a box.
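One way to picture converting CAB requirements into reproducible tests is sketched below in Python. The three requirements, the `Release` fields, and the function names are hypothetical examples, not rules from ITIL or from any particular deployment tool. Your own CAB’s requirements would replace them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Hypothetical release metadata a pipeline might collect."""
    tests_passed: bool
    has_rollback_plan: bool
    change_ticket: str


# Each requirement is a name plus a check that returns True when satisfied.
# These three rules are illustrative assumptions only.
REQUIREMENTS = {
    "automated tests passed": lambda release: release.tests_passed,
    "rollback plan documented": lambda release: release.has_rollback_plan,
    "change ticket referenced": lambda release: bool(release.change_ticket.strip()),
}


def failed_requirements(release):
    """Return the names of the requirements this release does not meet."""
    return [name for name, check in REQUIREMENTS.items() if not check(release)]


def is_compliant(release):
    """A release is compliant when no requirement fails."""
    return not failed_requirements(release)
```

A pipeline step could call `failed_requirements` and stop the deployment when the list is not empty, which gives developers a specific reason for the rejection instead of a meeting.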
Have Developers On Call
It’s common for start-ups to involve developers with live issues in production. However, putting developers on call in big organizations is no easy feat. People who weren’t hired with the expectation of off-hours availability might not be pleased. And finding developers committed to an on-call role also means paying them higher salaries.
If you currently have only operators on call, you’ll need to modify your escalation and hand-off procedures. Offer your staff some form of compensation for unexpected overtime. Some people prefer money, while others prefer paid time off. In addition, some organizations provide laptops, phones, and other resources to improve on-call incident response time.
Make your on-call staff’s efforts visible by measuring their outcomes with KPIs like “Time to First Response” and “Time to Restore Service.” Keep iterating with new data from each incident to improve the process, and your developers will produce code that fails less often. And when it does fail, it will produce clearer error messages.
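The two KPIs named above can be computed from incident timestamps. The following Python is a minimal sketch that assumes each incident records when it was opened, when someone first responded, and when service was restored. The `Incident` fields and function names are hypothetical, and it reports simple averages, although a team might prefer medians or percentiles.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record with the three timestamps we need."""
    opened: datetime
    first_response: datetime
    restored: datetime


def mean_duration(durations):
    """Average a collection of timedeltas."""
    durations = list(durations)
    if not durations:
        raise ValueError("no incidents to measure")
    return sum(durations, timedelta()) / len(durations)


def time_to_first_response(incidents):
    """Mean time between an incident opening and the first response."""
    return mean_duration(i.first_response - i.opened for i in incidents)


def time_to_restore_service(incidents):
    """Mean time between an incident opening and service being restored."""
    return mean_duration(i.restored - i.opened for i in incidents)
```

Tracking both numbers per sprint or per month makes it easier to see whether changes to escalation and hand-off procedures are helping.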
Implement DevOps With Confidence
Every organization will face different challenges when implementing DevOps at scale. As you work through yours, I recommend keeping these important items in mind:
- Stay focused on providing value to your business.
- Capture relevant metrics to validate your hypothesis.
- Reward consistency, transparency, and software quality.
- Focus on outcomes, perform small changes, and measure again.
- Build in small wins by automating deployments with low risk, and learn from them.
- Remember that driving people through change takes time. Listen to them, and keep iterating.
Carlos “Kami” Maldonado
Enov8 blogger Carlos is an engineer helping his company transition to DevOps. He specializes in Linux automation, and he’s experienced in all layers of infrastructure, from the application layer down to the cable. He’s part of a team migrating a monolithic app from static VMs to on-premises Kubernetes deployments.