BigPanda blog

RESOLVE ’22: Incident management automation

“Make life easier” isn’t a mantra for the lazy—it’s a way to drill down on important automation in the IT Ops room.

When Ryan Taylor, VP of solutions engineering at Transposit, talks about his experience and outlook in the IT Ops chair, people tend to listen. He’s run the show for companies like Intuit and Hulu, and our CTO and Co-Founder Elik Eizenberg refers to him as “one of the smartest people I know.”

And when it comes to automating and reducing complexity in the average IT Ops environment, Ryan’s instructions are clear: look for ways to make your own life (and the working lives of your teams) easier.

That was the key takeaway from Ryan’s talk at our RESOLVE ’22 extravaganza: Incident management automation. He and Elik spent a substantial amount of time preparing their mentalities and their technical infrastructure for the sort of slow-drip change they need to make lives—and business—easier for everyone involved in reporting, addressing and resolving IT incidents.

Read on for more analysis of the event. A link to the full recording appears at the end for readers who want to take even more away from Ryan’s insightful comments.

A micro-approach to troublesome microservices

Ryan’s “make life easy” approach came through loud and clear in an anecdote he shared early in the conversation. He said he learned his approach to automation in part from a group of newly hired employees who “had a passion for making their own lives easier.” Their natural disdain for inefficient, repetitive or otherwise unoptimized tasks led the team to tackle something like “1,200 microservices” that had, over time, complicated the operation in ways that were impossible to predict.

It’s a great approach, Ryan said, because the notion of “we need to automate”—typically handed down by higher level leadership without full knowledge of the nitty-gritty backend processes—is often too vague to be actionable.

“Automating doesn’t require that style of leadership from the top down. It doesn’t require an executive coming in the room and saying ‘We’re gonna solve hunger; we’re gonna automate all the things,’” he said.

By contrast, a company that chooses to “automate all the things” by naturally tackling points of friction and frustration tends to realize better results over time. AIOps platforms like BigPanda’s speed this process dramatically, but that idea still rests at the heart of the improvements we help instill.

To that point, both Elik and Ryan noted that the unrealistic “Holy Grail scenario” of automating all of a workplace’s most complex tasks at once can be problematic because it keeps a company from adopting the “make life easier” outlook.

Elik said, “There are many forms of automation that drive a ton of value that aren’t as complicated.” And those should be the starting point for most any company looking to make more serious use of automation, however big their long-term plans may look.

Automation encourages reskilling, not just replacement

From there, the conversation went to a topic for which (as Elik said) “there are no easy answers”: the notion that effective automation on one side typically leads to lost jobs on the other.

This is a common concern in the broader IT world, and something managers wrestle with when considering tools that would automate away significant areas of pain or waste for the business. A small degree of “automation replacing” is, quoting Ryan, “an inevitability,” but a much higher percentage of people (and their associated roles) can upskill to something that supports the ongoing automation effort.

Tech-related fields and roles tend to attract people “with a passion for problem-solving,” said Ryan, which gives stakeholders a great starting place when deciding what skills and certifications must evolve.

“I think automation is inevitably going to change the workforce,” he said, speaking generally to a phenomenon we’ve seen hit time and again across various industries from manufacturing onward. And turning to the technology side of things, he said: “But I don’t know that it’s necessarily going to cause a great downsizing in ops-focused folks. I certainly think it’s going to be a reskilling or learning opportunity for a lot of people, though.”

Drawing on his experience with advanced automation and its impact on IT Ops workplaces, Elik added another angle to the discussion. In an evolved environment, he said, the same stakeholder concerned about attrition through automation might say to themselves, “My team used to manage incidents. Now they’re managing the bots that manage the incidents.”

“How can you turn your team into a highly scalable group that can handle any potential scale of incidents, any potential scale of manual work?” he said. “If you start early to train your team around doing that, that’s probably the best strategy to retain your team and drive the business outcomes you’re supposed to drive.”

What happens when company automation goals level up?

Ryan spent much of the discussion on approach, and one aspect he described could be compared to leveling up in a video game. “After you automate the small things,” he said, “your automation candidates become increasingly more difficult.”

The larger and more complex the task being automated, the more specialized—or at least difficult to adapt to other processes—it becomes.

This is where ideas like templatization and normalization become useful, our speakers said. By investigating the sources of issues and automating away as much of the process as they can—even if they can’t automate the entire incident or process away—companies give first responders a clearer, faster path to resolve the aspects that do require human intervention.

If, over time, the company figures out a way to automate the whole reporting and response aspect of the common problem, it’s a great development and a natural evolution of a growing automation focus. And if not, at minimum, the task will always be less cumbersome and time-consuming when automated.

Separately, training automation tools to pull information, alerts and other data from disparate sources, even if those sources don’t work together, can provide a degree of data normalization companies find very useful.

These points combine to create a business environment where companies must “build their single source of truth or be able to feed from multiple sources,” Ryan said. On the reporting side, being able to pull information from multiple sources and turn it into viable intel is a considerable benefit. And on the learning side, an IT Ops automation initiative can only draw insights from the sources to which it is linked. If the “single source of truth” has gaps and blind spots, critical insights may never be uncovered.
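To make the normalization idea concrete, here is a minimal Python sketch of what “feeding from multiple sources” might look like. It is an illustration only, not how BigPanda or Transposit implement it: the two source names (`monitor_a`, `monitor_b`), their field names and the severity mappings are all hypothetical, invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Mapping, Set


@dataclass(frozen=True)
class Alert:
    """One alert in a common shape, whatever tool produced it."""
    source: str
    host: str
    severity: str  # "critical", "warning", "info" or "unknown"
    message: str


# Hypothetical severity vocabularies for two made-up monitoring tools.
SEVERITY_MAP = {
    "monitor_a": {"P1": "critical", "P2": "warning", "P3": "info"},
    "monitor_b": {"error": "critical", "warn": "warning", "ok": "info"},
}


def normalize(source: str, raw: Mapping[str, Any]) -> Alert:
    """Translate one tool-specific payload into the common Alert shape."""
    if source == "monitor_a":
        host, level, message = raw["hostname"], raw["priority"], raw["summary"]
    elif source == "monitor_b":
        host, level, message = raw["node"], raw["status"], raw["text"]
    else:
        raise ValueError(f"unknown source: {source}")
    # Levels we do not recognize are flagged rather than silently downgraded.
    severity = SEVERITY_MAP[source].get(level, "unknown")
    return Alert(source=source, host=host.lower(), severity=severity, message=message)


def find_blind_spots(alerts: Iterable[Alert], expected_hosts: Iterable[str]) -> Set[str]:
    """Return expected hosts that no linked source has reported on."""
    seen = {alert.host for alert in alerts}
    return {host.lower() for host in expected_hosts} - seen


alerts = [
    normalize("monitor_a", {"hostname": "DB-01", "priority": "P1", "summary": "disk full"}),
    normalize("monitor_b", {"node": "web-02", "status": "warn", "text": "high latency"}),
]
missing = find_blind_spots(alerts, ["db-01", "web-02", "cache-03"])
```

In this sketch, `missing` ends up holding the one host that neither source covers, which is the kind of gap the “single source of truth” point above warns about.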

See the full talk

A link to our full RESOLVE ’22 talk, Incident management automation, can be found here.