High-frequency Systems Management: for ops teams that don’t communicate with smoke signals

July 4th, 2015 | Blog

The magnitude of change impacting IT Systems Management is comparable…in the past decade. And yet we still manage apps and infrastructure like it’s 1792. Hey bar wench, pour me some mead in a wood jar to enjoy with my salt cod while I stare at blinking lights in the NOC then manually patch a server.

It gets worse. Just as we don’t use a telegraph to place trades or an abacus to tally margins, we certainly shouldn’t rely on manually maintained libraries of rules to make sense of noisy alerts. Today, savvy traders use sentiment analysis and high-frequency algobots to analyze levels of activity far too great for humans to comprehend. A dwindling number of traders still place trades manually (and those who do end up bankrupt).

Similarly, many ops teams still manage infrastructure one alert at a time with systems like Nagios, Sentry, and New Relic, while smart ones automatically correlate alerts with impact and impact with action. Our noisy alert problem was caused by complex infrastructure, heterogeneous monitoring environments, and a growing volume of critical apps.
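To make "correlate alerts" a little more concrete, here is a minimal sketch of one possible approach: group alerts that hit the same service within a short time window into a single incident. Everything in it is an assumption for illustration: the `Alert` fields, the `correlate` function, and the 300-second window are hypothetical and do not describe how any particular product works.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Alert:
    # All field names are hypothetical, chosen for illustration only.
    source: str       # monitoring tool that raised the alert
    service: str      # service the alert refers to
    timestamp: float  # seconds since some reference point
    message: str


def correlate(alerts, window_seconds=300):
    """Group alerts into incidents.

    Alerts on the same service join the current incident when they arrive
    within window_seconds of the previous alert in that incident.
    """
    by_service = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        by_service[alert.service].append(alert)

    incidents = []
    for service_alerts in by_service.values():
        current = [service_alerts[0]]
        for alert in service_alerts[1:]:
            if alert.timestamp - current[-1].timestamp <= window_seconds:
                current.append(alert)
            else:
                incidents.append(current)
                current = [alert]
        incidents.append(current)
    return incidents


alerts = [
    Alert("nagios", "checkout", 0, "CPU high"),
    Alert("newrelic", "checkout", 60, "Latency high"),
    Alert("sentry", "checkout", 1000, "Exception spike"),
    Alert("nagios", "search", 30, "Disk full"),
]
incidents = correlate(alerts)
```

Here four raw alerts collapse into three incidents, because the two checkout alerts a minute apart are treated as one problem. A real system would likely correlate on topology and other signals too, not just service name and time.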

Alerts ooze from every orifice of the modern datacenter: APM and NPM systems, databases, IDS and IPS frameworks, firewalls and routers. Every service in every cloud, virtual machine, and container is constantly monitored. What’s more, downtime is more expensive than ever as every company becomes a software company.

The volume of machine data that must be analyzed has increased by orders of magnitude in the last 18 months alone. We’ve gone from the equivalent of carriages to Teslas, yet in IT we still prefer horses. Insanely, we “solve” machine-generated problems by hiring more people to stare at more blinking lights.

High-frequency systems management requires a different approach – one aided by algorithms and improved using assisted machine learning where software recommends remediation procedures and human operators approve or reject recommendations.
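The recommend-then-approve loop described above can be sketched in a few lines. This is a deliberately simple stand-in, assuming a smoothed approval-rate counter rather than a real machine learning model; the `RemediationAdvisor` class, the alert type, and the procedure names are all hypothetical.

```python
from collections import defaultdict


class RemediationAdvisor:
    """Recommends a procedure per alert type and learns from operator feedback."""

    def __init__(self, candidates):
        # candidates: alert type -> list of procedure names (hypothetical data)
        self.candidates = candidates
        # (alert_type, procedure) -> [approvals, rejections]
        self.feedback = defaultdict(lambda: [0, 0])

    def score(self, alert_type, procedure):
        approvals, rejections = self.feedback[(alert_type, procedure)]
        # Laplace smoothing: an untried procedure starts at 0.5
        return (approvals + 1) / (approvals + rejections + 2)

    def recommend(self, alert_type):
        options = self.candidates.get(alert_type, [])
        if not options:
            return None
        # Ties go to the first listed procedure
        return max(options, key=lambda p: self.score(alert_type, p))

    def record(self, alert_type, procedure, approved):
        # The human operator's approve/reject decision is the training signal
        self.feedback[(alert_type, procedure)][0 if approved else 1] += 1


advisor = RemediationAdvisor({"disk_full": ["rotate_logs", "expand_volume"]})
first = advisor.recommend("disk_full")
advisor.record("disk_full", first, approved=False)  # operator rejects it
second = advisor.recommend("disk_full")
```

After the operator rejects the first suggestion, the advisor switches to the other procedure for that alert type. The point of the design is that humans stay in the approval seat while the software absorbs their decisions.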

Systems can be managed. Noisy alerts can harmonize. Companies can rely on apps. But only when high-frequency systems are managed by high-frequency management platforms. Don’t believe me? Try attaching that next 100-share limit order to the leg of a carrier pigeon… then call me by smoke signal and I’ll sell you my chamber pot and scabbard.

Thanks to Mike Maples (@m2jr) from the Floodgate Fund for suggesting the connection between modern Systems Management and high-frequency trading. You’ve inspired us to beat the markets once we solve the noisy alert problem.

About the Author:

Dan Turchin was BigPanda's VP of Product. Follow Dan on Twitter: @dturchin. Connect with Dan on LinkedIn.