Any post about problems like downtime, outages, missed SLAs, etc.
Ask yourself these questions to find the right fit in an alert correlation platform.
To maintain operational visibility in modern IT environments, companies are abandoning monolithic monitoring solutions from legacy vendors in favor of a modern set of “best of breed” monitoring tools. Today’s average IT monitoring stack consists of about 6-8 tools, including at least one from each of the following categories: systems monitoring, end user monitoring, application performance monitoring (APM), error detection, log analytics, chat, and ticketing. When service disruptions occur, operations engineers face a flood of alerts across different layers of the IT stack, with no fast way to figure out what’s really going on. Customers are left stranded, while IT professionals struggle to detect, triage and remediate urgent issues. Downtime abounds which negatively impacts revenue, performance, and brand loyalty.
We all need to move fast in order to stay competitive. But the faster things move, the faster things break.
While many companies have made great strides towards automating application release and infrastructure management, automation for service assurance has been sorely lacking. That’s left Dev and Ops with a problem: how to effectively service alerts that have grown by orders of magnitude.
Sam’s a father of two boys living in the bucolic LA suburb of West Covina. He’s a family first guy who paints model military cargo planes for fun, makes award-winning paella, hates his commute, and loathes his phone between the hours of midnight and 4:00 AM.
Sam was a kid when he joined News Corp as a help desk analyst in 2000. More than 15 years later and he’s now Sr. Director of IT managing a growing team of 30 NOC engineers, sys admins, and DBAs. Over the years, he has received more promotions than Trump on his own Twitter feed by delivering results and never wavering from two core beliefs that influence everything he does: