BigPanda Event Enrichment Engine: The secret ingredient for AIOps

James Beard, the pioneer of television cooking shows, once asked, “Where would we be without salt?” Salt is often overlooked, but it has a significant impact on food and flavor. It has its own distinct taste, yet it also balances and complements the flavors of other ingredients. Salt enhances sweetness and reduces bitterness. It has been scientifically shown to boost flavor compounds that are often too subtle to notice, “bringing out the taste in food,” and it makes meat juicier by unraveling its strands of protein and allowing them to absorb liquids.

I bet you didn’t know this, and maybe don’t really care now that you do… but you should. Because, as you now understand, salt has the biggest impact on what sustains you and keeps you going – your food.

Which brings us to event enrichment. 

Event enrichment has a significant impact on what helps modern IT Ops organizations like yours succeed – your AIOps solution. Event enrichment is often overlooked and misunderstood. However, it is a key part of successful IT operations, especially in AIOps. Here’s why.

 

Structure as the first step

It’s been said dozens of times before – so one more time won’t hurt: Today’s IT Ops teams are drowning in a sea of alerts emanating from their many monitoring and observability tools. But it’s not just their number that’s an issue – it’s also that they lack context. Bombarded with tens or hundreds of thousands of context-less alerts, many IT Ops teams have reached the point where they’ve effectively “shut down” their monitoring: They’ve stopped using their monitoring tools proactively and instead go back to them only after the fact, as diagnostic tools once something has broken and they’ve been notified.

For many of them, the way to escape this tough situation is through enrichment. This can help them return to proactive monitoring.  

By adding context to events, teams can group them based on their connections. This includes the app they relate to, the server they run on, and the business or service affected. This helps manage the large number of alerts in IT operations. Now, operators can focus on groups of related alerts. 
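To make the grouping idea concrete, here is a minimal Python sketch. The alert fields and the `application` and `service` tags are hypothetical examples of enrichment context, not fields from any particular monitoring tool or from BigPanda.

```python
from collections import defaultdict

# Hypothetical enriched alerts. The "application" and "service" tags stand in
# for context added by enrichment; they are illustrative, not a real schema.
alerts = [
    {"id": 1, "host": "web-01", "check": "cpu", "application": "checkout", "service": "payments"},
    {"id": 2, "host": "web-02", "check": "latency", "application": "checkout", "service": "payments"},
    {"id": 3, "host": "db-07", "check": "disk", "application": "catalog", "service": "search"},
]

def group_by_tag(alerts, tag):
    """Group alert IDs by the value of a single enrichment tag."""
    groups = defaultdict(list)
    for alert in alerts:
        # Alerts that lack the tag land in an "unknown" bucket.
        groups[alert.get(tag, "unknown")].append(alert["id"])
    return dict(groups)

groups = group_by_tag(alerts, "application")
print(groups)
```

Instead of three unrelated alerts, an operator now sees two groups, one per application. Without the tag, every alert would fall into the same "unknown" bucket.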

It is also a crucial first step towards achieving success with AIOps.

 

Correlation and root cause analysis

Adding structure helps us understand the basic context of events. However, structure alone does not reduce the volume of events or identify the root cause. To help teams detect and resolve problems more effectively, AIOps tools must sort through a large number of alerts and turn thousands of them into a few clear, useful incidents.

The ability of AI/ML to detect correlation patterns and “compress” alerts relies heavily on the quality of the data fed to it. Context-less data leads to limited, low-quality incidents as a result of weak correlation. Enriching alerts collected from all aspects of your IT stack or technology domains – aka cross-domain enrichment – supplies the AI/ML algorithms with the information they need to correlate alerts with a high degree of efficiency, effectively reducing IT noise to humanly acceptable levels.
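A rough way to see why context matters for correlation is the toy Python sketch below. It is a simple rule that merges alerts sharing a tag value within a time window, assumed here purely for illustration; it is not BigPanda’s OBML or any real ML algorithm, and the `ts` and `service` fields are hypothetical.

```python
def correlate(alerts, tag, window_seconds=300):
    """Merge alerts that share a tag value into one incident, as long as each
    alert arrives within window_seconds of the previous alert in that incident.
    Alerts without the tag cannot be correlated and become separate incidents."""
    incidents = []
    latest_by_key = {}  # tag value -> most recent incident for that value
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        key = alert.get(tag)
        current = latest_by_key.get(key) if key is not None else None
        if current is not None and alert["ts"] - current["last_ts"] <= window_seconds:
            current["alert_ids"].append(alert["id"])
            current["last_ts"] = alert["ts"]
        else:
            current = {"key": key, "alert_ids": [alert["id"]], "last_ts": alert["ts"]}
            incidents.append(current)
            if key is not None:
                latest_by_key[key] = current
    return incidents

# Hypothetical alerts; "ts" is seconds since some reference point.
enriched = [
    {"id": 1, "ts": 0, "service": "payments"},
    {"id": 2, "ts": 60, "service": "payments"},
    {"id": 3, "ts": 90, "service": "search"},
    {"id": 4, "ts": 1000, "service": "payments"},
]
bare = [{"id": a["id"], "ts": a["ts"]} for a in enriched]

print(len(correlate(enriched, "service")), "incidents with enrichment")
print(len(correlate(bare, "service")), "incidents without enrichment")
```

With the `service` tag present, the four alerts collapse into three incidents. With the tag stripped, nothing can be correlated and every alert stays its own incident.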

Successful root cause analysis depends on understanding the different connections between infrastructure and application components in today’s environments. Some of this information is hidden in incoming alert streams. Additional information is sourced from external data sources. These sources include asset and inventory management systems, orchestration tools, APM services, flow maps, CMDBs, and other related tools. Cross-domain enrichment adds this much-needed context. By doing this, it reveals the root causes related to infrastructure. It also helps connect incidents to the changes that are causing problems and outages.
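As a minimal sketch of what adding context from an external source can look like, the Python below joins an alert against a lookup table keyed by host name. The `INVENTORY` table, its fields and the `enrich` helper are all hypothetical; a real deployment would pull this context from whichever inventory, orchestration or CMDB system it uses.

```python
# Hypothetical inventory export keyed by host name. In practice this context
# might come from a CMDB, an asset system or an orchestration tool.
INVENTORY = {
    "db-07": {"application": "catalog", "owner_team": "data-platform", "region": "us-east"},
    "web-01": {"application": "checkout", "owner_team": "payments", "region": "us-west"},
}

def enrich(alert, inventory):
    """Return a copy of the alert with inventory context merged in.

    Fields already present on the alert take precedence over inventory values.
    """
    context = inventory.get(alert.get("host"), {})
    return {**context, **alert}

raw = {"host": "db-07", "check": "disk_full", "status": "critical"}
enriched = enrich(raw, INVENTORY)
print(enriched["application"], enriched["owner_team"])
```

The raw alert only knew its host and check. After the lookup it also carries the application and owning team, which is the kind of context correlation and root cause analysis depend on.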

 

Automation

Now that you’ve identified and surfaced the root cause of an incident, you need to respond to it, remediate it, and eventually resolve it. 

Enrichment drives value here as well: it ensures the creation of incidents whose payload data consists of all the right topological, operational and other contextual data needed to drive downstream automations and workflows – both inside the AIOps tools, and inside the collaboration tools integrated with them. For example, the enrichment data within an incident can be used to help open ServiceNow tickets, prioritize them, assign them to specific teams, notify teams via PagerDuty, trigger runbooks in Rundeck and more.
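The Python sketch below illustrates how enrichment tags on an incident could drive ticket priority and assignment. The incident shape, tag names and ticket fields are assumptions made for this example; they do not reflect the actual ServiceNow, PagerDuty or BigPanda APIs.

```python
# Hypothetical mapping from a "business_impact" enrichment tag to a ticket priority.
PRIORITY_BY_IMPACT = {"high": 1, "medium": 2, "low": 3}

def build_ticket(incident):
    """Build a generic ticket payload from an incident's enrichment tags.

    Missing tags fall back to the lowest priority and a default triage group.
    """
    tags = incident.get("tags", {})
    return {
        "short_description": f"[{tags.get('application', 'unknown')}] {incident['summary']}",
        "priority": PRIORITY_BY_IMPACT.get(tags.get("business_impact"), 3),
        "assignment_group": tags.get("owner_team", "it-ops-triage"),
    }

incident = {
    "summary": "Disk full on db-07",
    "tags": {"application": "catalog", "business_impact": "high", "owner_team": "data-platform"},
}
ticket = build_ticket(incident)
print(ticket)
```

An incident with no enrichment still produces a ticket, but it lands in a generic triage queue at the lowest priority, which is exactly the manual routing work that enrichment is meant to remove.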

 

Cross-domain enrichment – a must for cloud and hybrid environments

Now that we understand why enrichment is important for successful AIOps, let’s explore why cross-domain enrichment is crucial. This is especially true for cloud and hybrid environments.

Many teams working in today’s fast-paced environments find that their CMDB is often not very helpful. If a host name is a 16-character string of letters and numbers that changes every few hours or minutes, trying to use a CMDB to connect a service to a CI is futile. CMDBs are helpful in static or slow-changing on-prem environments, but in modern hybrid and cloud environments they are almost always out of date. That’s why some teams don’t even bother to build or maintain one these days.

That’s where cross-domain enrichment can be extremely helpful. 

With cross-domain enrichment, enrichment data from any and all observability, monitoring, topology and other operational tools is brought into your AIOps tool. As previously mentioned, this includes enrichment data from inventory and asset management systems, cloud orchestration systems, APM/network maps and more (including, potentially, any CMDBs still in use). This data is used to enrich the alerts in real time, so you can understand complex dependencies directly from the alerts. For example, you can see which microservices-based applications link to specific systems. You can also find the container-based application ID associated with an alert, as well as other details.

 

The BigPanda Event Enrichment Engine

Now that we’ve understood the importance of enrichment, it’s time to discuss BigPanda’s Event Enrichment Engine.

BigPanda employs a best-in-class, cross-domain Event Enrichment Engine, which is coupled with the BigPanda platform’s Open Integration Hub and Open Box Machine Learning (OBML), to provide a market-leading AIOps-driven Event Correlation and Automation platform.

BigPanda’s Open Integration Hub ingests the three datasets critical to enrichment – and by extension, to successful AIOps solutions:

  • “Raw” streaming alert data from all enterprise observability and monitoring tools, 
  • Operational and topological context from all relevant sources, and 
  • Change data from all change feeds in the environment. 

This ensures that BigPanda’s Event Enrichment Engine has access to all the needed information for cross-domain enrichment.

BigPanda’s Event Enrichment Engine then enriches the ingested alerts by manipulating contextual data that’s buried in the incoming alert stream, and by adding new topological and operational data collected from the other sources and tools. User-defined enrichment logic can perform millions of enrichment actions each day, which helps you scale your operations as needed.
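To illustrate what surfacing context that is buried in an alert can mean, here is a small Python sketch that extracts tags from a host name with a regular expression. The naming convention (`<environment>-<application>-<role>-<number>`) and the `extract_tags` helper are hypothetical and are not BigPanda’s enrichment syntax.

```python
import re

# Assumed host naming convention for this example: <environment>-<application>-<role>-<number>
HOST_PATTERN = re.compile(
    r"^(?P<environment>[a-z]+)-(?P<application>[a-z]+)-(?P<role>[a-z]+)-\d+$"
)

def extract_tags(alert):
    """Return a copy of the alert with tags parsed out of its host name.

    Alerts whose host does not follow the convention are returned unchanged.
    """
    match = HOST_PATTERN.match(alert.get("host", ""))
    if match is None:
        return dict(alert)
    return {**alert, **match.groupdict()}

alert = {"host": "prod-checkout-web-03", "check": "latency"}
print(extract_tags(alert))
```

The information was in the alert all along, just not in a form that correlation or routing logic could use. Extraction turns it into explicit tags.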

BigPanda’s Event Enrichment Engine has a user-friendly interface. It helps you create and manage enrichment logic easily. You can also preview the results before making them live.

It also supports advanced enrichment use cases. This includes user-defined ordering of enrichment logic and powerful configuration APIs, among other features.
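Why does the order of enrichment logic matter? The Python sketch below shows one possible reason, using two hypothetical rules where the second depends on a tag produced by the first. The rule functions and the `OWNERS` table are illustrative assumptions, not the engine’s actual rule format.

```python
# Hypothetical mapping from application to owning team.
OWNERS = {"checkout": "payments-team"}

def extract_application(alert):
    """Rule 1: derive an "application" tag from the first part of the host name."""
    return {**alert, "application": alert["host"].split("-")[0]}

def map_owner(alert):
    """Rule 2: look up the owning team, which requires the "application" tag."""
    owner = OWNERS.get(alert.get("application"))
    return {**alert, "owner_team": owner} if owner else dict(alert)

def run_rules(alert, rules):
    """Apply each enrichment rule in order, feeding the result to the next rule."""
    for rule in rules:
        alert = rule(alert)
    return alert

alert = {"host": "checkout-web-03"}
print(run_rules(alert, [extract_application, map_owner]))
print(run_rules(alert, [map_owner, extract_application]))
```

In the first ordering the alert ends up with both tags. In the second, the owner lookup runs before the application tag exists, so the alert is left without an `owner_team`.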

Post-enrichment, enriched alerts are passed on to BigPanda’s OBML which leverages the added context to correlate the alerts into high-quality incidents, reducing IT noise by over 95%. BigPanda’s OBML also uses the enriched data to surface probable root cause – including root cause changes. Finally, the enrichment data within incidents is used to trigger workflow automations, helping to manage and remediate the incident.

As you can see, enrichment in general, and BigPanda’s Event Enrichment Engine in particular, has a significant impact on the entire incident management lifecycle. It shortens each stage and improves the quality of the data in it, so that human IT Ops teams are able to detect and resolve incidents and outages faster than ever before. In other words, AIOps can’t succeed without enrichment. It truly is the secret ingredient of successful AIOps.

To learn more about enrichment and BigPanda’s Event Enrichment Engine – please visit our Event Enrichment Engine product page.