Root Cause Changes: Real Examples of Modern Root Cause Analysis
Root Cause Analysis (RCA) is an all-encompassing process.
It is often complicated and needs many people with different skills. They all work together to figure out what happened, when it happened, why it happened, how it happened, and who is to blame.
A more targeted approach, called Root Cause Changes (RCC), can resolve many problems before a full RCA process even begins. According to Gartner, over 80% of today’s problems and outages are caused by changes to infrastructure and software. If you can find out what changed, you can resolve the incident quickly.
The problem is that finding these changes is not easy. In today’s fast-paced IT environment, the number and frequency of changes have skyrocketed. Some of our customers report over 4000 changes every week. These changes encompass nearly all aspects of our operations, including deployments, software upgrades, configuration changes, scaling, and more. They appear in many different tools, which makes visibility a challenge. To make matters worse, many changes occur automatically or by mistake, often without our knowledge.
That is why BigPanda’s new Root Cause Changes feature is so important. BigPanda is the only AIOps solution that automatically analyzes data from all change tools, including CI/CD tools, and connects that data to all monitoring alerts. This helps teams quickly find the root cause change behind an incident and resolve it immediately.
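To make the idea of connecting change data to alerts concrete, here is a minimal sketch of one simple way to do it: a time-window match on the affected host or service. This is not BigPanda’s implementation. The `Change` record, the `candidate_changes` function, the two-hour window, and the sample data (including the date and the host names) are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Change:
    source: str          # hypothetical change tool, e.g. patching or CI/CD
    target: str          # host or service the change touched
    description: str
    applied_at: datetime


def candidate_changes(changes, alert_target, alert_time, window=timedelta(hours=2)):
    """Return changes to the alerting target made within `window` before the alert.

    Results are ordered most recent first, on the assumption that the latest
    change before an alert is the likeliest suspect.
    """
    hits = [
        c for c in changes
        if c.target == alert_target
        and timedelta(0) <= alert_time - c.applied_at <= window
    ]
    return sorted(hits, key=lambda c: c.applied_at, reverse=True)


# Illustrative data only: the date and names are invented.
day = datetime(2024, 1, 1)
changes = [
    Change("patching", "exchange-01", "security hotfix", day.replace(hour=17, minute=24)),
    Change("ci-cd", "web-01", "application deploy", day.replace(hour=18, minute=30)),
    Change("patching", "exchange-01", "older config change", day.replace(hour=9, minute=0)),
]

suspects = candidate_changes(changes, "exchange-01", day.replace(hour=18, minute=54))
for change in suspects:
    print(change.applied_at.time(), change.source, change.description)
```

A fixed window on an exact target match is the simplest possible rule. A real system would likely also need to account for dependencies between services, since a change on one component can trigger alerts on another.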
Here are some examples of what our beta customers experienced with Root Cause Changes. Host and service names have been changed for privacy.
Simple Security Patch Leads to Complicated Performance Issues
The first example is a common issue: installing a security patch causes performance problems for users.
The incident begins at 6:54 pm, when Datadog sends alerts about “a high count of error logs” in MS Exchange.
BigPanda surfaces the root cause change: the installation of a required security hotfix for a recent Wintel vulnerability, applied at 5:24 pm.
The incident was resolved by restarting the application pools.
Firewall Changes Cause E-Commerce Interruptions
At 6:27 pm, the NOC gets an alert from AppDynamics about slow performance for users of an e-commerce kiosk solution.
BigPanda surfaces the root cause change: at 5:21 pm on the same day, two firewall zones were migrated. During the change, some queues were stopped to route the traffic, but they were not restarted when the change was completed.
Shipping Module Update Crashes Mobile App
At 4:05 pm, the NOC receives alerts associated with crashes on the mobile app and multiple function executions.
BigPanda surfaces the root cause change: at 3:52 pm, AWS Lambda functions were removed from a mobile app gateway to free up IP space.
Rolling back this change resolved the issue.
With BigPanda’s Root Cause Changes feature, you can easily see all changes related to an incident. Just click on the “Related Changes” tab to find the Root Cause Change for that incident.
It’s that simple.
Want to learn more about Root Cause Changes and what BigPanda can do for your IT environment and your enterprise? Schedule a demo now to get started.









