Most companies are heavily instrumented to detect symptoms in their production environments and alert on them (e.g. network latency, database transaction failures, high CPU load). Yet the same companies often have precious little capability for detecting, logging & correlating the changes that occur within their production stack.
The pace of change in modern IT environments has accelerated dramatically in recent years. Continuous delivery lets developers ship orders of magnitude faster than they could several years ago. Code used to be deployed once a week, month, or quarter. Today, deploys happen up to 50 times a day.
IT is moving faster as well. Tools with programmatic interfaces, such as Chef, Puppet, and the AWS API, have given IT an order of magnitude more control & flexibility over its infrastructure. Provisioning hosts, installing software & updating security policies on hundreds of machines has become as easy as running a shell command.
As the pace of change has increased, many companies are discovering that they have painful blind spots when it comes to tracking what’s changed, when it changed, who changed it, and what outages may have resulted. Ops managers are waking up to the fact that they need a centralized solution for change tracking in order to do their jobs.
For our customers, the ability to visualize recent code & infrastructure changes in the context of system health alerts (from Nagios, New Relic, Pingdom, etc.) has given them meaningful insight into the root cause of production outages. It’s helped them troubleshoot faster when the stuff hits the fan.
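To make the idea concrete, here is a minimal sketch of lining up recent changes against an alert by time. It is an illustration only, not a description of how BigPanda works: the `Change` and `Alert` records, the `changes_before` helper, the 30-minute window, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Change:
    """A hypothetical record of a code or infrastructure change."""
    when: datetime
    who: str
    what: str


@dataclass(frozen=True)
class Alert:
    """A hypothetical record of a system health alert."""
    when: datetime
    source: str
    message: str


def changes_before(alert, changes, window=timedelta(minutes=30)):
    """Return changes made within `window` before the alert, most recent first."""
    candidates = [
        c for c in changes
        if timedelta(0) <= alert.when - c.when <= window
    ]
    return sorted(candidates, key=lambda c: c.when, reverse=True)


# Made-up sample data for illustration.
changes = [
    Change(datetime(2024, 1, 1, 9, 0), "alice", "deploy web build 101"),
    Change(datetime(2024, 1, 1, 11, 50), "bob", "update firewall rules"),
    Change(datetime(2024, 1, 1, 11, 58), "carol", "deploy api build 202"),
]
alert = Alert(datetime(2024, 1, 1, 12, 0), "monitoring", "high CPU load")

suspects = changes_before(alert, changes)
for c in suspects:
    print(f"{c.when:%H:%M} {c.who}: {c.what}")
```

A fixed time window is the simplest possible heuristic. It only narrows down which changes are worth a look first; it says nothing about whether a given change actually caused the alert.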
As the pace of change continues to accelerate in the years to come, we expect this problem to become even more painful than it is today. If you’re not already set up to quickly detect changes in your production environment, this might be a good time to start by getting a free account at BigPanda.