Bringing Alert Management into the Present with Advanced Analytics
March 25, 2015

Kevin Conklin
Prelert

Share this

We have smart cars on the horizon that will navigate themselves. Mobile apps that make communication, navigation and entertainment an integral part of our daily lives. Your insurance pricing may soon be affected by whether or not you wear a personal health monitoring device. Everywhere you turn, the very latest IT technologies are being leveraged to provide advanced services that were unimaginable even ten years ago. So why is it that the IT environments that provide these services are managed using an analytics technology designed for the 1970s?

The IT landscape has evolved significantly over the past few decades. IT management simply has not kept pace. IT operations teams are anxious that too many problems are reported first by end users. Support teams worry that too many people spend too much time troubleshooting. Over 70 percent of troubleshooting time is actually wasted following false hunches because alerts provide no value to the diagnostic process. Enterprises that are still reliant on yesterday’s management strategies will find it increasingly difficult to solve today’s operations and performance management challenges.

This is not just an issue of falling behind a technology curve. There is a real business impact in increasing incident rates, failing to detect potentially disastrous outages and human resources wasting valuable time. An increasing number of IT shops are anxiously searching for alternatives.

This is where advanced machine learning analytics can help.

Too often operations teams can become engulfed by alerts – getting tens of thousands a day and not knowing which to deal with and when, making it quite possible that something important was ignored while time was wasted on something trivial. Through a powerful combination of machine learning and anomaly detection, advanced analytics can reduce the alarms to a prioritized set that have the largest impact on the environment. By learning which alerts are “normal”, these systems define an operable status quo. In essence, machine learning filters out the “background noise” of alerts that, based on their persistence, have no effect on normal operations. From there, statistical algorithms identify and rank “abnormal” outliers on a scale measuring severity (value of a spike or drop occurrence), rarity (number of previous instances) or impact (quantity of related anomalies). The result is a reduction from hundreds of thousands of noisy alerts a week to a few dozen notifications of real problems.

Despite producing huge volumes of alerts, rules and thresholds implementations often miss problems or report them long after the customer has experienced the impact. The fear of generating even more alerts forces monitoring teams to select fewer KPIs, thus decreasing the likelihood of detection. Problems that slowly approach thresholds go unnoticed until user experience is already impacted. Adopting this advanced analytics approach empowers enterprises to not only identify problems that rules and thresholds miss or simply execute against too late, but also provide their troubleshooting teams with pre-correlated causal data.

By replacing legacy rules and thresholds with machine learning anomaly detection, IT teams can monitor larger sets of performance data in real-time. Monitoring more KPIs enable a higher percentage of issues to be detected before the users report them. Through real-time cross correlation, related anomalies are detected and alerts become more actionable. Early adopters report that they are able to reduce troubleshooting time by 75 percent, with commensurate reductions in the number of people involved by as much as 85 percent.

Advanced machine learning systems will fundamentally change the way data is converted into information over the next few years. If your business is leveraging information to provide competitive services, you can’t afford to be the laggard.

Kevin Conklin is VP of Marketing at Prelert.

Share this

The Latest

May 22, 2017

APMdigest asked experts across the industry for their opinions on the next steps for ITOA. Part 5 offers some interesting final thoughts ...

May 19, 2017

APMdigest asked experts across the industry for their opinions on the next steps for ITOA. Part 4 covers automation and the dynamic IT environment ...

May 18, 2017

APMdigest asked experts across the industry for their opinions on the next steps for ITOA. Part 3 covers monitoring and user experience ...

May 17, 2017

APMdigest asked experts across the industry for their opinions on the next steps for ITOA. Part 2 covers visibility and data ...

May 16, 2017

Managing application performance today requires analytics. IT Operations Analytics (ITOA) is often used to augment or built into Application Performance Management solutions to process the massive amounts of metrics coming out of today's IT environment. But today ITOA stands at a crossroads as revolutionary technologies and capabilities are emerging to push it into new realms. So where is ITOA going next? With this question in mind, APMdigest asked experts across the industry — including analysts, consultants and vendors — for their opinions on the next steps for ITOA ...

May 15, 2017

Digital transformation initiatives are more successful when they have buy-in from across the business, according to a new report titled Digital Transformation Trailblazing: A Data-Driven Approach ...

May 11, 2017

The growing market for analytics in IT is one of the more exciting areas to watch in the technology industry. Exciting because of the variety and types of vendor innovation in this area. And exciting as well because our research indicates the adoption of advanced IT analytics supports data sharing and joint decision making in a way that's catalytic for both IT and digital transformation ...

May 10, 2017

Colin Fletcher, Research Director at Gartner, talks about Algorithmic IT Operations (AIOps) and the challenges and recommendations for AIOps adoption ...

May 09, 2017

In APMdigest's exclusive interview, Colin Fletcher, Research Director at Gartner, talks about Algorithmic IT Operations (AIOps) and how it will impact ITOA and APM ...

May 05, 2017

Microsoft is expected to essentially wind down Windows 7 support by 2020 so inevitably Windows 10 will be on the IT task list. It would be beneficial, now, to examine some of the issues relating to migrating to Windows 10 OS and how these pain points can be alleviated and addressed. Here are 7 practices that are key to facilitating migration ...