APM The Truth is Out There - Application Monitoring in a World of Intentional Chaos
November 02, 2017

Pete Waterhouse
CA Technologies

If you're a sci-fi nut, you've probably checked out Blade Runner 2049. The seminal 1982 original is still my favorite, although the new movie ticks many boxes. Sure, it's a dark depiction of a dystopian future, but I love all the cyberpunk tech and references to artificial intelligence.

There's one AI nugget that I really enjoyed. It's the part where K (the main protagonist, and a Blade Runner) has to undergo a rapid-response baseline test after any traumatic event – like "retiring" a rogue replicant. In his first test K passes with flying colors, but later he fails – he's way off his baseline. In each test his responses seemed consistent, but the machine conducting the test detects behavioral anomalies, which for a replicant are, well – life-threatening.

This all got me thinking that managing app performance across today's complex tech is kind of similar. Detecting "normal" application performance is becoming much harder. In the past we had time to find and fix performance problems across apps that behaved in predictable ways; modern distributed systems are much less forgiving. Containers can be replaced in seconds and apps can consist of thousands of independent microservices. What used to be considered bad, such as high CPU utilization, might now be expected, even desired. Confusing, chaotic, counterintuitive.
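To make the baseline idea concrete, here is a minimal sketch of a K-style baseline test for a service. It assumes a simple rule that is not taken from any particular product: a reading is flagged when it sits more than a few standard deviations away from that service's own recent history. The function name, the threshold and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(baseline, value, threshold=3.0):
    """Return True when value is more than `threshold` standard
    deviations away from the mean of the baseline readings."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical CPU utilization (%) for a service that normally runs hot.
cpu_baseline = [88, 90, 92, 89, 91, 90, 93, 87]

runs_hot_as_usual = is_anomalous(cpu_baseline, 91)   # within the baseline
suddenly_idle = is_anomalous(cpu_baseline, 40)       # far off the baseline
```

With these made-up numbers, 91% CPU is "normal" for this service, while a drop to 40% is the reading that is way off baseline. A fixed "high CPU is bad" threshold would get both calls wrong.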

A Map Without Context is Now Next to Useless

To gain visibility into application topology we've always relied on our trusted maps. This was fine when representing a fairly static topology with a known number of tiers. But in a world of ephemerality and immutable infrastructure this comes up short. Even if we can capture all the containers and dependencies, how do we represent them with limited console real estate? Where do we drill down to gain more clarity? The simple answer is we don't know – the topology is too dense.

But in "X-Files" speak, "the truth is out there". We just have to take what's chaotic and convert it into something that makes sense and has purpose. This is both a challenge and an opportunity. It's challenging because we've tended to view mapping from a narrow perspective. Since Ops has controlled the apps and the infrastructure, Ops has built the maps. That's great for production support, but with increased application abstraction and empowered development these views have limited scope. They also lack an essential ingredient – context. Even if we could distill these new dynamic environments into palatable views, would they be useful for an app component developer who just wants to understand how her new code impacts performance in a software build? Maybe not.

Chaos is Intentional – Don't Fight It, Work With It

But as it turns out, all this complexity provides an opportunity to do some wonderful things, but only if we work counterintuitively and turn to what we fear most: the chaotic application componentry itself. This means leveraging the descriptors or attributes available to fuel a new data model that supports multiple constituents and use cases.

And there's so much we can unlock. Docker, for example, provides a raft of attributes, which my colleague Amy Feldman discussed in a great article outlining how these can be leveraged with a multi-dimensional data model to deliver different views or perspectives to many teams. And because a component can have many attributes (like location, AWS availability zone, service owner, code version, build number), we can present the data in context of roles and functions, pivoting when needed. So as an app support manager I can visualize performance from a service level perspective, while over in the next cube my app development colleague can be using the same solution to visualize application performance in context of her code. But we can take this much further.
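As a rough illustration of what pivoting on attributes could look like, here is a small sketch. It is not the data model from Amy's article or from any product; the attribute names, the sample values and the pivot helper are assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: one latency reading per component, tagged with
# the kind of attributes a container or service might expose.
SAMPLES = [
    {"service_owner": "payments", "build": "101", "zone": "us-east-1a", "latency_ms": 120},
    {"service_owner": "payments", "build": "102", "zone": "us-east-1b", "latency_ms": 180},
    {"service_owner": "search", "build": "101", "zone": "us-east-1a", "latency_ms": 60},
    {"service_owner": "search", "build": "102", "zone": "us-east-1b", "latency_ms": 100},
]


def pivot(samples, attribute, metric="latency_ms"):
    """Average one metric, grouped by the values of a single attribute."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[attribute]].append(sample[metric])
    return {value: mean(readings) for value, readings in groups.items()}


# The support manager's view: performance per service owner.
by_owner = pivot(SAMPLES, "service_owner")

# The developer's view: the same data, in context of her build.
by_build = pivot(SAMPLES, "build")
```

The point is that both views come from the same tagged records. Nobody has to redraw a map; each role just picks the attribute that matters to them.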

Elevate Your Monitoring Practice to a Whole New Level

Together with all the modern tech, leading organizations are also employing modern agile organizational practices that better align previously fractured and siloed teams around business goals. Rather than have separate dev and ops teams organized by tech function, they fuse these into vertical units that are responsible and collectively accountable for specific outcomes. Each of these could contain front-end developers, DBAs, sysadmins and so on. Going further, many organizations also form horizontal cross-team groups where specialists (like front-end developers) get together to share ideas. Spotify, which is probably the best exponent of this model, goes even further, organizing communities (or guilds) that have common interests irrespective of an individual's own specialization.

Now imagine that as we re-organize, we can leverage the data model described above to surface application performance insights that immediately reflect the outcome-driven structure. So as specialists get together they can share relevant and valuable information. This could be a mobile app dev group sharing which coding practice they've used to drive better performance – what worked, what didn't, and why. Or perhaps it's a team of cloud architects comparing performance trends and anomalous conditions across different microservice deployment patterns – determining where the business is getting most value from its investments. Call this Agile, DevOps – whatever. To me it's a system enabling people to do awesome things for their organization – and helping them grow as they do it.

With all the complexity and chaos, it's easy to get spooked. But before hitting the panic button, always remember "the truth is out there" – somewhere. You just need a modern application monitoring solution to find it – one backed by an open and flexible data model.

Pete Waterhouse is Senior Strategist at CA Technologies