As CIOs Address App Sprawl, Observability Can't Be an Afterthought
May 09, 2024

Bill Lobig
IBM

App sprawl has concerned technologists for some time, but it has never presented as great a challenge as it does now. As organizations move to build generative AI into their applications, the problem is only going to become more complex. In fact, a recent Canva report found that 72% of CIOs see application sprawl as a challenge, and with 71% of CIOs expecting to adopt 30-60 new apps this year, this complexity is poised to keep growing.

Potential solutions include consolidating applications, optimizing workflows, and automating IT processes to reduce strain on technologists so they can tackle app sprawl head-on. While these are all valid and necessary approaches, observability is essential for understanding the vast amounts of complex data within AI-infused applications, and it must be the centerpiece of an app- and data-centric strategy to truly manage app sprawl.

Cracking the Code for AI App Sprawl Challenges

In a year of elevated global IT spend, ensuring investments aren't wasted is a necessity for overwhelmed technology leaders, who must not only decide which technologies to implement but also make sense of application performance amid growing tides of vast and complex data.

When AI enters the mix, complete visibility becomes even more important. Observability tools and practices can help technologists address AI app sprawl by providing visibility into the performance, behavior, and dependencies of AI applications. Unfortunately, many teams are still working with incomplete visibility, meaning they simply don't know what they don't know.

Simply put, traditional application performance monitoring (APM) tools can provide visibility to a certain degree, but they weren't necessarily built to account for the influx of generative AI applications that modern enterprises are dealing with.

End users of generative AI-infused applications demand continuous availability and a frictionless experience. When teams lack real-time visibility, those users will feel every outage or delay that compromises their experience, particularly as applications span numerous platforms. On top of that, many organizations are still working out how to implement complex generative AI models and how to understand their behavior. These potential blind spots could create significant performance, compliance, and security issues.

Observe the Full Stack So You Can Add to It

Organizations can better understand the internal state of their AI applications by analyzing their external outputs with observability. When this is connected to business outcomes and effectively addressed, technology teams can lay the groundwork for a well-functioning application monitoring and management process. So, when generative AI is introduced to the environment, the foundation is already in place to effectively see and optimize application processes.
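To make the idea of inferring internal state from external outputs concrete, here is a minimal Python sketch. It wraps any call (for example, a request to a generative AI model) and records what can be seen from the outside: how long it took, whether it succeeded, and how much output it produced. The names `Recorder`, `CallRecord`, and `observe` are hypothetical and exist only for this illustration; a production setup would more likely rely on an established telemetry framework.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CallRecord:
    """One externally observable fact about a single call."""
    name: str
    duration_s: float
    ok: bool
    output_chars: int


@dataclass
class Recorder:
    """Hypothetical in-memory collector for call records."""
    records: list = field(default_factory=list)

    def observe(self, name, fn, *args, **kwargs):
        """Run fn, record duration and outcome, then return or re-raise."""
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
        except Exception:
            # Failed calls are recorded too, so errors stay visible.
            elapsed = time.perf_counter() - start
            self.records.append(CallRecord(name, elapsed, False, 0))
            raise
        elapsed = time.perf_counter() - start
        self.records.append(CallRecord(name, elapsed, True, len(str(result))))
        return result

    def error_rate(self):
        """Fraction of recorded calls that raised an exception."""
        if not self.records:
            return 0.0
        return sum(not r.ok for r in self.records) / len(self.records)
```

Wrapping each model call as `recorder.observe("summarize", model_fn, prompt)` would build up a record of latency, output size, and failures that can later be tied back to business outcomes.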

As generative AI applications join the enterprise equation, observability tools are a must-have for delivering higher-quality software at a faster pace. That delivery is best enabled through:

Finding and fixing the "unknown unknowns": You can't fix what you can't see. Unfortunately, many monitoring practices and tools can only address flaws that are already known. Observability uncovers conditions that would be impossible to find manually or with traditional platforms. It then correlates those conditions with specific performance flaws and provides the context needed to discover root causes, resulting in quick and easy remediation.

Detecting and remediating issues early on: With observability, monitoring is integrated into the initial stages of software development. It's then easy to pinpoint and rectify new code issues before they affect service level agreements (SLAs) and the customer experience.

Self-healing application infrastructure and automated resolution: Observability can be coupled with automation capabilities to anticipate issues from system outputs and resolve them autonomously without requiring manual intervention.

Scaling and load balancing: Teams need to observe and control the current load on their systems and also forecast future demand. The data can be used to optimize applications in real time without end users feeling any impact.

Cost management: By optimizing both workloads and computational resources, SREs can and should use observability to find ways to save on VM, GPU, cloud, and inferencing costs.
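Two of the practices above, catching issues before they affect SLAs and forecasting future load, can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions: the SLA is expressed as a 95th-percentile latency threshold in milliseconds, and demand is forecast with a plain least-squares trend line. The function names and thresholds are hypothetical, and real observability platforms use far richer models.

```python
from statistics import mean, quantiles


def p95(latencies_ms):
    """95th-percentile latency; needs at least two samples."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(latencies_ms, n=100, method="inclusive")[94]


def breaches_sla(latencies_ms, sla_ms):
    """True if the observed p95 latency exceeds the assumed SLA threshold."""
    return p95(latencies_ms) > sla_ms


def forecast_next(loads):
    """Extrapolate a least-squares linear trend one step past the data."""
    n = len(loads)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(loads)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, loads)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    # The next observation sits at x = n.
    return y_bar + slope * (n - x_bar)
```

A check like `breaches_sla(recent_latencies, sla_ms=300)` could run in a build pipeline or alerting rule, while `forecast_next(hourly_requests)` gives a rough signal for scaling ahead of demand.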

Observability is crucial for organizations to address AI app sprawl and overcome the myriad challenges that come with it, especially as generative AI becomes a mainstay across enterprises. By following observability best practices and deploying the right automated tools, organizations can proactively identify and resolve issues and keep all AI applications available and friction-free.

Bill Lobig is VP of Apptio / IBM IT Automation