In APMdigest's 2025 Predictions Series, industry experts — from analysts and consultants to the top vendors — offer predictions on how Observability and related technologies will evolve and impact business in 2025. Part 2 covers AI's impact on Observability, including AI Observability, AI-Powered Observability and AIOps.
AI OBSERVABILITY
Speed of adoption is leaving blind spots in GenAI rollouts. LLM observability is becoming a critical focus for enterprises. As organizations build AI solutions at scale, monitoring model performance, detecting hallucinations, and optimizing costs will drive rapid innovation in LLM-specific observability tooling.
Gagan Singh
VP, Product Marketing, Elastic
A new facet of observability supports large language models (LLMs) and AI workflows, focusing on monitoring API calls, vector databases, model performance, and GPU usage. As organizations incorporate AI, observability platforms must track both the underlying infrastructure and the specifics of LLM performance and API cost-effectiveness, because mismanaged AI token usage can send expenses soaring. Observability for AI is distinct from AI-powered observability, which integrates AI to streamline traditional monitoring and debugging. This dual approach addresses the unique demands of managing AI systems, ensuring models operate smoothly while controlling costs associated with high data processing.
Sam Suthar
Founding Director, Middleware
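To make the token-cost point concrete, here is a minimal sketch of per-call LLM usage tracking. It is an illustration only: the `LLMUsageTracker` class, the model name `example-model`, and the prices are hypothetical assumptions and do not reflect any vendor's API or pricing.

```python
from dataclasses import dataclass, field

# Hypothetical prices in USD per 1,000 tokens (assumption for illustration;
# real prices vary by provider and model).
PRICE_PER_1K = {"example-model": {"input": 0.5, "output": 1.5}}


@dataclass
class LLMUsageTracker:
    """Accumulates token cost and latency for each LLM API call."""
    budget_usd: float
    spent_usd: float = 0.0
    calls: list = field(default_factory=list)

    def record(self, model: str, input_tokens: int, output_tokens: int,
               latency_ms: float) -> float:
        """Record one call and return its estimated cost in USD."""
        price = PRICE_PER_1K[model]
        cost = (input_tokens / 1000) * price["input"] \
            + (output_tokens / 1000) * price["output"]
        self.spent_usd += cost
        self.calls.append({
            "model": model,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "latency_ms": latency_ms,
            "cost_usd": cost,
        })
        return cost

    def over_budget(self) -> bool:
        """True once cumulative spend exceeds the configured budget."""
        return self.spent_usd > self.budget_usd


tracker = LLMUsageTracker(budget_usd=1.0)
cost = tracker.record("example-model", input_tokens=1000,
                      output_tokens=1000, latency_ms=120.0)
```

In practice, records like these would likely be exported to an observability backend rather than kept in memory, so that cost and latency can be charted alongside infrastructure metrics.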
Although AI observability is a fairly new conversation, 2025 is the year it goes mainstream. In enterprise systems, observability often refers to the ability to see and understand the state of the system. The emerging field of AI observability examines not only the performance of the system itself, but the quality of the outputs of a large language model — including accuracy, ethical and bias issues, and security problems such as data leakage. I view AI observability as the missing puzzle piece to building explainability into the development process, giving enterprises faith in their AI demos to get them across the finish line.
We'll see more and more vendors come out with AI observability features to meet the growing demand in the market. However, while there will be many AI observability startups, observability will ultimately end up in the hands of data platforms and the large cloud providers. It's hard to do observability as a standalone startup, and companies that adopt AI models are going to need AI observability solutions, so big cloud providers will be adding the capability.
Baris Gultekin
Head of AI, Snowflake
MONITORING GENAI CODE
Traditional monitoring approaches will prove inadequate as design patterns silently break down, system boundaries blur, and unexpected performance issues surface. Forward-thinking engineering teams will shift focus from code generation to deep architectural understanding, implementing new tools that monitor how AI-generated code impacts how systems evolve and detecting application design problems before they cascade. New capabilities and methodologies will be required to deal with the mass of generated code which will come with its share of AI hallucinations. Success with GenAI isn't about writing more code faster, but about maintaining architectural integrity across application ecosystems.
Ori Saporta
VP of Engineering and Co-Founder, vFunction
MLOPS EVOLUTION TO AIOPS
In 2025, we'll see the evolution from traditional MLOps to comprehensive AIOps platforms that manage the entire AI system life cycle. These platforms will integrate sophisticated monitoring and automation capabilities for both models and infrastructure, enabling predictive maintenance and automatic optimization of AI systems. Teams will adopt practices that treat AI models as living systems rather than static deployments, with continuous learning and adaptation capabilities built into the deployment pipeline. This shift will require new tools and practices for version control, testing, and deployment that can handle the complexity of multi-modal models and distributed training environments.
Haoyuan Li
Founder and CTO, Alluxio
TRANSFORMING OBSERVABILITY WITH AI
As we look ahead to 2025, the convergence of AI and observability will fundamentally transform how organizations manage their technology stack.
Karthik Sj
GM of AI, LogicMonitor
Just about every organization out there has the adoption (and scale) of new AI tools and technologies on its digital transformation roadmap for 2025. Getting those decisions right will play a key role in improving application performance by automating tasks, increasing efficiency, and learning from feedback. Anticipate more major investments, as well as fresh development processes and workflows, in the coming year, as organizations retool for an environment where application performance and responsiveness to trends will provide a competitive edge.
Anil Inamdar
Global Head of Data Services, NetApp Instaclustr
AI-powered observability tools will revolutionize IT operations by automatically detecting patterns and potential issues before they impact systems, while significantly reducing the manual effort needed for root cause analysis. Natural language processing capabilities will enable IT professionals to query their observability data conversationally, making complex system analysis more accessible to team members regardless of their technical expertise.
Arturo Oliver
Director of Market Strategy and Analyst Relations, ScienceLogic
AI-DRIVEN PREDICTIVE OPERATIONS
In 2025, we will see a shift towards AI-driven predictive operations for IT. While we're unlikely to see massive adoption in 2025, I believe we'll see many vendors incorporate AI-driven predictive operations into their product strategy as the urgency for more efficient IT operations grows. This need is fueled by the exponential growth in data volume and complexity, which will make traditional monitoring and management approaches increasingly unsustainable. AI and machine learning algorithms will be leveraged to predict issues before they impact business operations and employee productivity, optimizing system performance and augmenting decision-making processes for IT. Long term, I am hopeful we'll realize significantly reduced downtime, improved operational efficiency, and more proactive IT management strategies. As businesses continue to demand higher levels of reliability and performance with fewer resources, AI-driven predictive operations will become essential in meeting these expectations.
Peter Pflaster
Senior Manager, Product Marketing, Automox
Customers increasingly require proactive observability solutions that predict and prevent issues. They seek observability systems that forecast potential service outages, predict capacity issues, and identify performance degradation ahead of time. This proactive approach helps organizations mitigate risks and manage resources efficiently, responding to problems before they impact end-users. Predictive alerting in observability, supported by AI, is expected to become an industry standard, enhancing service reliability and reducing unplanned downtime.
Sam Suthar
Founding Director, Middleware
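As a rough illustration of predictive alerting for capacity issues, the sketch below fits a straight-line trend to recent utilization samples and estimates when the trend would cross a threshold. The function name `forecast_breach` and the sample data are hypothetical, and a simple least-squares line stands in for whatever forecasting model a real product might use.

```python
def forecast_breach(samples, threshold):
    """Estimate the time at which a linear trend crosses `threshold`.

    `samples` is a list of (time, value) pairs. Returns the estimated
    crossing time, or None if the trend is flat, decreasing, or there
    is not enough data to fit a line.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp
    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None  # usage is not growing, so no breach is forecast
    intercept = mean_v - slope * mean_t
    return (threshold - intercept) / slope


# Hypothetical disk usage (percent) sampled once per day.
disk_usage = [(0, 50.0), (1, 55.0), (2, 60.0), (3, 65.0)]
breach_day = forecast_breach(disk_usage, threshold=90.0)
```

An alert could then fire whenever the forecast crossing time falls within a chosen lead window, giving operators time to add capacity before users are affected.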
AIOPS AUTOMATION
One trend that will evolve next year in observability is AIOps, specifically AI in IT operations. We'll see an influx of automation. According to Splunk's State of Observability report, 52% of survey respondents already leverage AIOps within their observability toolset, while another 29% are in the deployment process. Almost half of those already employing AIOps are using it for automation, determining root causes, and responding to and remediating incidents with greater intelligence.
Mark Maslach
GVP, Observability, Splunk
CONVERGENCE OF GENAI AND AIOPS
Next year, the IT and AIOps landscape will increasingly incorporate GenAI as a foundational element in AIOps products, moving from an enhancement to a core capability. This shift is expected to spur new use cases, where GenAI acts as a bridge between human operators and autonomous systems, enabling more seamless, proactive IT operations. However, achieving a fully automated IT environment may remain challenging, particularly for industries still bound by legacy systems and traditional service models that prioritize full-time equivalent (FTE)-based contracts. Although this labor-intensive model limits the pace of AI-driven change, a heightened focus on cost reduction will drive greater automation adoption. The end-user interface with AIOps will improve, meaning prediction and prevention of issues will also improve.
Ugo Orsi
COO, Digitate
CONVERGENCE OF AI-DRIVEN AUTONOMOUS DEBUGGING WITH AIOPS
AI-powered autonomous runtime debugging will likely emerge by 2025 as a complement to traditional live debugging tools, marking a new maturity level in observability. This advancement is expected to converge with AIOps (Artificial Intelligence for IT Operations), driving automation in root-cause analysis to pinpoint and diagnose issues at the code level with minimal human intervention. Simultaneously, it will align with AI Code Assistants, supporting developers in creating more robust applications and accelerating issue resolution.
Autonomous debugging tools will enable developers to detect issues early and trace errors back to specific lines of code, transforming critical issue tickets into actionable insights. By analyzing both historical and real-time data, these AI-driven tools can also provide proactive recommendations and even enable auto-remediation for recurring issues. Additionally, when paired with AI code assistants, autonomous debugging positions developers to build more reliable, high-quality applications with greater efficiency and precision.
Ilan Peleg
CEO, Lightrun
CLOUD MAKES AIOPS A NECESSITY
As enterprises continue migrating to the cloud, AIOps will continue to evolve into an enterprise operational necessity. By unifying observability data with AI-driven insights, businesses can expect better end-to-end visibility, and proactive issue detection and resolution, fortifying security and compliance at scale.
Gab Menachem
VP ITOM, ServiceNow
CONVERGENCE OF DEVOPS AND AIOPS
The Rise of Unified Ops Teams: By the end of 2025, the lines between DevOps and AIOps will begin to blur, giving rise to unified operations teams that seamlessly integrate AI expertise with traditional software development and IT operations. These teams will manage both the software lifecycle and AI model lifecycles, enabling real-time adaptation and continuous learning. Organizations that adopt this model will see accelerated innovation, improved agility, and a competitive edge as they fully capitalize on AI-driven efficiencies.
Justin Holtzinger
Chief Revenue Officer, DevOps, Cirata
AI-POWERED DEBUGGING TOOLS
AI-powered debugging tools will revolutionize troubleshooting by automatically identifying root causes and suggesting fixes. These systems will analyze application behavior, logs, and performance metrics in real time to detect anomalies and correlate issues across distributed systems, significantly reducing the time needed to resolve problems.
Tristan Stahnke
Principal Application Security Consultant, GuidePoint Security
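The anomaly detection step described above can be sketched with a rolling z-score over a performance metric. This is a simplified stand-in, not how any particular debugging tool works: the function name `find_anomalies`, the window size, the threshold, and the latency values are all assumptions for illustration.

```python
import statistics


def find_anomalies(values, window=5, z_threshold=3.0):
    """Return indices whose value deviates strongly from the prior window.

    Each point is compared against the mean and population standard
    deviation of the `window` points immediately before it.
    """
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mean = statistics.fmean(baseline)
        stdev = statistics.pstdev(baseline)
        if stdev == 0:
            # A perfectly flat baseline: treat any change as anomalous.
            if values[i] != mean:
                anomalies.append(i)
            continue
        if abs(values[i] - mean) / stdev > z_threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical request latencies in milliseconds with one spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 500, 101]
spikes = find_anomalies(latencies_ms)
```

A real system would go further by correlating flagged points with logs and traces from other services, which is where the root-cause analysis described in the prediction comes in.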
SPECIAL-PURPOSE LLM
Special-purpose LLMs will give GenAI a more strategic role in modern IT while empowering human ingenuity: Generative AI will move from novel data synthesis to domain subject matter expertise in AIOps, driven by the combination of expertise encoding, retrieval-augmented generation, and LLMs. Special-purpose LLM deployments will effectively model expert decision-making across multiple ITOM and service assurance process areas. This strategic shift will mark a turning point within today's IT organizations, with generative AI playing a more streamlined and specialized role in operational areas, freeing up more time for innovation.
Casey Kindiger
CEO, Grokstream
GENAI-ENABLED OBSERVABILITY NOT FULLY AUTOMATED YET
In 2025, Observability will embed more GenAI capabilities to explain and enable real-time incident identification, but it will not become a fully automated capability in 2025. To achieve full automation, organizations will want to train any model on their own data rather than rely on the generic anonymized data that pre-trained Observability solutions will use.
Roy Illsley
Chief Analyst, Omdia
AI CRAZE REACHES PLATEAU
Enterprises yearn for AI-driven insights in observability solutions, hoping to turn raw telemetry data into actionable intelligence. But I expect this craze for AI to gradually reach a plateau, as organizations begin to recognize its limitations.
Arun Balachandran
Sr. Marketing Manager - Applications Manager, ManageEngine