In APMdigest's 2025 Predictions Series, industry experts — from analysts and consultants to the top vendors — offer predictions on how Observability and related technologies will evolve and impact business in 2025. Part 6 covers cloud, the edge and IT outages.
The cloud predictions presented here relate to IT performance, so they are included with the Observability predictions. After the holidays, APMdigest will publish a separate list of cloud and FinOps predictions that cover a broader range of cloud topics.
CLOUDOPS
CloudOps will emerge to become the "uber" management platform linking domain-specific management tools so that a holistic and more business-focused perspective can be delivered. However, this change will require some organizational changes, and the CloudOps solutions must remain open and modular. If vendors produce CloudOps stacks, then adoption will be impacted.
Roy Illsley
Chief Analyst, Omdia
MOVING LIFTED AND SHIFTED APPS TO CLOUD-NATIVE
Organizations continue to move lifted and shifted apps to cloud-native. Cloud adoption continues to be a journey with enterprises still trying to get cost-effective visibility across hybrid environments due to the complexity of modern and legacy applications.
Gagan Singh
VP, Product Marketing, Elastic
HYBRID CLOUD FAILOVER CLUSTERING
Greater Adoption of Hybrid Cloud Failover Clustering: Enterprises will increasingly implement hybrid cloud architectures, combining on-premises data centers with public cloud platforms for failover clustering. This setup will ensure high availability and disaster recovery while offering flexibility and cost-effectiveness. Organizations will prioritize failover solutions that seamlessly bridge on-prem and cloud environments, allowing IT teams to leverage the cloud's resilience without abandoning existing infrastructure.
Cassius Rhue
VP, Customer Experience, SIOS Technology
CROSS-REGION MULTI-CLOUD FAILOVER CLUSTERING
Increased Reliance on Cross-Region, Multi-Cloud Failover Clustering: To achieve stronger resilience against regional outages, enterprises will adopt cross-region and multi-cloud failover clustering strategies. These setups will allow critical applications to fail over seamlessly across different cloud regions or cloud providers, ensuring continuity even in cases of large-scale disruptions. This trend will drive demand for clustering solutions capable of handling complex, geographically distributed infrastructures and automating failover processes across subnets and cloud regions with minimal manual intervention.
Cassius Rhue
VP, Customer Experience, SIOS Technology
SHIFT FROM COST TO RISK MITIGATION
Enterprises will increasingly view cloud infrastructure as a tool for risk mitigation rather than just a means to cut costs. As organizations balance hybrid infrastructures with regulatory demands and the need for resilient systems, they'll focus on creating secure environments that safeguard data, support AI-driven operations, and withstand unpredictable outages and cyber threats. Standards like DORA (the EU's Digital Operational Resilience Act) will play a crucial role in guiding companies in establishing reliable, secure, and efficient cloud architectures that prioritize resilience and reduce operational risks across complex, distributed environments.
Mehdi Daoudi
CEO and Founder, Catchpoint
MULTI-AGENT
In 2025, hybrid cloud environments will see a major shift toward advanced observability solutions using multi-agent systems. This surge in adoption will work to unify visibility across on-premises, cloud, and multi-cloud infrastructures, using AI and machine learning to predict and prevent issues before they occur. Organizations need tools that deliver a real-time view of complex, interconnected systems and empower IT teams to proactively manage discrepancies, optimize performance, and enhance security. With these capabilities, companies will be better equipped to ensure reliable performance, security, and scalability across diverse infrastructures.
Gab Menachem
VP ITOM, ServiceNow
EDGE COMPUTING
Edge computing will continue to grow, processing data closer to its source to reduce latency and improve responsiveness. Autonomous vehicles and smart cities will hinge on edge computing. More people driving AVs threatens to overwhelm wireless networks, computing resources, and cloud infrastructure due to the massive increase in data, computation, and energy requirements. A hybrid infrastructure combining edge and cloud computing will provide the high connectivity required for AVs, generative AI, and smart cities. By processing data closer to the source (the edge), developers reduce latency and ensure immediate responses, such as turning left at a crossroads. JLL suggests that the global edge data center market is expected to cross US$300bn by 2026, with technology trends like GenAI and IoT powering growth.
Sashank Purighalla
Founder and CEO, BOS Framework
EDGE OBSERVABILITY
Edge Compute - The New Frontier in 2025: Edge computing will emerge as the new frontier, enabling real-time data processing closer to where it's generated—whether in autonomous vehicles, IoT devices, or remote facilities. By minimizing latency and reducing the load on centralized cloud resources, edge computing will transform industries like manufacturing, healthcare, and retail with faster, more reliable data-driven insights. This shift empowers applications that demand ultra-low latency, increased security, and local processing capabilities, pushing businesses toward a future where edge intelligence enhances user experiences, operational efficiency, and scalability like never before.
Mehdi Daoudi
CEO and Founder, Catchpoint
Data overload will remain a challenge, especially at the edge, where transmitting and processing vast amounts of telemetry data can be costly and slow. In 2025, solutions addressing these issues will gain traction, with more intelligent edge processing and filtering to prioritize actionable insights. This will enable organizations to achieve observability for edge workloads without overwhelming central systems.
Andreas Prins
VP of Product Marketing, SUSE
EDGE FUNCTIONS
As 2025 approaches, hybrid and multi-cloud setups will continue to grow, combining the flexibility of public clouds with the control of private ones. A key part of this shift will be Edge Functions, which let code run closer to the user through content delivery networks. This reduces latency and allows real-time, location-based updates. For industries like IoT, telecommunications, and media, where low-latency responses are vital, these advancements will help them innovate faster and offer more responsive services. Pairing this with Private AI will only accelerate the pace of change.
Avishai Sharlin
Division President, Product and Network, Amdocs
MAJOR OUTAGES COMING IN 2025
IT will brace for nation-state cyberattacks and global outages in 2025: IT will have to manage an increasingly uncertain world in 2025. Nation-state-driven cyberattacks will become more frequent and sophisticated, targeting businesses and critical infrastructure. These attacks, combined with inevitable widespread outages affecting major service providers and platforms, will extend beyond companies and governments, making the impact more personal to consumers' lives. As a result, IT will be under pressure to strike a balance between fortifying their systems against being caught in the crossfire of larger global cyber conflicts and outages and focusing on the basics, like assessing and securing the configurations of their identity systems. Preparing for complex, high-level attacks is essential, but so is ensuring that fundamental defenses, such as "locking the doors," are not overlooked. This dual focus will be crucial in building a cohesive strategy to mitigate evolving cyber threats.
Bryan Patton
Principal Strategic Systems Consultant, Quest Software
We have seen a snowball effect of outages and supply chain compromises as development tools and production services are now cloud-hosted, not self-hosted. In the coming year, we will see several more public incidents, caused by environmental or cyber events, that leave mission-critical services inaccessible and compromise IP.
Simon Taylor
CEO and Co-founder, HYCU
Navigating Mega and Micro-Outages in a Hybrid-First World: In 2025, digital infrastructures will reach unprecedented levels of complexity, making outages an inevitable part of the landscape. This year will see a rise in both industry-wide mega-outages and subtle micro-outages — small disruptions that can chip away at customer trust over time. The critical issue is not simply hybrid versus cloud but rather the entropic nature of technology itself: as systems become more distributed and complex, they're inherently harder to manage and predict. To maintain resilience, companies will need a strategic, adaptive approach that focuses on rigorous monitoring and faster response times to both large-scale and minor disruptions, protecting their brand and fostering reliable customer engagement amidst this volatile digital environment.
Mehdi Daoudi
CEO and Founder, Catchpoint
With today's complex application ecosystems, software outages have the potential to rise in 2025, as seen in recent incidents like CrowdStrike's. Without strong quality assurance practices, a simple code change can be all it takes to cause blanket disruption to a business. It's definitely becoming more difficult for software teams to keep up with the progressively demanding schedules for code releases, while still thoroughly testing more frequent and complex deployments. If I could give organizations one piece of advice to avoid future outages, it would be this: do not overlook the critical function of software testing in your business. Assess risk, determine where to focus your testing efforts and regularly review your testing and deployment strategies. Don't let yourself become the next global example of technological failure.
Mav Turner
Chief Strategy Officer, Tricentis
AI RESILIENCE PROBLEM
2025 will reveal AI's resilience problem: In 2025, increased AI adoption and innovation will reveal that AI has a resilience problem that must be addressed. Application, data center and network outages are unfortunately an all too common occurrence, but how do these legacy IT challenges impact our new world of AI-powered everything? AI applications are very resource intensive, but the reality is that we've only scratched the surface due to AI's currently limited use within most organizations. As AI applications and models scale in adoption, we will see the detrimental impact of systems and tools that haven't prioritized ultra-resilience as a core design principle.
Karthik Ranganathan
Co-Founder and Co-CEO, Yugabyte