As part of APMdigest's 2025 Predictions Series, industry experts offer predictions on how DataOps and related technologies will evolve and impact business in 2025. Part 3 covers data technologies.
CONSOLIDATION OF DATA INFRASTRUCTURE
The Great Infrastructure Consolidation Will Accelerate: In 2025, we'll witness a significant consolidation of data infrastructure driven by economic pressures and the maturation of AI technologies. Organizations will move away from maintaining separate systems for streaming, batch processing, and AI workloads. Instead, they'll gravitate toward unified platforms that can handle multiple workloads efficiently. This shift isn't just about cost savings — it's about creating a more cohesive data ecosystem where real-time streaming, lakehouse storage, and AI processing work in harmony.
Sijie Guo
Founder and CEO, StreamNative
CONSOLIDATION OF DATA ASSETS
In 2025, organizations will focus on consolidating their data assets to build a unified foundation that powers future innovation, insights, and decision-making.
Ram Palaniappan
CTO, TEKsystems Global Services
CONTEXTUALIZING DATA
Contextualizing data will be the next frontier for data platforms. The evolution of the data platform is essential to the evolution of AI. Next year, we'll see breakthroughs that help LLMs better understand the data they're working with through the semantic layer. Today's data platforms are largely missing the semantic layer of data, that is, the understanding of what the data means. For instance, when you have financial data in a table, it's typically the developer or the analyst who is tasked with understanding where that data came from, how it was calculated, and what it means, but this understanding should be baked directly into the data platform. Having to rely on these additional stakeholders, and to rebuild that understanding in every application you develop on top of your data, is extremely burdensome. As a result, the semantic layer must be pushed down close to the data so that AI can understand the nature of the data and do a much better job of analyzing it. Users don't want to, and shouldn't have to, reinvent semantic concepts for each application; pushing them down to the data layer is the next evolution.
Benoit Dageville
President and Co-Founder, Snowflake
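To make the idea concrete, here is a minimal sketch of "pushing the semantic layer down to the data": column meaning and lineage are stored alongside the table itself rather than rediscovered in every application. This is a toy illustration, not any vendor's implementation; the column names, descriptions, and lineage strings are hypothetical.

```python
# Toy sketch: semantic metadata stored with the data, then rendered as
# context an LLM can consume. All names and descriptions are invented.
from dataclasses import dataclass

@dataclass
class SemanticColumn:
    name: str
    dtype: str
    meaning: str   # what the value represents, in business terms
    lineage: str   # where the value came from / how it was derived

# The semantic layer lives with the table, not in each application.
revenue_table = {
    "net_revenue": SemanticColumn(
        "net_revenue", "DECIMAL(18,2)",
        "Gross revenue minus refunds and discounts, in USD",
        "Derived nightly from orders.gross_amount - refunds.amount",
    ),
}

def describe_for_llm(table: dict) -> str:
    """Render column semantics as prompt context for an LLM."""
    return "\n".join(
        f"{c.name} ({c.dtype}): {c.meaning}. Lineage: {c.lineage}"
        for c in table.values()
    )

context = describe_for_llm(revenue_table)
```

An analytics application would prepend `context` to its prompts instead of re-documenting what `net_revenue` means for each new use case.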
SMART INFRASTRUCTURE
We'll see the emergence of "smart infrastructure" platforms that automatically optimize resource allocation and data movement based on workload patterns and cost constraints.
Sijie Guo
Founder and CEO, StreamNative
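The "smart infrastructure" behavior described above can be sketched as a tiny cost-aware placement function: given a workload's latency budget, pick the cheapest tier that satisfies it. The tier names, costs, and latencies below are hypothetical illustrations, not a real platform's catalog.

```python
# Toy illustration of automatic resource allocation: route each workload
# to the cheapest tier that meets its latency requirement.
TIERS = [
    # (name, relative cost per unit, p99 latency in ms) -- invented numbers
    ("spot-batch", 1.0, 5000),
    ("on-demand", 3.0, 200),
    ("provisioned-streaming", 6.0, 20),
]

def place(latency_budget_ms: float) -> str:
    """Pick the cheapest tier whose p99 latency fits the budget."""
    eligible = [t for t in TIERS if t[2] <= latency_budget_ms]
    return min(eligible, key=lambda t: t[1])[0]

streaming_choice = place(100)   # tight budget forces the fast tier
batch_choice = place(10_000)    # loose budget allows the cheap tier
```

A production system would also weigh data-movement cost and observed workload patterns, but the shape of the decision is the same.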
DATA LAKEHOUSE
The Data Lakehouse Becomes the Standard for Analytics: More companies are moving from traditional data warehouses to data lakehouses. By 2025, over half of all analytics workloads are expected to run on lakehouse architectures, driven by the cost savings and flexibility they offer. Currently, companies are shifting from cloud data warehouses to lakehouses, not just to save money but to simplify data access patterns and reduce the need for duplicate data storage. Large organizations have reported savings of over 50%, a major win for those with significant data processing needs.
Emmanuel Darras
CEO and Co-Founder, Kestra
HYBRID LAKEHOUSE
The Rise of the Hybrid Lakehouse: The resurgence of on-prem data architectures will see lakehouses expand into hybrid environments, merging cloud and on-premises data storage seamlessly. The hybrid lakehouse model pairs the scalability of cloud storage with the secure control of on-premises infrastructure, delivering flexibility within a unified, accessible framework.
Justin Borgman
Co-Founder and CEO, Starburst
SQL RETURNS TO THE DATA LAKE
SQL is experiencing a comeback in the data lake as table formats like Apache Iceberg simplify data access, enabling SQL engines to outpace Spark. SQL's renewed popularity democratizes data across organizations, fostering data-driven decision-making and expanding data literacy across teams. SQL's accessibility will make data insights widely available, supporting data empowerment.
Justin Borgman
Co-Founder and CEO, Starburst
OPEN TABLE FORMATS
Open table formats, particularly Apache Iceberg, are quickly gaining popularity. Iceberg's flexibility and compatibility with a variety of data processing engines make it a preferred choice. It provides a standardized table format that integrates with SQL engines and data platforms, enabling SQL queries to run efficiently on both data lakes and data warehouses. Open table formats let companies manage and query large datasets without depending solely on traditional data warehouses. With organizations planning to adopt Iceberg over other formats, its role in big data management is expected to expand, thanks to its strong focus on vendor-agnostic data access patterns, schema evolution, and interoperability.
Emmanuel Darras
CEO and Co-Founder, Kestra
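Two of the Iceberg properties named above, snapshot-based table state and non-destructive schema evolution, can be illustrated with a toy model. This is not PyIceberg's API; it is a minimal sketch of the concepts, with invented column names.

```python
# Toy model of snapshot-based table state: every write produces a new
# snapshot, so old snapshots (and old schemas) stay readable ("time travel").
from copy import deepcopy

class ToyTable:
    def __init__(self, schema):
        self.snapshots = [{"schema": schema, "rows": []}]

    @property
    def current(self):
        return self.snapshots[-1]

    def append(self, rows):
        snap = deepcopy(self.current)
        snap["rows"].extend(rows)
        self.snapshots.append(snap)

    def add_column(self, name, default=None):
        # Schema evolution: earlier snapshots are untouched.
        snap = deepcopy(self.current)
        snap["schema"] = snap["schema"] + [name]
        snap["rows"] = [r | {name: default} for r in snap["rows"]]
        self.snapshots.append(snap)

t = ToyTable(["id", "amount"])
t.append([{"id": 1, "amount": 10}])
t.add_column("currency", default="USD")

old_schema = t.snapshots[1]["schema"]  # pre-evolution snapshot, still intact
new_schema = t.current["schema"]
```

Real Iceberg tracks snapshots and schema IDs in metadata files rather than copying rows, but the contract is the same: readers pinned to an old snapshot see the old schema.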
POSTGRESQL: EVERYTHING DATABASE
In 2025, PostgreSQL will solidify its position as the go-to "Everything Database" — the first to fully integrate AI functionality like embeddings directly within its core ecosystem. This will streamline data workflows, eliminate the need for external processing tools, and enable businesses to manage complex data types in one place. With its unique extension capabilities, PostgreSQL is leading the charge toward a future where companies no longer have to rely on standalone or specialized databases.
Avthar Sewrathan
AI Product Lead, Timescale
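Much of PostgreSQL's in-database embedding support comes through the pgvector extension, whose `<=>` operator computes cosine distance so that nearest-neighbor search runs inside the database rather than in an external vector store. The sketch below reproduces that computation in pure Python; the vectors and document names are made-up toy embeddings, not real model output.

```python
# Pure-Python equivalent of pgvector's cosine-distance operator (<=>):
# distance = 1 - cosine similarity. Toy 2-d "embeddings" for illustration.
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

query = [1.0, 0.0]
docs = {"doc_a": [0.9, 0.1], "doc_b": [0.0, 1.0]}

# The miniature analogue of: ORDER BY embedding <=> query LIMIT 1
nearest = min(docs, key=lambda d: cosine_distance(query, docs[d]))
```

In PostgreSQL with pgvector installed, the same ranking would be expressed directly in SQL, keeping the embedding workflow in one place as the prediction describes.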
DATA MESH
Data Mesh Gains Momentum Across Organizations: Data mesh is now more than an IT-driven strategy. It's increasingly led by business units themselves, with data mesh initiatives coming from non-IT teams focused on improving data quality and governance. Data mesh promotes decentralized data ownership, enabling business units to manage their data independently. This setup brings faster decision-making, more agility, and better data access.
Emmanuel Darras
CEO and Co-Founder, Kestra
DATA FABRIC
A data fabric will become accepted as a precursor to using AI at scale: As businesses increasingly adopt AI to drive innovation, one key challenge remains: ensuring that AI is reliable, responsible, and relevant. AI solutions must be trained on real, company-specific data, not synthetic or generalized data, to deliver accurate, actionable insights. To make this a reality, more organizations will adopt a data fabric as their data strategy, as it provides the semantics and rich business context that AI requires for real business use cases.
Daniel Yu
SVP, SAP Data and Analytics
MULTI-CLOUD NETWORKING
Enhanced Multi-Cloud Networking for Regulatory Compliance: By 2025, companies will increasingly rely on multi-cloud networking solutions to meet diverse data sovereignty and industry-specific regulatory requirements. These solutions will enable seamless connectivity and secure data transfer across cloud environments through robust encryption and access controls, and they must also be able to identify and remediate risks, threats, and vulnerabilities. CIOs and network architects will prioritize network designs that facilitate secure, efficient data flows, actively minimize regulatory risk, and maintain data integrity across cloud platforms.
Ali Shaikh
Chief Product Officer and Chief Operating Officer, Graphiant
STREAMING-FIRST APPROACH
Streaming-first approach grows with AI: Pressure will grow for more AI systems and applications to respond to real-time information, both to drive automation and to meet consumer expectations. Organizations will adopt a "streaming-first" approach when architecting new applications. These event-driven applications will replace traditional architectures that process data at rest and rely largely on request/response communication. This will also facilitate more sharing of real-time data between entirely different domains of a business than was previously possible.
Guillaume Aymé
CEO, Lenses.io
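The event-driven shape described above can be sketched with a minimal publish/subscribe bus: producers emit events and independent consumers react, with no request/response round trip between domains. The topic name and handlers are hypothetical.

```python
# Minimal in-process event bus illustrating the streaming-first pattern:
# multiple independent domains consume the same event stream.
from collections import defaultdict

class EventBus:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Fan out to every subscriber; the producer never waits for a reply.
        for handler in self.subscribers[topic]:
            handler(event)

bus = EventBus()
shipped, notified = [], []

# Fulfillment and notifications react independently to order events.
bus.subscribe("orders", lambda e: shipped.append(e["id"]))
bus.subscribe("orders", lambda e: notified.append(e["id"]))

bus.publish("orders", {"id": 42, "item": "widget"})
```

A production system would put a durable log (e.g., Kafka or Pulsar) between publisher and subscribers, but the decoupling is the same: new domains subscribe to the stream without changing the producer.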
STREAMING DATA PLATFORMS: OBSERVABILITY AND SECURITY
In 2025, streaming data platforms will become indispensable for managing the exponential growth of observability and security data. Organizations will increasingly adopt streaming data platforms to process vast volumes of logs, metrics, and events in real time, enabling faster threat detection, anomaly resolution, and system optimization to meet the demands of ever-evolving infrastructure and cyber threats.
Bipin Singh
Senior Director of Product Marketing, Redpanda
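The kind of real-time anomaly detection described above can be sketched as a sliding-window check over a metrics stream: flag any point far outside the recent distribution. The window size, threshold, and latency values below are invented for illustration.

```python
# Toy sliding-window anomaly detector over a stream of metric values:
# flag points more than z standard deviations from the recent mean.
from collections import deque
from statistics import mean, stdev

def detect_anomalies(stream, window=5, z=3.0):
    recent = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(stream):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) > z * sigma:
                anomalies.append((i, value))
        recent.append(value)
    return anomalies

# Invented latency samples (ms) with one spike.
latencies = [100, 102, 98, 101, 99, 100, 950, 101]
spikes = detect_anomalies(latencies)
```

A streaming platform runs this style of computation continuously over partitioned event streams, so the spike is flagged within moments of arriving rather than in a nightly batch job.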
STREAMING DATA PLATFORMS: AI
In 2025, streaming data platforms will serve as the backbone for agentic AI, retrieval-augmented generation (RAG), and sovereign AI applications, providing the low-latency, high-throughput capabilities required to power autonomous decision-making systems and to ensure compliance with data sovereignty requirements.
Bipin Singh
Senior Director of Product Marketing, Redpanda
REAL-TIME DATA STREAMING FABRIC
Businesses will look to hyper-connect applications and architectures across all parts of their business through real-time data streams. This "streaming fabric" will blur the lines between previously isolated AI, analytics, and software architectures, connecting systems across business lines such as finance, ecommerce, manufacturing, distribution, and supply chain. This connectivity will allow applications to be built that offer new digital consumer-facing services as well as new levels of automation within a business.
Guillaume Aymé
CEO, Lenses.io