Alluxio Founder, Chairman and CTO Reveals Top Data Predictions for 2020

January 14, 2020

Hybrid cloud, Machine Learning, Kubernetes, talent gap, Hadoop compute and China innovation topping the list

SAN MATEO, CA - January 14, 2020 - Alluxio's Founder, Chairman and CTO Haoyuan (H.Y.) Li forecasts seven major developments in cloud, AI, DevOps, data analytics and storage in 2020. Most organizations are in the early stages of the data revolution, running many different workloads on a wide variety of platforms across clouds and hybrid clouds. 2020 will see even more advances in AI, machine learning and analytic workloads, and in the technologies that support them.

Haoyuan Li outlines the following seven major trends that guide his predictions:

Rise of the hybrid cloud (really)
We've been hearing people talk about the hybrid cloud for the past three years now. And for the most part, that's all it's been - talk. 2020 is the year it gets real. We are seeing large enterprises refusing to add capacity on-prem to their Hadoop deployments and instead investing in the public cloud. But they are still not willing to move their core enterprise data to the cloud. Data will stay on-prem and compute will burst to the cloud, particularly for peak demands and unpredictable workloads. Technologies that provide optimal approaches to achieve this will drive the rise of the hybrid cloud.

One Machine Learning framework to rule them all
Machine learning has reached a turning point, with companies of all sizes and at all stages moving towards operationalizing their model training efforts. While there are several popular frameworks for model training, a leading technology hasn't yet emerged. Just as Apache Spark is considered a leader for data transformation jobs and Presto is emerging as the leading tech for interactive querying, 2020 will be the year we'll see a frontrunner dominate the broader model training space, with PyTorch and TensorFlow as the leading contenders.

"Kubernetifying" the analytics stack
While containers and Kubernetes work exceptionally well for stateless applications like web servers and for self-contained databases, we haven't seen a ton of container usage when it comes to advanced analytics and AI. In 2020, we'll see a shift to AI and analytic workloads becoming more mainstream in Kubernetes land. "Kubernetifying" the analytics stack will mean solving for data sharing and elasticity by moving data from remote data silos into K8s clusters for tighter data locality.

Hadoop storage (HDFS) is dead. Hadoop compute (Spark) lives strong
There is a lot of talk about Hadoop being dead, but the Hadoop ecosystem has rising stars. Compute frameworks like Spark and Presto extract more value from data and have been adopted into the broader compute ecosystem. Hadoop storage (HDFS) is dead because of its complexity and cost, and because compute fundamentally cannot scale elastically if it stays tied to HDFS. For real-time insights, users need immediate and elastic compute capacity that's available in the cloud. Data in HDFS will move to the most cost-efficient system, be it cloud storage or on-prem object storage. HDFS will die but Hadoop compute will live on and live strong.

AI & analytics teams will merge into one as the new foundation of the data organization
Yesterday's Hadoop platform teams are today's AI/analytics teams. Over time, a multitude of ways to get insights from data have emerged. AI is the next step beyond structured data analytics. What used to be statistical modeling has converged with computer science to become AI and ML. So data, analytics, and AI teams need to collaborate to derive value from the same data they all use. This will be done by building the right data stack: storage silos and compute, deployed on-prem, in the cloud, or in both, will be the norm. In 2020 we'll see more organizations building dedicated teams around this data stack.

Talent gap will inhibit data technology adoption
Building the stacks that put data technology into practice is hard, and this will only become more obvious in 2020. As companies discuss the importance of data in their organizations, they'll need to hire the data, AI, and cloud engineers to architect it. But there aren't enough engineers with expertise in these technologies to do that. The "super-power" skill here is the ability to understand data, structured and unstructured, and pick the right approach to analyze it. Until the knowledge gap closes, we'll continue to see a shortage of these types of engineers, and many companies will come up short on their promises of 'data-everywhere'.

China is moving to the cloud on a scale much larger than the US and will leapfrog from on-prem to massive cloud deployments for advanced workloads
Over the past five years, while enterprises in the US have been moving in leaps and bounds to public clouds, enterprises in China have been investing mostly in on-prem infrastructure, primarily for data-driven platforms. 2020 will be the inflection point where this changes. China will leapfrog into the cloud at a scale much larger than the US by adopting the public cloud for new use cases, bursting into the cloud for peak loads and, over time, moving existing workloads. Public cloud leaders in China will see dramatic growth that might outpace the growth of the current cloud giants.

Tweet this: .@Alluxio Reveals #Technology #Predictions for 2020 #Cloud #AI #ML #Kubernetes #Data #DevOps https://bit.ly/2Ft7arI

About Alluxio

Alluxio is a leading provider of accelerated data access platforms for AI workloads. Alluxio’s distributed caching layer accelerates AI and data-intensive workloads by enabling high-speed data access across diverse storage systems. By creating a global namespace, Alluxio unifies data from multiple sources—on-premises and in the cloud—into a single, logical view, eliminating the need for data duplication or complex data movement.

Designed for scalability and performance, Alluxio brings data closer to compute frameworks like TensorFlow, PyTorch, and Spark, significantly reducing I/O bottlenecks and latency. Its intelligent caching, data locality optimization, and seamless integration with modern data platforms make it a powerful solution for teams building and scaling AI pipelines across hybrid and multi-cloud environments. Backed by leading investors, Alluxio powers technology, internet, financial services, and telecom companies, including 9 out of the top 10 internet companies globally. To learn more, visit www.alluxio.io.

Media Contact:
Amelia Wong
amelia@alluxio.com

Media Contact:
Beth Winkowski
Winkowski Public Relations, LLC for Alluxio
978-649-7189
beth@alluxio.com
