In the early 2000s, big data was born, and technology companies were racing to create the next generation of compute frameworks and storage systems geared towards its requirements. By the time I was a first-year Ph.D. student at UC Berkeley’s AMPLab in 2011, numerous advances in big data technologies, such as Apache Spark, were emerging. Through working on Apache Spark and being exposed to cutting-edge technologies, it became clear to me that sharing data among data-driven applications built on different compute frameworks, and moving data across storage systems, would become the bottleneck for any organization that wants to extract value from its data. To solve these challenges, I created Alluxio (formerly Tachyon), which, for lack of a defined category, I called a virtualized distributed file system in my original thesis. Since then, Alluxio has evolved as the data ecosystem has greatly expanded. We have seen the rise of hybrid and multi-cloud environments, the fast growth of AI/ML/DL workloads and technologies, the explosion of object stores, and the eagerness to develop a culture of self-service data in all leading companies. All of these advancements further intensify the need for greater data mobility and accessibility. As a result, the value that Alluxio brings is more critical today than ever.
Today, Alluxio is deployed and trusted by industry-leading companies such as China Unicom and Development Bank of Singapore. Some of the largest deployments have more than 1,000 nodes in a single Alluxio cluster, powering some of the most critical infrastructure in the world. At the same time, our community has grown to 1,000+ contributors, and our software can handle billions of files and manage petabyte-scale data.
I believe that for us to take full advantage of the opportunity as the leader in this market and realize our vision, I need a partner in crime, so to speak: someone who believes in the vision, has extensive open source go-to-market experience, and shares my passion for creating the future. I have found all of those qualities in Steven Mih, whom I am thrilled to welcome as our new CEO. I connected with Steven about a year ago, and I’ve greatly enjoyed learning from his experiences and getting to know him. In addition to having deep go-to-market experience, Steven is an open source veteran, having held leadership roles at Couchbase and Mesosphere. With Steven on board, I will be assuming the role of CTO and chairman of the board, doubling down on the technology and product vision, as well as spending time with users, all of which are areas that I am deeply passionate about.
I am more excited today than ever. I believe that with Steven on board, Alluxio is in the perfect position to realize the vision of being the data orchestration layer that enables new technology stacks and helps organizations unlock the power of data for all. Cheers!

Coupang, a Fortune 200 technology company, manages a multi-cluster GPU architecture for their AI/ML model training. This architecture introduced significant challenges, including:
- Time-consuming data preparation and data copy/movement
- Difficulty utilizing GPU resources efficiently
- High and growing storage costs
- Excessive operational overhead maintaining storage for localized data silos
To resolve these challenges, Coupang’s AI platform team implemented a distributed caching system that automatically retrieves training data from their central data lake, improves data loading performance, unifies access paths for model developers, automates data lifecycle management, and extends easily across Kubernetes environments. The new distributed caching architecture has improved model training speed, reduced storage costs, increased GPU utilization across clusters, lowered operational overhead, enabled training workload portability, and delivered 40% better I/O performance compared to parallel file systems.

Suresh Kumar Veerapathiran and Anudeep Kumar, engineering leaders at Uptycs, recently shared their experience of evolving their data platform and analytics architecture to power analytics through a generative AI interface. In their Medium post, "Cache Me If You Can: Building a Lightning-Fast Analytics Cache at Terabyte Scale," Veerapathiran and Kumar provide detailed insights into the challenges they faced, and how they solved them, in scaling an analytics solution that collects and reports on terabytes of telemetry data per day as part of the Uptycs Cloud-Native Application Protection Platform (CNAPP) solutions.