Alluxio, formerly Tachyon, began as a research project when I was a Ph.D. student at UC Berkeley’s AMPLab in 2012. At the time, Spark and Mesos were taking off; we saw what they could do for compute and resource management, respectively, but the storage piece of the story was missing. Together with my research group, we started investigating how to enable memory-speed data sharing across different applications.
I built the first version of Alluxio during Christmas of 2012 and open-sourced it in April 2013. Two years later, Alluxio, Inc. was founded with a $7.5 million investment from Andreessen Horowitz, to provide a commercial backer for the project and to realize the vision of Alluxio as the de facto storage unification layer for big data and other scale-out application environments.
Today, we are very excited to announce the 1.0 release of Alluxio, the world’s first memory-centric virtual distributed storage system, which unifies data access and bridges computation frameworks and underlying storage systems. Applications only need to connect to Alluxio to access data stored in any underlying storage system. Additionally, Alluxio’s memory-centric architecture enables data access orders of magnitude faster than existing solutions.
Now, organizations can run any computation framework (Apache Spark, Apache MapReduce, Apache Flink, etc.) with any storage system (Alibaba OSS, Amazon S3, OpenStack Swift, GlusterFS, Ceph, etc.), leveraging any storage media (DRAM, SSD, HDD, etc.).
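In practice, Hadoop-based frameworks such as Spark and MapReduce reach Alluxio through its Hadoop-compatible filesystem interface. A minimal sketch of the `core-site.xml` wiring (the hostname and port below are illustrative, not from this post):

```xml
<!-- core-site.xml: register the Alluxio client as a Hadoop-compatible filesystem.
     With the Alluxio client jar on the application classpath, jobs can then
     address data as alluxio://<master-host>:19998/<path> without code changes. -->
<property>
  <name>fs.alluxio.impl</name>
  <value>alluxio.hadoop.FileSystem</value>
</property>
```

Because the framework sees an ordinary Hadoop filesystem, swapping the underlying store (S3, Swift, GlusterFS, and so on) becomes an Alluxio-side mount decision rather than an application change.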

What we have accomplished
Over the past three years, Alluxio has evolved from a small codebase for research prototyping into a stable and reliable system with a vibrant community, deployed by companies around the world. Since the first open source release, our community has grown from 1 contributor to more than 200 contributors from over 50 companies. There are production deployments of Alluxio running on hundreds of machines. Our meetup group has grown to more than 800 people, and the most recent meetup had over 300 registrants. The number of commits has grown from 200 to more than 12,000.


Beyond the numbers, we have seen Alluxio solving critical problems in enterprises across different industries around the world. For example, search giant Baidu has been running Alluxio in production for more than a year, and Alluxio brings them a 30x performance improvement. Barclays, a world-leading bank, uses Alluxio to make the impossible possible by reducing end-to-end latency from hours to seconds. Public cloud providers such as Alibaba and Rackspace have also shown how Alluxio virtualizes their object storage systems. Intel has published articles showcasing several ways to leverage Alluxio in their customers’ environments. IBM presented how Alluxio can abstract OpenStack storage to enable fast data analytics.
It has been exciting to see project adoption grow from zero companies to many, including various industry leaders. The achievements thus far validate the tremendous potential of Alluxio and demonstrate the industry’s and community’s great excitement around it.

What to look forward to
Alluxio and its community have grown tremendously in many aspects over the past three years. With the increase in adoption of Alluxio and the growing community, we established a nonprofit organization, Alluxio Open Foundation, to provide a better venue for the project -- stay tuned for the details. The project has been rebranded from Tachyon to Alluxio to protect it from potential trademark litigation and to preserve the intellectual property of the open source software community’s contributions internationally.
Furthermore, in response to growing demand for a forum to communicate and learn about Alluxio, we are planning for the first Alluxio Conference to take place later this year in the San Francisco Bay Area. If you are interested in presenting, attending, or sponsoring, please let us know.
Kudos to the Alluxio community for all we have achieved. Let us look forward to the future!