Alluxio, formerly Tachyon, began as a research project at UC Berkeley's AMPLab in 2012. This year we announced the 1.0 release of Alluxio, the world's first memory-speed virtual distributed storage system, which unifies data access and bridges computation frameworks and underlying storage systems. We have been working closely with the Alluxio community to realize the vision of Alluxio becoming the de facto storage unification layer for big data and other scale-out application environments.

Today, we're excited to announce that the Alluxio open source project is adopting the Benevolent Dictator For Life (BDFL) model. Day-to-day management of the project will be carried out by the Project Management Committee (PMC). Within the PMC, Maintainers are responsible for upholding the quality of the code in their respective components. You can find the list of initial PMC members and maintainers here. As the co-creator of the Alluxio open source project, I have been shepherding it since open-sourcing it in 2013, and I will assume the role of the BDFL.

The Alluxio community is growing faster than ever. With the adoption of this project management mechanism, we believe the project's growth will accelerate further, enabling contributors around the world to collaborate easily and bring exciting new features and improvements to Alluxio. If you would like to join the Alluxio community, you can take your first step here.

In this blog post, Greg Lindstrom, Vice President of ML Trading at Blackout Power Trading, an electricity trading firm in North American power markets, shares how his team leverages Alluxio to power its offline feature store. This approach delivers multi-join query performance in the double-digit-millisecond range while retaining the cost and durability benefits of Amazon S3 as persistent storage. As a result, the team achieved a 22–37x reduction in large-join query latency for training and a 37–83x reduction in large-join query latency for inference.
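To make the pattern concrete, here is a minimal sketch of how an offline feature store might read through Alluxio: feature tables persist as Parquet files in S3, an Alluxio FUSE mount serves them from cache, and the client performs the multi-table join locally. This is not Blackout's actual implementation; the mount point, table names, and join keys are all hypothetical.

```python
# Sketch: reading feature tables through an Alluxio FUSE mount backed by S3.
# Alluxio caches hot Parquet files in memory, while S3 remains the durable
# system of record. All paths and column names here are hypothetical.
import pandas as pd

ALLUXIO_MOUNT = "/mnt/alluxio/feature-store"  # hypothetical mount of an S3 bucket

def load_features(entity_ids):
    # Each read is a plain POSIX read; a cache miss is fetched from S3
    # transparently, and subsequent reads are served from Alluxio's cache.
    prices = pd.read_parquet(f"{ALLUXIO_MOUNT}/prices.parquet")
    load = pd.read_parquet(f"{ALLUXIO_MOUNT}/load_forecast.parquet")
    weather = pd.read_parquet(f"{ALLUXIO_MOUNT}/weather.parquet")

    # Multi-join across feature tables, keyed on grid node and timestamp.
    features = (
        prices.merge(load, on=["node_id", "ts"])
              .merge(weather, on=["node_id", "ts"])
    )
    return features[features["node_id"].isin(entity_ids)]
```

Because the cache sits between the query engine and S3, the join itself runs against memory-speed reads, which is where the latency reductions described above come from.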

In the latest MLPerf Storage v2.0 benchmarks, Alluxio demonstrated how distributed caching accelerates I/O for AI training and checkpointing workloads, achieving up to 99.57% GPU utilization on workloads that typically leave GPUs underutilized due to I/O bottlenecks.
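As a rough illustration of the access pattern these benchmarks exercise, the sketch below routes training reads and checkpoint writes through an Alluxio FUSE mount, so repeated epoch reads are served from the distributed cache rather than the backing store. The paths, dataset format, and checkpoint layout are hypothetical assumptions, not the MLPerf harness itself.

```python
# Sketch: training I/O through an Alluxio FUSE mount (paths hypothetical).
# Repeated epoch reads hit the distributed cache, keeping GPUs fed;
# checkpoints written to the mount are persisted to the backing store
# according to the configured Alluxio write type.
from pathlib import Path
import torch
from torch.utils.data import Dataset, DataLoader

DATA_DIR = Path("/mnt/alluxio/train")        # hypothetical dataset mount
CKPT_DIR = Path("/mnt/alluxio/checkpoints")  # hypothetical checkpoint location

class FileDataset(Dataset):
    def __init__(self, root):
        self.files = sorted(root.glob("*.pt"))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        # Standard POSIX read; served from Alluxio's cache after the first epoch.
        return torch.load(self.files[idx])

loader = DataLoader(FileDataset(DATA_DIR), batch_size=32, num_workers=8)

def save_checkpoint(model, step):
    # Checkpoint write lands on the mount; Alluxio handles persistence.
    torch.save(model.state_dict(), CKPT_DIR / f"step_{step}.pt")
```

The point of the pattern is that neither the data loader nor the checkpoint code needs Alluxio-specific APIs; ordinary file I/O against the mount is enough for the cache to absorb the repeated reads that otherwise stall GPUs.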