Speed up large-scale ML/DL offline inference job with Alluxio
April 27, 2021
ALLUXIO DAY III 2021
Increasingly powerful compute accelerators and large training datasets have made the storage layer a potential bottleneck in deep learning training and inference.
An offline inference job typically consumes and produces tens of terabytes of data and runs for more than 10 hours.
At large scale, such a job puts high I/O pressure on storage, increases the job failure rate, and poses many challenges for system stability.
We adopted Alluxio as an intermediate storage tier between the compute tier and cloud storage to optimize the I/O throughput of deep learning inference jobs.
For the production workload, performance improved by 18%, and we seldom see job failures caused by storage issues.
Videos
AI/ML Infra Meetup | AI at scale Architecting Scalable, Deployable and Resilient Infrastructure

Pratik Mishra delivered insights on architecting scalable, deployable, and resilient AI infrastructure at scale. His discussion on fault tolerance, checkpoint optimization, and the democratization of AI compute through AMD's open ecosystem resonated strongly with the challenges teams face in production ML deployments.
September 30, 2025
AI/ML Infra Meetup | Alluxio + S3 A Tiered Architecture for Latency-Critical, Semantically-Rich Workloads

In this talk, Bin Fan, VP of Technology at Alluxio, presents on building tiered architectures that bring sub-millisecond latency to S3-based workloads. The comparison showing Alluxio's 45x performance improvement over S3 Standard and 5x over S3 Express One Zone demonstrated the critical role the performance & caching layer plays in modern AI infrastructure.
September 30, 2025
AI/ML Infra Meetup | Achieving Double-Digit Millisecond Offline Feature Stores with Alluxio

In this talk, Greg Lindstrom shared how Blackout Power Trading achieved double-digit millisecond offline feature store performance using Alluxio, a game-changer for real-time power trading where every millisecond counts. The 60x latency reduction for inference queries was particularly impressive.
September 30, 2025