Products
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
January 29, 2024
As a cache eviction algorithm, FIFO has a lot of attractive properties, such as simplicity, speed, scalability, and flash-friendliness. The most prominent criticism of FIFO is its low efficiency (high miss ratio). In this talk, Juncheng Yangb describes a simple, scalable FIFO-based algorithm with three static queues (S3-FIFO). Evaluated on 6594 cache traces from 14 datasets, we show that S3- FIFO has lower miss ratios than state-of-the-art algorithms across traces. Moreover, S3-FIFO’s efficiency is robust — it has the lowest mean miss ratio on 10 of the 14 datasets. FIFO queues enable S3-FIFO to achieve good scalability with 6× higher throughput compared to optimized LRU at 16 threads. Our insight is that most objects in skewed workloads will only be accessed once in a short window, so it is critical to evict them early (also called quick demotion). The key of S3-FIFO is a small FIFO queue that filters out most objects from entering the main cache, which provides a guaranteed demotion speed and high demotion precision.
As a cache eviction algorithm, FIFO has a lot of attractive properties, such as simplicity, speed, scalability, and flash-friendliness. The most prominent criticism of FIFO is its low efficiency (high miss ratio). In this talk, Juncheng Yangb describes a simple, scalable FIFO-based algorithm with three static queues (S3-FIFO). Evaluated on 6594 cache traces from 14 datasets, we show that S3- FIFO has lower miss ratios than state-of-the-art algorithms across traces. Moreover, S3-FIFO’s efficiency is robust — it has the lowest mean miss ratio on 10 of the 14 datasets. FIFO queues enable S3-FIFO to achieve good scalability with 6× higher throughput compared to optimized LRU at 16 threads. Our insight is that most objects in skewed workloads will only be accessed once in a short window, so it is critical to evict them early (also called quick demotion). The key of S3-FIFO is a small FIFO queue that filters out most objects from entering the main cache, which provides a guaranteed demotion speed and high demotion precision.
Video:
Presentation slides:
Videos:
Presentation Slides:
Complete the form below to access the full overview:
.png)
Videos
AI/ML Infra Meetup | AI at scale Architecting Scalable, Deployable and Resilient Infrastructure

Pratik Mishra delivered insights on architecting scalable, deployable, and resilient AI infrastructure at scale. His discussion on fault tolerance, checkpoint optimization, and the democratization of AI compute through AMD's open ecosystem resonated strongly with the challenges teams face in production ML deployments.
September 30, 2025
AI/ML Infra Meetup | Alluxio + S3 A Tiered Architecture for Latency-Critical, Semantically-Rich Workloads

In this talk, Bin Fan, VP of Technology at Alluxio, presents on building tiered architectures that bring sub-millisecond latency to S3-based workloads. The comparison showing Alluxio's 45x performance improvement over S3 Standard and 5x over S3 Express One Zone demonstrated the critical role the performance & caching layer plays in modern AI infrastructure.
September 30, 2025
AI/ML Infra Meetup | Achieving Double-Digit Millisecond Offline Feature Stores with Alluxio

In this talk, Greg Lindstrom shared how Blackout Power Trading achieved double-digit millisecond offline feature store performance using Alluxio, a game-changer for real-time power trading where every millisecond counts. The 60x latency reduction for inference queries was particularly impressive.
September 30, 2025