Tuesday April 1, 11am PT

DeepSeek’s recent announcement of the Fire-Flyer File System (3FS) has sparked excitement across the AI infra community, promising a breakthrough in how machine learning models access and process data.
In this webinar, an expert in distributed systems and AI infrastructure will take you inside DeepSeek 3FS, a file system purpose-built for large files and high-bandwidth workloads. We’ll break down how 3FS optimizes data access and the design tradeoffs it makes to maximize throughput for AI workloads.
You’ll learn how 3FS works under the hood, including:
✅ The system architecture
✅ Core software components
✅ Read/write flows
✅ Data distribution/placement algorithms
✅ Cluster/node management and disaster recovery
Whether you’re an AI researcher, ML engineer, or infrastructure architect, this deep dive will give you the technical insights you need to determine if 3FS is the right solution for you.
Speaker Bio
Stephen Pu, Staff Software Engineer at Alluxio, has over 15 years of experience in software R&D for data centers and distributed storage systems. He has been involved in the core product development and design of large-scale distributed data platforms at IBM, HPE, and Fortinet. Stephen has deep expertise in the performance, scalability, and reliability of distributed data systems, with a strong understanding of architectural design in these areas.
Events

Join us to learn about the latest release of Alluxio Enterprise AI. In this webinar, discover how Alluxio AI 3.7 eliminates cloud storage latency bottlenecks with breakthrough sub-millisecond performance, delivering up to 45× faster data access than S3 Standard without changing your code. Alluxio AI 3.7 is also packed with new features designed to supercharge your AI infrastructure while keeping your data secure.
Key highlights include:
- Alluxio Ultra Low Latency Caching for Cloud Storage
- Role-Based Access Control (RBAC) for S3 Access
- 5X Faster Cache Preloading with Alluxio Distributed Cache Preloader
- FUSE Non-Disruptive Upgrade
- Other New Features for Alluxio Admins

Storing data as Parquet files on cloud object storage, such as AWS S3, has become prevalent not only for large-scale data lakes but also as lightweight feature stores for training and inference, or as document stores for Retrieval-Augmented Generation (RAG). However, querying petabyte-to-exabyte-scale data lakes directly from S3 remains notoriously slow, with latencies typically ranging from hundreds of milliseconds to several seconds.
In this webinar, David Zhu, Software Engineering Manager at Alluxio, will present the results of a collaboration between Alluxio and a leading SaaS and data infrastructure enterprise. The project explored using Alluxio as a high-performance caching and acceleration layer atop AWS S3 for ultra-fast querying of Parquet files at PB scale.
David will share:
- How Alluxio delivers sub-millisecond Time-to-First-Byte (TTFB) for Parquet queries, comparable to S3 Express One Zone, without requiring specialized hardware, data format changes, or data migration from your existing data lake.
- The architecture that enables Alluxio’s throughput to scale linearly with cluster size, achieving one million queries per second on a modest 50-node deployment, surpassing S3 Express single-account throughput by 50x without latency degradation.
- Specifics on how Alluxio offloads partial Parquet read operations and reduces overhead, enabling direct, ultra-low-latency point queries in hundreds of microseconds and achieving a 1,000x performance gain over traditional S3 querying methods.
Speaker: David Zhu
David Zhu is a Software Engineering Manager at Alluxio, where he focuses on metadata management and end-to-end performance benchmarking and optimization. Before joining Alluxio, David completed his Ph.D. at UC Berkeley, with a focus on distributed data management systems and operating systems for the data center. He also holds a Bachelor of Software Engineering from the University of Waterloo.