Speed up large-scale ML/DL offline inference job with Alluxio

April 27, 2021

Binyang Li

Software Engineer

Bing

Qianxi Zhang

Research Software Engineer

MSRA

ALLUXIO DAY III 2021

Increasingly powerful compute accelerators and large training datasets have made the storage layer a potential bottleneck in deep learning training and inference.

An offline inference job usually consumes and produces tens of terabytes of data and runs for more than 10 hours.

At large scale, such a job puts heavy I/O pressure on storage, increases the job failure rate, and poses many challenges for system stability.

We adopt Alluxio as an intermediate storage tier between the compute tier and cloud storage to optimize the I/O throughput of deep learning inference jobs.

For the production workload, performance improves by 18%, and we seldom see jobs fail because of storage issues.

Video:

Presentation Slides:

Speed up large-scale ML/DL offline inference job with Alluxio from Alluxio, Inc.
