This video was originally published on TechArena.
At NVIDIA GTC 2025, Bin Fan from Alluxio and Scott Shadley from Solidigm tackled the growing need for decoupled storage and compute in AI infrastructure. They explained how Alluxio's caching layer enables fast, scalable and reliable infrastructure, accelerating AI training and inferencing seamlessly across regions and platforms.
Videos
In this talk, Sandeep Joshi, Senior Manager at NVIDIA, shares how to accelerate data access between GPU and storage for AI. Sandeep dives into two options: CPU-initiated GPUDirect Storage and GPU-initiated SCADA.
Bin Fan, VP of Technology at Alluxio, introduces how Alluxio, a software layer that sits transparently between applications and S3 (or other object stores), provides sub-ms time to first byte (TTFB), with up to 45x lower latency.
In this talk, Pritish Udgata from Adobe provides a comprehensive overview of implementation challenges and solutions for LLM agents.
Topics include:
- CoT vs RAG vs Agentic AI
- Anatomy of an agent
- Single Agent with MCP
- Multi Agents with A2A
- Implementation Challenges and Solutions