Tech Talk: Accelerate Spark Workloads on S3

June 28, 2019

Dipti Borkar

VP of Product

Alluxio

Running analytics workloads with EMR Spark on S3 is a common deployment today, but many organizations run into performance and consistency issues. EMR can be bottlenecked when reading large amounts of data from S3, and sharing data across multiple stages of a pipeline can be difficult because S3 is eventually consistent in read-your-own-write scenarios.

A simple solution is to run Spark on Alluxio as a distributed cache for S3. Alluxio stores data in memory close to Spark, providing high performance, in addition to providing data accessibility and abstraction for deployments in both public and hybrid clouds.

In this webinar you’ll learn how to:

Increase performance by setting up Alluxio so Spark can seamlessly read from and write to S3
Use Alluxio as the input/output for Spark applications
Save and load Spark RDDs and DataFrames with Alluxio
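
As a rough illustration of the last two points, the sketch below shows one way a PySpark job might read from and write to Alluxio paths. It is not taken from the talk. It assumes the Alluxio client jar is already on the Spark driver and executor classpath, that the S3 bucket is mounted into the Alluxio namespace, and that the master listens on the default RPC port 19998. The host name, the paths, and the helper names (alluxio_uri, save_and_load_rdd, save_and_load_dataframe) are hypothetical.

```python
# Hypothetical host name; 19998 is the default Alluxio master RPC port.
ALLUXIO_MASTER = "alluxio-master"
ALLUXIO_PORT = 19998


def alluxio_uri(path, host=ALLUXIO_MASTER, port=ALLUXIO_PORT):
    """Build an alluxio:// URI for a path in the Alluxio namespace."""
    if not path.startswith("/"):
        path = "/" + path
    return f"alluxio://{host}:{port}{path}"


def save_and_load_rdd(spark, rdd, path):
    """Write an RDD to Alluxio as text files, then read it back as an RDD."""
    uri = alluxio_uri(path)
    rdd.saveAsTextFile(uri)
    return spark.sparkContext.textFile(uri)


def save_and_load_dataframe(spark, df, path):
    """Write a DataFrame to Alluxio as Parquet, then read it back."""
    uri = alluxio_uri(path)
    df.write.mode("overwrite").parquet(uri)
    return spark.read.parquet(uri)
```

With an existing SparkSession named spark, a call such as save_and_load_dataframe(spark, df, "/pipeline/stage1") would write to and reload from the Alluxio path, so a later pipeline stage reads the Alluxio copy instead of going back to S3 directly.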

Videos:

Presentation Slides:

Tech Talk: Accelerate Spark Workloads on S3 from Alluxio, Inc.
