On Demand Video

Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud

In this session, cloud optimization specialists Chunxu and Siyuan break down the challenges of running distributed PyTorch/Ray workloads in the cloud and present an architecture designed to optimize I/O across the data pipeline so that GPUs run at peak performance. They cover practical applications of the integrated PyTorch/Ray + Alluxio + S3 solution. Beyond the theory, attendees get hands-on instructions and demonstrations of deploying this architecture in Kubernetes, tailored for TensorFlow/PyTorch/Ray workloads in the public cloud.

Chunxu Tang is a Research Scientist at Alluxio and a committer of PrestoDB, working on developing distributed data systems for interactive data analytics and machine learning workloads. Prior to Alluxio, he served as a Senior Software Engineer on Twitter’s data platform and machine learning infrastructure. He received his Ph.D. from Syracuse University, where he conducted research on distributed collaboration systems and machine learning applications.

Siyuan Sheng is a Senior Software Engineer at Alluxio. Previously, he worked as a Software Engineer on Rubrik’s Appflows team. Siyuan received his MS in Computer Science from CMU. He also loves snowboarding in his spare time.