Speeding up I/O for Machine Learning feat. Apple Case Study using TensorFlow, NFS, DC/OS, and Alluxio


feat. Apple Case Study using TensorFlow, NFS, DC/OS, and Alluxio

ALLUXIO ONLINE MEETUP

Data scientists and platform engineers often face a challenge when the input data for machine learning jobs is stored in remote storage like NFS or cloud storage like S3. Accessing the data directly is slow, unstable, and expensive; manually duplicating the data to the training clusters introduces large overhead and complicated data curation, and often requires engineers to build ETL pipelines.

This talk will show how Alluxio can greatly simplify the data preparation phase when working with remote and possibly multiple data sources. We will share lessons and benchmarks from Bill Zhao, an engineering lead at Apple, from building a machine learning platform using TensorFlow, NFS, DC/OS, and Alluxio.

In this online meetup, you will learn about:

  • When Alluxio can help a machine learning platform;
  • How to set up a POSIX endpoint for the Alluxio service to unify file system access to S3, HDFS, and Azure Blob Storage;
  • How to run TensorFlow to train models on remote input data as if it were on the local file system.
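
As a rough illustration of the last two points, here is a minimal sketch (not taken from the talk) of what "accessing remote data like a local file system" can look like once a POSIX endpoint is mounted. The mount path `/mnt/alluxio`, the `ALLUXIO_MOUNT` environment variable, the `.tfrecord` suffix, and all function names are assumptions made for this example; the code uses only ordinary Python file I/O and no Alluxio-specific API.

```python
import os
from pathlib import Path
from typing import Iterator, List

# Hypothetical mount point (an assumption for this sketch); the real path
# depends on how the POSIX endpoint is configured on the training hosts.
ALLUXIO_MOUNT = Path(os.environ.get("ALLUXIO_MOUNT", "/mnt/alluxio"))


def list_training_files(root: Path, suffix: str = ".tfrecord") -> List[Path]:
    """Return every file under root with the given suffix, in sorted order."""
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())


def read_in_chunks(path: Path, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the file's contents in fixed-size chunks using plain file I/O."""
    with path.open("rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            yield chunk


def total_bytes(root: Path, suffix: str = ".tfrecord") -> int:
    """Read every matching file once and return the number of bytes read."""
    return sum(
        len(chunk)
        for path in list_training_files(root, suffix)
        for chunk in read_in_chunks(path)
    )
```

Because the mount behaves like a local directory, the paths returned by `list_training_files(ALLUXIO_MOUNT)` could in principle be handed to any input pipeline that accepts local file names, which is the idea behind training on remote data without copying it first.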

Speakers:

Bill Zhao is a technical leader in large-scale workloads with general-purpose GPUs, such as distributed deep learning, deep reinforcement learning, and big data analytics. Prior to Apple, he was a big-data researcher at the UC Berkeley AMP Lab under the supervision of David Patterson, Ion Stoica, and Anthony Joseph, and an early contributor to widely used datacenter software such as Apache Mesos, Spark, and Alluxio. He also helps with Stanford DAWNBench, an ML performance benchmark, and is an early contributor to the industry-standard MLPerf benchmark. Bill holds an MS/BA degree in Computer Science from the University of California at Berkeley.

Bin Fan is the founding engineer of Alluxio, Inc. and a PMC member of the Alluxio open source project. Prior to Alluxio, he worked for Google, where he won the Technical Infrastructure Award. Bin received his Ph.D. in Computer Science from Carnegie Mellon University, working on distributed systems.

Questions? Chat on Slack with the speakers, users, and many other community members!
Join the Alluxio Global Online Meetup Group to attend online meetups like this!
