Fast Big Data Analytics + Machine Learning Using Alluxio & Spark in Baidu

STRATA+HADOOP WORLD SAN JOSE 2016

A few months ago, Baidu deployed Alluxio to accelerate its big data analytics workloads. Bin Fan and Haojun Wang explain why Baidu chose Alluxio and detail how they achieved a 30x speedup with it in a production environment of hundreds of machines. Building on the success of the big data analytics engine, Baidu is now expanding its Alluxio and Spark infrastructure to accelerate other applications, such as machine learning.

Bin and Haojun also delve into how they built a heterogeneous computing platform to accelerate deep learning workloads. This platform consists of heterogeneous computing resources (CPU, GPU, FPGA) managed by a heterogeneous computing layer, as well as heterogeneous storage resources (memory, SSD, HDD) managed by Alluxio.

Speakers
Bin Fan, VP Open Source and Founding Engineer at Alluxio
Haojun Wang, Tech Lead at Baidu