A collaboration among Alibaba, Alluxio, and Nanjing University to tackle the problems of deep learning model training in the cloud. Our goal was to reduce the cost and complexity of data access for deep learning training in a hybrid environment; the work reduced training time and cost by over 40%.
Alluxio Resources
Find our rich collection of White Papers, Case Studies, Presentations, and Videos here.




This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. It covers the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results.
Are you using SQL engines such as Presto to query an existing Hive data warehouse and running into challenges such as an overloaded Hive Metastore with slow and unpredictable access, unoptimized data formats and layouts (for example, too many small files), or a lack of influence over the existing Hive system and other Hive applications?
For many latency-sensitive SQL workloads, Presto is often bound by retrieving distant data. In this talk, Rohit Jain from Facebook will introduce their teams’ …
This is an open source community conference focused on the key data engineering challenges and solutions around building cloud-native data and AI platforms using …
In this talk, we will describe how we solved an issue with large S3 API costs incurred by Presto under several usage concurrency …
Electronic Arts (EA) is a leading company in the gaming industry, providing over a thousand games to serve billions of users worldwide. The EA …
In this talk, we will present how, using Alluxio, computation and storage ecosystems can interact better, benefiting the "bringing the data close to the …
Alluxio 2.4.0 focuses on features critical to large-scale production deployments in cloud and hybrid cloud environments. Features such as highly scalable metadata journaling, …
Many companies we talk to have on-premises data lakes and use the cloud(s) to burst compute. Many are now establishing new object data …