Alibaba, Alluxio, and Nanjing University collaborated to tackle the problems of deep learning model training in the cloud. Our goal was to reduce the cost and complexity of data access for deep learning training in a hybrid environment, and the work resulted in a reduction of over 40% in training time and cost.
Find our rich collection of White Papers, Case Studies, Presentations, and Videos here.
This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. It covers the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results.
Are you using SQL engines, such as Presto, to query an existing Hive data warehouse and running into challenges such as an overloaded Hive Metastore with slow and unpredictable access, unoptimized data formats and layouts (for example, too many small files), or a lack of influence over the existing Hive system and other Hive applications?
RAPIDS is a set of open source libraries enabling GPU-aware scheduling and memory representation for analytics and AI. Spark 3.0 uses RAPIDS for … Continued
Increasingly powerful compute accelerators and large training datasets have made the storage layer a potential bottleneck in deep learning training and inference. Offline inference job usually … Continued
Data Lake Analytics (DLA) is a large-scale serverless data federation service on Alibaba Cloud. One of its serverless analytics engines is based on Presto. … Continued
Many companies we talk to have on-premises data lakes and use the cloud(s) to burst compute. Many are now establishing new object data … Continued
This presentation covers best practices and techniques for setting up and building a cluster with open source Alluxio on AWS EKS, for … Continued
This whitepaper details how to evaluate Alluxio’s data orchestration platform as a distributed cache for Apache Spark in a public cloud or on-premises. We … Continued