benchmark Archives

Thousand-Node Alluxio Cluster Powers Game AI Platform – A Production Case Study from Tencent

January 26, 2022 By Bing Zheng, Baolong Mao and Zhizheng Pan

To provide model training with the best experience, Tencent has implemented a 1000-node Alluxio cluster and designed a scalable, robust, and performant architecture to speed up Ceph storage for game AI training. This blog will give you insight into how Alluxio has been implemented and optimized at Tencent.

Thousand-Node Alluxio Cluster Powers Game AI Platform – A Production Case Study from Tencent

January 26, 2022 by Bing Zheng, Baolong Mao & Zhizheng Pan, Tencent

Tencent is one of the largest technology companies in the world and a leader in the gaming sector. The game AI platform supports AI research and development at Tencent. To provide model training with the best experience, Tencent has implemented a 1000-node Alluxio cluster and designed a scalable, robust, and performant architecture to accelerate game AI training.

Tags: ai, benchmark, case study, data analytics, MODEL TRAINING, performance, storage, tencent

Machine Learning Model Training with Alluxio: Part 3 – Benchmarking

January 18, 2022 By Lu Qiu and Bin Fan

This blog is the last one in the machine learning series. Our first blog introduced the what and why of our solution, and the second blog compared traditional and Alluxio solutions. This blog will demonstrate how to set up and benchmark the end-to-end performance of the training process.

Accelerating Machine Learning / Deep Learning in the Cloud: Architecture and Benchmark

December 7, 2021

This whitepaper introduces how to speed up end-to-end distributed training in the cloud using Alluxio to accelerate data access. With the help of Alluxio, loading data from cloud storage, training and caching data can be done in a transparent and distributed way as a part of the training process. This whitepaper also demonstrates how to set up and benchmark the end-to-end performance of the training process, along with a comparison of other popular approaches.

Tags: benchmark, cache, cloud, data orchestration, deep learning, distributed training, machine learning, performance, storage

Building a high-performance platform on AWS to support real-time gaming services using Presto and Alluxio

August 4, 2020 By Teng Wang (Electronic Arts), Du Li (Electronic Arts), Yu Jin (Electronic Arts) and Sundeep Narravula (Electronic Arts)

This blog explores an innovative platform with Presto as the computing engine and Alluxio as a data orchestration layer between Presto and S3 storage, to support online services with instantaneous response within the gaming industry. The preliminary results show that Presto with Alluxio outperforms S3 significantly in all cases. Alluxio with metadata caching shows up to 5.9x performance gain when handling large numbers of small files.

Reducing Large S3 API Costs Using Alluxio

July 30, 2020 By Juraj Pohanka (datasapiens), Koen Michiels (datasapiens) and Sam Gilbert (datasapiens)

This article describes how engineers at datasapiens brought down S3 API costs by 200x by implementing Alluxio as a data orchestration layer between S3 and Presto.

Accelerating and Scaling Big Data Analytics with Alluxio and Intel® Optane™ Persistent Memory

May 8, 2020 By Jian Zhang (Intel Corporation), Eugene Ma (Intel Corporation) and Bin Fan

International Data Corporation (IDC) reported that the global datasphere will grow from 33 zettabytes in 2018 to 175 zettabytes by 2025. This trend is becoming more complicated with the variety and velocity of data growth, and it continuously changes the ways data is collected, stored, processed and analyzed. New analytics solutions from machine … Continued

Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel’s Analytics Zoo open source platform powered by oneAPI

April 28, 2020

This article describes how Alluxio accelerates the training of deep learning models in a hybrid cloud environment with Intel’s Analytics Zoo open source platform, powered by oneAPI. Details on the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results, will be discussed.

Tags: analytics, analytics zoo, benchmark, big data, cloud, deep learning applications, hybrid cloud, intel, spark

Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel’s Analytics Zoo open source platform powered by oneAPI

April 27, 2020 By Bin Fan

This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. Details on the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results, will be discussed.

Tag: benchmark