data analytics Archives

What’s Next for Data Analytics, AI, and Cloud in 2023?

December 27, 2022 By Bin Fan

Originally published on vmblog.com: https://vmblog.com/archive/2022/12/27/alluxio-2023-predictions-what-s-next-for-data-analytics-ai-and-cloud-in-2023.aspx As we enter 2023, the world of analytics, AI, and cloud is entering an exciting new phase, with a wide range of innovations and developments set to reshape the landscape. Below are some trends that will have the most impact in the coming year. Trend 1: Cloud cost optimization is … Continued

Thousand-Node Alluxio Cluster Powers Game AI Platform – A Production Case Study from Tencent

January 26, 2022 By Bing Zheng, Baolong Mao and Zhizheng Pan

To provide model training with the best experience, Tencent has implemented a 1000-node Alluxio cluster and designed a scalable, robust, and performant architecture to speed up Ceph storage for game AI training. This blog will give you insight into how Alluxio has been implemented and optimized at Tencent.

Thousand-Node Alluxio Cluster Powers Game AI Platform – A Production Case Study from Tencent

January 26, 2022 by Bing Zheng, Baolong Mao & Zhizheng Pa, Tencent

Tencent is one of the largest technology companies in the world and a leader in the gaming sector. The game AI platform supports AI research and development at Tencent. To provide model training with the best experience, Tencent has implemented a 1000-node Alluxio cluster and designed a scalable, robust, and performant architecture to accelerate the game AI training.

Tags: ai, benchmark, case study, data analytics, MODEL TRAINING, performance, storage, tencent

Iceberg + Alluxio for Fast Data Analytics

December 14, 2021

This talk provides an overview of the read-after-write data consistent mechanism in the Alluxio system.

Tags: alluxio day, apache iceberg, data analytics

Modernizing Global Shared Data Analytics Platform and our Alluxio Journey

December 13, 2020

In this keynote, you will learn about the evolution of the global data platform at Rakuten spread across multiple regions, and clouds. In addition, you will hear about the journey across the years, and the use of data orchestration for multiple use cases.

Tags: data analytics, data orchestration, data orchestration summit, rakuten

Building High-Performance Data Lake Using Apache Hudi and Alluxio at T3Go

November 20, 2020 By Trevor Zhang (T3Go), Vino Yang (T3Go), Jasmine Wang and Bin Fan

How T3Go’s high-performance data lake using Apache Hudi and Alluxio shortened the time for data ingestion into the lake by up to a factor of 2. Data analysts using Presto, Hudi, and Alluxio in conjunction to query data on the lake saw queries speed up by 10 times faster.

Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-prem storage

September 29, 2020

In this talk, we describe the architecture to migrate analytics workloads incrementally to any public cloud (AWS, Google Cloud Platform, or Microsoft Azure) directly on on-prem data without copying the data to cloud storage.

Tags: cloud, data analytics, data lake, hdfs, hybrid, on-prem, presto, spark, storage

Tag: data analytics