Alluxio Blog

Unifying Cross-region Access in the Cloud at Expedia Group — The Path Toward Data Mesh in the Brand World

July 29, 2022 By Jian Li (Senior Software Engineer @ Expedia Group)

This article shares Expedia's data platform practice of federating data lakes across multiple geographic regions in the cloud. 1. Background: Expedia Group (NASDAQ: EXPE) is an American online travel shopping company for consumer and small business travel. Expedia powers travel for everyone, everywhere through our global platform, with industry-leading technology solutions to …

When AI Meets Alluxio at Bilibili | Building an Efficient AI Platform for Data Preprocessing and Model Training

June 27, 2022 By Lei Li and Zifan Ni

Lei Li, AI Platform Lead, and Zifan Ni, Senior Software Engineer from Bilibili, share how they applied Alluxio to their AI platform to increase training efficiency, as well as best practices including technical architecture and specific tuning tips. Overview: About Bilibili. Bilibili (NASDAQ: BILI) is a leading video community with a mission to enrich the …

Alluxio Block Allocation Policy Explained

June 21, 2022 By Xi Chen

Xi Chen, Senior Software Engineer at Tencent & Top 100 Alluxio open source project contributor, explains the block allocation policy of Alluxio at the code level.

Modernize your analytics workloads with NetApp and Alluxio

June 1, 2022 By Joseph Kandatilparambil

Imagine as an IT leader having the flexibility to choose any services that are available in public cloud and on premises. And imagine being able to scale your storage for your data lakes with control over data locality and protection for your organization. With these goals in mind, NetApp and Alluxio are joining forces to help our customers adapt to new requirements for modernizing data architecture with low-touch operations for analytics, machine learning, and artificial intelligence workflows.

Designing the Presto Local Cache at Uber | A collaboration between Uber and Alluxio – part 2

May 31, 2022 By Chen Liang and Beinan Wang

In the previous post, we introduced Uber’s Presto use cases and how we collaborated to implement the Alluxio local cache to overcome challenges in accelerating Presto queries. This second part discusses the improvements to the local cache metadata.

Speed Up Uber’s Presto with Alluxio | A collaboration between Uber and Alluxio – part 1

May 24, 2022 By Chen Liang and Beinan Wang

This article shares how Uber and Alluxio collaborated to design and implement Presto local cache to reduce HDFS latency.

Deep Dive into the Implementation of Alluxio Metadata Storage

May 18, 2022 By Changsheng Gu

This article introduces the design and implementation of metadata storage in the Alluxio Master, both on heap and off heap (based on RocksDB).

What’s New in Alluxio 2.8: Enhanced S3 API Functionality, Enterprise-grade Security and Data Migration With Better Usability and Low Cost

May 4, 2022 By Adit Madan and Hope Wang

The Alluxio 2.8 version focuses on the S3 API, enterprise-grade security, scalability and observability in data migration. Enhanced S3 API makes managing Alluxio easier than ever. Features such as encryption at rest and policy-driven data management further improve Alluxio’s functionality to support enterprise customers.

From Zookeeper to Raft: How Alluxio Stores File System State with High Availability and Fault Tolerance

April 13, 2022 By Tyler Crain

Raft is an algorithm for state machine replication that ensures high availability (HA) and fault tolerance. This blog shares how Alluxio has moved to a Zookeeper-less, built-in Raft-based journal system as its HA implementation.