Developer and Engineering Archives

Cross Cluster Synchronization in Alluxio – Part 3: Discussions and Conclusion

February 8, 2023 By Tyler Crain

Following parts 1 and 2, this final post in the series discusses some design decisions and details, as well as future work. Discussions and Future Work: Why not exactly-once delivery for pub/sub? Exactly-once message delivery for pub/sub would greatly simplify our design, and there do exist many powerful …

Cross Cluster Synchronization in Alluxio – Part 2: Mechanism

February 8, 2023 By Tyler Crain

This is part 2 of the blog series on the design and implementation of the Cross Cluster Synchronization mechanism in Alluxio. The previous post discussed the scenario, the background, and how metadata sync is done within a single Alluxio cluster. This post describes how metadata sync is built upon to provide metadata …

Cross Cluster Synchronization in Alluxio – Part 1: Scenarios and Background

February 8, 2023 By Tyler Crain

This blog series covers the design and implementation of the Cross Cluster Synchronization mechanism in Alluxio, which keeps metadata consistent when running multiple Alluxio clusters. Part 1 discusses the scenario and background. Alluxio lies between the storage and compute layers in order to …

Tutorial of Building Multi-Cloud Data Lake using Delta Lake and Alluxio

October 25, 2022 By Zijian Zhu and Hope Wang

This article introduces how to read and write Delta Lake tables on Alluxio. You can build a multi-cloud data lake using Delta Lake and Alluxio, reducing your data storage costs and increasing flexibility. 1. Overview: 1.1 About Delta Lake: Delta Lake is an open source storage framework that enables building a Lakehouse architecture and brings reliability …

Avoid Data Silos in Presto in Meta: the journey from Raptor to RaptorX

August 29, 2022 By Rongrong Zhong

This post was originally published on the Presto blog (https://prestodb.io/blog/2022/01/28/avoid-data-silos-in-presto-in-meta). Authors: Rongrong Zhong (Alluxio); James Sun and Ke Wang (Meta). Raptor is a Presto connector (presto-raptor) that powers some critical interactive query workloads at Meta (previously Facebook). Though referred to in the ICDE 2019 paper Presto: SQL on Everything, it remains somewhat mysterious to many Presto users …

Alluxio Block Allocation Policy Explained

June 21, 2022 By Xi Chen

Xi Chen, Senior Software Engineer at Tencent and a top-100 contributor to the Alluxio open source project, explains Alluxio's block allocation policy at the code level.

Modernize your analytics workloads with NetApp and Alluxio

June 1, 2022 By Joseph Kandatilparambil

Imagine, as an IT leader, having the flexibility to choose any services available in the public cloud and on premises. And imagine being able to scale the storage for your data lakes with control over data locality and protection for your organization. With these goals in mind, NetApp and Alluxio are joining forces to help our customers adapt to new requirements for modernizing data architecture with low-touch operations for analytics, machine learning, and artificial intelligence workflows.

Designing the Presto Local Cache at Uber | A collaboration between Uber and Alluxio – part 2

May 31, 2022 By Chen Liang and Beinan Wang

In the previous post, we introduced Uber’s Presto use cases and how we collaborated to implement the Alluxio local cache to overcome different challenges in accelerating Presto queries. This second part discusses the improvements to the local cache metadata.

Speed Up Uber’s Presto with Alluxio | A collaboration between Uber and Alluxio – part 1

May 24, 2022 By Chen Liang and Beinan Wang

This article shares how Uber and Alluxio collaborated to design and implement the Presto local cache to reduce HDFS latency.
