unified namespace Archives

Metadata Synchronization in Alluxio: Design, Implementation and Optimization

December 14, 2021 By David Zhu

Metadata synchronization (sync) is a core feature in Alluxio that keeps files and directories consistent with their source of truth in under storage systems, making it simple for users to reason about the data retrieved from Alluxio. Understanding the internal process is also important for tuning performance. This article describes the design and implementation Alluxio uses to keep metadata synchronized.

Accelerating Analytics by 200% with Impala, Alluxio, and HDFS at Tencent

June 22, 2020 By Honghan Tian (Tencent)

This article describes how engineers in the Data Service Center at Tencent PCG leverage Alluxio to improve analytics performance by 200% and minimize operating cost in building Tencent Beacon Growing, a real-time data analytics platform.

How to Build a new Under Filesystem in Alluxio: Apache Ozone as an Example

Alluxio Global Online Meetup * June 30, 2020

In Alluxio, an Under File System is the plugin that connects to a file system or object store, so users can mount different storage systems such as AWS S3 or HDFS into the Alluxio namespace. The Under File System framework is designed to be modular, so that users can easily extend it with their own Under File System implementation and connect to a new or customized storage system.

Enabling big data & AI workloads on the object store at DBS

October 14, 2019

Vitaliy and Dipti dive into how DBS Bank built a modern big data analytics stack, leveraging an object store as persistent storage even for data-intensive workloads, and how it uses Alluxio to orchestrate data locality and data access for Spark workloads.

Tags: aws, big data, conference, hybrid cloud bursting, object stores, unified namespace

Scalable Filesystem Metadata Services with RocksDB

July 22, 2019

Alluxio maintainer and founding engineer Calvin Jia presents on Scalable Filesystem Metadata Services with RocksDB at the RocksDB meetup at Twitter.

Tags: alluxio engineering, meetup, metadata management, performance, scale, storage, unified namespace

Getting Started with the Alluxio-Presto Sandbox

July 11, 2019 By Zac Blanco

The Alluxio-Presto sandbox is a Docker application featuring installations of MySQL, Hadoop, Hive, Presto, and Alluxio. The sandbox lets you easily dive into an interactive environment where you can explore Alluxio, run queries with Presto, and see the performance benefits of using Alluxio in a big data software stack.

Scalable Metadata Service in Alluxio: Storing Billions of Files

May 10, 2019 By Andrew Audibert

We are writing several engineering blog posts describing the design and implementation of the Alluxio master to address the scalability challenge of storing billions of files. This is the first article, focusing on metadata storage and service, particularly how to use RocksDB as an embedded persistent key-value store to encode and store the file system inode tree with high performance.
Alluxio serves its metadata from a single active primary master, with potentially multiple standby masters for high availability. The primary master handles all metadata requests and uses a write-ahead log to journal all changes so that it can recover from crashes. The log is typically written to shared storage like HDFS for persistence and availability. Standby masters read the write-ahead log to keep their own state up to date. If the primary master dies, one of the standbys can quickly take over for it.
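
The journaling scheme described above can be illustrated with a minimal sketch. This is not Alluxio's implementation: the `Master` class, the JSON-lines journal format, and the local file standing in for shared storage are all assumptions made for illustration. It only shows the general write-ahead pattern, in which the primary appends each change to the log before applying it and a standby replays the entries it has not yet seen.

```python
import json
import os
import tempfile


class Master:
    """Toy metadata master (hypothetical): maps path -> metadata dict."""

    def __init__(self, journal_path):
        self.journal_path = journal_path  # stands in for a log on shared storage
        self.inodes = {}
        self.applied = 0  # number of journal entries applied so far

    def _apply(self, entry):
        # Apply a single journal entry to the in-memory state.
        if entry["op"] == "create":
            self.inodes[entry["path"]] = entry["meta"]
        elif entry["op"] == "delete":
            self.inodes.pop(entry["path"], None)
        self.applied += 1

    def write(self, op, path, meta=None):
        """Primary path: journal the change durably, then mutate state."""
        entry = {"op": op, "path": path, "meta": meta}
        with open(self.journal_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply(entry)

    def catch_up(self):
        """Standby or recovery path: replay entries not yet applied."""
        if not os.path.exists(self.journal_path):
            return
        with open(self.journal_path, encoding="utf-8") as f:
            for i, line in enumerate(f):
                if i >= self.applied:
                    self._apply(json.loads(line))


def demo():
    with tempfile.TemporaryDirectory() as d:
        journal = os.path.join(d, "journal.log")
        primary = Master(journal)
        standby = Master(journal)
        primary.write("create", "/a", {"size": 1})
        primary.write("create", "/b", {"size": 2})
        primary.write("delete", "/a")
        standby.catch_up()  # the standby replays the shared log
        return primary.inodes, standby.inodes
```

After `demo()` runs, the standby holds the same state as the primary, which is what lets a standby take over if the primary fails. A real system would also need leader election and log checkpointing, both omitted here.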

Tag: unified namespace