Author: David Zhu at Alluxio

Metadata Synchronization in Alluxio: Design, Implementation and Optimization

December 14, 2021 By David Zhu

Metadata synchronization (sync) is a core feature in Alluxio that keeps files and directories consistent with their source of truth in the under storage systems, making it simple for users to reason about the data retrieved from Alluxio. Understanding the internal process is also important for tuning performance. This article describes the design and implementation of metadata synchronization in Alluxio.

Accelerating Analytics and AI with Alluxio and NVIDIA GPUs

March 23, 2021 By Dong Meng, Adit Madan and David Zhu

Data processing is increasingly making use of NVIDIA computing for massive parallelism. Advancements in accelerated compute mean that access to storage must also be quicker, whether in analytics, artificial intelligence (AI), or machine learning (ML) pipelines.

Two Ways to Keep Files in Sync Between Alluxio and HDFS

April 16, 2019 By David Zhu

Alluxio provides a distributed data access layer for applications like Spark or Presto to access different underlying file systems (UFS) through a single API in a unified file system namespace. If users interact with the files in the UFS only through Alluxio, Alluxio knows about every change the client makes to the UFS and keeps the Alluxio namespace in sync with the UFS namespace.

How To Speed Up Alluxio Metadata Operations Up To 100X

October 16, 2018 By David Zhu

This blog describes our experience in speeding up Alluxio metadata operations using fingerprints and Alluxio under store bulk operations. These optimizations are available in the 1.8.1 release.

One of the major values Alluxio provides is a simple and unified interface to manage files and directories on different underlying storage systems. Alluxio acts as an intermediate layer and exposes a file interface for applications to interact with, even though the underlying storage system might be an object store that has a different interface.

David Zhu

Software Engineer, Alluxio