Two Ways to Keep Files in Sync Between Alluxio and HDFS
April 16, 2019
Alluxio provides a distributed data access layer that lets applications like Spark or Presto access different underlying file systems (UFS) through a single API in a unified file system namespace. If users only interact with the files in the UFS through Alluxio, Alluxio knows about every change the client makes to the UFS, so it keeps the Alluxio namespace in sync with the UFS namespace (see the left figure below).
However, when a file in the UFS is changed without going through Alluxio, the UFS namespace and the Alluxio namespace can get out of sync. When this happens, a UFS Metadata Sync operation is required to synchronize the two namespaces (illustrated in the right figure).

Sync On-demand
Alluxio automatically caches metadata information from the UFS so that subsequent metadata operations such as listStatus (or ls) will not need to access the UFS. This reduces the latency of these metadata operations. However, sometimes the metadata of the underlying UFS can change without notifying Alluxio. When that happens, this cache needs to be invalidated.
Since version 1.7.0, Alluxio has provided the option alluxio.user.file.metadata.sync.interval, which lets users control how often this metadata cache is refreshed. Any time the client issues a metadata operation such as listStatus, it can set the interval to -1, 0, or a time value. When it is set to -1, Alluxio never fetches metadata information from the UFS. When it is set to 0, Alluxio always fetches metadata information from the UFS. When it is set to a time value, Alluxio fetches the metadata information from the UFS only if it has not already done so within that period.
Here is an example.
$ alluxio fs ls -R -Dalluxio.user.file.metadata.sync.interval=0 /dir
This tells Alluxio to always fetch the metadata information from the UFS.
One thing to note is that the Alluxio system never synchronizes with the UFS unless there is a client request to that UFS. This can cause problems: the first time a particular client accesses the UFS, the extra cost of that access can slow down the client request. This calls for a mechanism that synchronizes the Alluxio namespace and the UFS namespace in the background, known as Active UFS Sync.
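The three settings of the interval described above can be sketched as a small decision function. This is an illustrative model only, not Alluxio's implementation: the function name, the millisecond arguments, and the choice to treat a path that has never been synced as due for a sync are all assumptions made for the example.

```python
NEVER_SYNC = -1   # interval value meaning "never fetch from the UFS"
ALWAYS_SYNC = 0   # interval value meaning "always fetch from the UFS"


def should_sync(interval_ms, last_sync_ms, now_ms):
    """Decide whether metadata should be re-fetched from the UFS.

    Hypothetical helper: interval_ms mirrors the value of
    alluxio.user.file.metadata.sync.interval, last_sync_ms is the time of
    the previous sync for the path (None if it was never synced).
    """
    if interval_ms == NEVER_SYNC:
        return False
    if interval_ms == ALWAYS_SYNC:
        return True
    if last_sync_ms is None:
        # Assumption: a path with no recorded sync is treated as stale.
        return True
    # Sync only if the last sync is older than the configured interval.
    return now_ms - last_sync_ms >= interval_ms


now = 60_000
print(should_sync(-1, None, now))        # False: never sync
print(should_sync(0, 59_999, now))       # True: always sync
print(should_sync(30_000, 45_000, now))  # False: synced 15 s ago
print(should_sync(30_000, 10_000, now))  # True: synced 50 s ago
```

A larger interval trades freshness for fewer UFS calls, which is why the value is left to the client issuing the operation.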
Sync Proactively
The Alluxio 2.0 preview release supports a new feature called “Active UFS Sync”. It lets users specify a directory to be synchronized between the Alluxio namespace and the UFS namespace at a regular interval, with a number of parameters to fine-tune the syncing behavior. Currently, Active UFS Sync is only supported between Alluxio and HDFS 2.7 or later. To use this feature, the user running Alluxio must be an HDFS admin user in order to listen to the event stream that HDFS provides.
To enable active sync on a directory, issue the following Alluxio command on a directory that is backed by HDFS.
$ alluxio fs startSync /syncedDir
You can also stop active sync on a directory by using the following command.
$ alluxio fs stopSync /syncedDir
Note that the list of directories under active sync is remembered between master restarts. You can check which directories are under active sync by using the getSyncPathList command.
$ alluxio fs getSyncPathList
Optimizations
There are a few parameters to optimize the active UFS sync behavior.
Sync interval: Users can control the active sync interval by changing the alluxio.master.activesync.interval option. The default is 30 seconds.
Quiet period: Syncing while a directory is under heavy modification adds more RPC workload to the UFS. To avoid this, Active UFS Sync tries to sync only when the UFS is considered to be in a quiet period.
This quiet period is controlled by alluxio.master.activesync.maxactivity. Activity is a heuristic based on the exponential moving average of the number of events in a directory. For example, if a directory had 100, 10, and 1 events in the past three intervals, its activity would be 100/(10*10) + 10/10 + 1 = 3. The property alluxio.master.activesync.maxactivity is the maximum activity at which the UFS directory is still considered “quiet”. However, if we only sync during quiet periods, we may have to wait a long time, and metadata in the Alluxio namespace can become stale. The property alluxio.master.activesync.maxage is the maximum number of intervals we will wait before synchronizing the UFS and the Alluxio namespace. The system guarantees that it will start syncing a directory if it is “quiet” or if it has not been synced for longer than the max age.
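The activity heuristic and the sync decision can be sketched as follows. This is a minimal model under stated assumptions, not Alluxio's code: the decay factor of 10 is inferred from the worked example (each older interval is divided by another factor of 10), the function names are hypothetical, and the exact comparison operators (at most the max activity, strictly more than the max age) are guesses.

```python
DECAY = 10  # assumed: inferred from the 100, 10, 1 example in the text


def update_activity(previous_activity, events_this_interval, decay=DECAY):
    """Fold one interval's event count into the running activity value."""
    return previous_activity / decay + events_this_interval


def should_start_sync(activity, intervals_since_sync, max_activity, max_age):
    """Sync when the directory is quiet, or when it has waited too long.

    max_activity models alluxio.master.activesync.maxactivity and
    max_age models alluxio.master.activesync.maxage (in intervals).
    """
    is_quiet = activity <= max_activity
    is_stale = intervals_since_sync > max_age
    return is_quiet or is_stale


# Reproduce the example: 100, 10, and 1 events in the past three intervals.
activity = 0.0
for events in [100, 10, 1]:
    activity = update_activity(activity, events)
print(activity)  # 3.0
```

With these rules, a burst of events keeps a directory "busy" for a few intervals, but the max age bound ensures the sync is never postponed indefinitely.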
Conclusion
When using Alluxio, it is important to keep the Alluxio namespace and the UFS namespace consistent. This article describes two ways to perform this synchronization: it can happen with a client call to Alluxio (On-demand) or in the background (Active UFS Sync), and each approach has its own advantages. On-demand UFS metadata sync happens only when a client calls Alluxio, so it allows administrators to precisely control when sync happens. Active UFS Sync happens in the background, so it requires minimal configuration and management. Administrators can choose the right strategy based on the specific use case.