Alluxio 2.9.0 Release

We are thrilled to announce the release of Alluxio 2.9.0! This is the first release on the Alluxio 2.9.X line. This release introduces a feature for fine-grained caching of data, metrics for monitoring the health of the master process, and changes to the default configuration to better handle master failover and journal backups. Multiple improvements and fixes were also made to the S3 API, Helm charts, and POSIX API.

Highlights

Paging storage on Workers

Alluxio workers now support fine-grained, page-level caching, typically with a 1 MB page size, as an alternative to the existing block-based tiered caching, whose block size defaults to 64 MB. This feature improves caching performance by reducing read amplification, that is, the extra data cached beyond what applications actually read. See the documentation for more details.
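To make the amplification argument concrete, here is a minimal sketch of the arithmetic. It uses a simplified model in which a read brings every touched caching unit (block or page) into the cache in full. The function names are hypothetical and this is not Alluxio code; real workers may behave differently, for example at the end of a file.

```python
MB = 1024 * 1024
KB = 1024


def bytes_cached(read_offset: int, read_length: int, unit_size: int) -> int:
    """Bytes brought into cache when every touched unit is cached in full.

    Simplified model (an assumption): units are aligned at multiples of
    unit_size and are always cached whole.
    """
    if read_length <= 0:
        return 0
    first_unit = read_offset // unit_size
    last_unit = (read_offset + read_length - 1) // unit_size
    return (last_unit - first_unit + 1) * unit_size


def read_amplification(read_offset: int, read_length: int, unit_size: int) -> float:
    """Ratio of bytes cached to bytes the application asked for."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return bytes_cached(read_offset, read_length, unit_size) / read_length


if __name__ == "__main__":
    # A 4 KB read at offset 10 MB, under the two unit sizes named above.
    for label, unit in (("64 MB block", 64 * MB), ("1 MB page", 1 * MB)):
        amp = read_amplification(10 * MB, 4 * KB, unit)
        print(f"{label}: {amp:.0f}x amplification")
```

Under this model, a 4 KB random read caches 64 MB with block-based caching but only 1 MB with page-based caching, which is where the benefit for small or random reads comes from.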

Master monitoring metrics

The Alluxio master periodically checks its resource usage, including CPU and memory usage, along with several performance-critical internal data structures. From snapshots of these resource utilization metrics, the master infers the state of the system, which is exposed through the master.system.status metric. The possible statuses are:

  • IDLE
  • ACTIVE
  • STRESSED
  • OVERLOADED

These monitoring indicators describe the system status heuristically, giving a basic picture of the load on the master. See the documentation for more information about monitoring.
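As an illustration of how the four statuses might be consumed by an alerting script, the sketch below treats them as an ordered scale from IDLE to OVERLOADED. The ordering, the numeric values, and the should_alert helper are assumptions made for this example; they are not Alluxio's internal encoding, and how the metric value is obtained is left out.

```python
from enum import IntEnum


class SystemStatus(IntEnum):
    """The four statuses listed above, ordered by increasing load.

    The numeric values are illustrative only (an assumption), chosen so that
    statuses can be compared with >=.
    """
    IDLE = 0
    ACTIVE = 1
    STRESSED = 2
    OVERLOADED = 3


def should_alert(status_name: str,
                 threshold: SystemStatus = SystemStatus.STRESSED) -> bool:
    """Return True if the reported status is at or above the threshold.

    status_name is the status as text, e.g. "STRESSED". An unknown name
    raises KeyError rather than being silently ignored.
    """
    status = SystemStatus[status_name.strip().upper()]
    return status >= threshold


if __name__ == "__main__":
    for name in ("IDLE", "ACTIVE", "STRESSED", "OVERLOADED"):
        print(name, should_alert(name))
```

Because the statuses are heuristic, a threshold such as STRESSED is better suited to raising a warning for a human to inspect than to triggering automated action.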

Journal and failover stability

The default configuration as of 2.9.0 skips the block integrity check upon master startup and failover (a493b69e2d). This speeds up failovers considerably to minimize system downtime during master leadership transfers. Instead, block integrity checks will be performed in the background periodically so as not to interfere with normal master operations.

Another default configuration change delegates the journal backup operation to a standby master (e3ed7b674f) so that the leading master’s operations are not blocked for an extended period of time. Use the --allow-leader flag to allow the leading master to also take a backup, or force the leader to take a backup with the --bypass-delegation flag. See the documentation for additional information about backup delegation.
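The two flags could be wired into an operations script roughly as follows. This is a sketch under assumptions: it presumes the flags are passed to the alluxio fsadmin backup command, that the alluxio launcher is on the PATH, and that an optional backup directory is given as a positional argument. The build_backup_command helper is hypothetical, and it only assembles the argument list without running anything.

```python
from typing import List, Optional


def build_backup_command(backup_dir: Optional[str] = None,
                         allow_leader: bool = False,
                         bypass_delegation: bool = False) -> List[str]:
    """Assemble an argument list for a journal backup.

    Assumptions: the command is `alluxio fsadmin backup`, and backup_dir,
    when given, is a positional argument. Only the two flags named in the
    release notes are handled.
    """
    cmd = ["alluxio", "fsadmin", "backup"]
    if backup_dir:
        cmd.append(backup_dir)
    if allow_leader:
        # Let the leading master take the backup as well.
        cmd.append("--allow-leader")
    if bypass_delegation:
        # Force the leading master to take the backup itself.
        cmd.append("--bypass-delegation")
    return cmd


if __name__ == "__main__":
    # Default in 2.9.0: no flags, so the backup is delegated to a standby.
    print(" ".join(build_backup_command()))
    print(" ".join(build_backup_command(allow_leader=True)))
```

The resulting list can be handed to subprocess.run once the command form has been checked against the backup documentation for the installed version.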

Improvements and fixes since 2.8.1

Metadata and journal

  • Add cli for marking a path as needing sync with UFS (1c781f7de1)
  • Make metadata sync work with merge inode journal feature flag (7ff9df2789)
  • Make journal context thread safe (f5b2a5f438)
  • Fix error in snapshot-taking when using large group ids (2803dc4603)
  • Mark root as needing sync on backup restore (0fe867ac72)
  • Make MountTable.State thread safe (45ce753499)
  • Avoid canceling duplicate metadata sync prefetch job (aacee53fbc)
  • Support multithread checkpointing with compression/decompression (ae065e34b9)
  • Avoid and dedup concurrent metadata sync (025ca19d09)
  • Refine add mPendingPaths in InodeSyncStream new type (749f70cd1a)
  • Fix single master embedded journal checkpoint (a115fa3d5b)
  • Create inode before updating the MountTable (56d1c6a9bc)
  • Allow root sync path to use child sync time (4d6bce3d5f)
  • Add client operation for partial listing (bc6e63f7f8)
  • Determine the primary master address by calling GetNodeState endpoint (c63ec951e4)
  • Upgrade Apache Ratis from 2.0.0 to 2.3.0 (dc0a21daed)
  • Merge journals & flush journals before lock release (8dafc272be)
  • Add Partial listing of files in listStatus (5f50dd8ab3)
  • Improve error handling and naming on journal threads (e69adee025)
  • Add Partial listing of files in listStatus (ec0ce2a656)
  • Fix journal shutdown deadlock (155a370fbe)
  • Allow writing to read-only file when creating (fe27139f6c)
  • Avoid checking file permissions in getFileInfo method (54494af052)
  • Fix master down when master change to leader (1ada2ac8c7)

Cache and storage

  • Implement Byte array pool (d071d5ef7c)
  • Add block size for paged blocks (fd865e24a1)
  • Make worker init tiers parallel (2af80f6e19)
  • Fix early release of buffer (5e8d62c8a7)
  • Separate page store configuration from client cache (f6cce53631)
  • Optimize getFileBlockLocations performance (e5692d2261)
  • Ignore parent path NoSuchFileException when localPageStore delete pageId (6a64a178b2)
  • Support size encoding for clock cuckoo filter in shadowcache (7395bed318)
  • Do not add the worker to failed list for client exception (951f3568a2)
  • Fix the leak of block lock (b97677e78e)
  • Fix failures in mem page store when zero copy enabled (cbc2008b19)
  • Fix load sessionId (d763b4077e)
  • Fix paged block reader transfer offset (463506e178)
  • Implement locking and pinning for paged block store (9c67f34682)
  • Allocate buffer in load api (b93535cb6a)
  • Fix worker stream register forget release lease (2df21b9aee)
  • Fix paged block store tier name (92c0a4fb73)
  • Fix potential deadlock in tier store and refine the code (d1efe52f44)
  • Fix MaxFreeAllocator.allocateBlock (21be94c923)
  • Disable passive cache for pinned files (0e53f84b7c)
  • Fix out of bound read in PagedBlockReader (63318bb3bb)

S3 API and proxy

  • Fix out of bound error in parsing s3 Authorization header (5f95931723)
  • Implement S3 headbucket API (0236480f58)
  • Fix double checked locking in S3 uploader (cdbfa096b2)
  • Fix special char support (1ea5a840ac)
  • Fix aws s3 cp with source object having special characters in it (7eba438a6f)
  • Respect prefix param to avoid recursive ls on root dir (619089a7ea)
  • Fix S3 API file mode bits to prevent unauthorized reads (87602c7a71)
  • Add empty string check for delimiter (de0bebfeed)
  • Extract Authentication and common logging into the specific filters (e5f9b8d7a0)
  • Fix access control issues with S3 API metadata directory (7ab414ed06)
  • Update ListMultipartUploads to prevent leaking other users’ upload IDs (9364fd0190)
  • Require valid “Authorization” header in S3 API Proxy (2f3c42cf8e)
  • Fix S3 API writing objects yields BucketNotFound 404 (72f68208d4)
  • Add s3 rest service audit log (fdcba75c7e)

Kubernetes and docker

FUSE/POSIX API

  • Modify Libfuse version configuration (496e91069b)
  • Fix fuse mount options and refactor path cache loader (9cfba09aa0)
  • Make configuration source of truth in AlluxioFuse (3a26baeabc)
  • Fix alluxio-fuse unmount to get the right pid when no options (49022a5443)
  • Support setting sleep time for alluxio-fuse mount (ca8db364cf)
  • Avoid chown if the file already has correct owner and group (a4a85ee0bf)
  • Fix fuse check file name length method name (d3dee12ce4)

UFS

  • Add Support for Azure Data Lake Gen2 MSI (5dfa1789c6)
  • Add support to ofs schema name (cecaa37744)
  • Add configuration for kerberos authentication for Ufs HDFS (72f6763f13)
  • Delete temporary files when uploading files to OBS fails (65a8084709)
  • Fix ozone mount failure (4cc34dff2b)

CLI

  • Support ignore delete mount point directory by ttl action (2603b609ed)
  • Support strict version match option for mount (2ed18f54a3)
  • Support getMountTable without invoke ufs (86e2f8210f)
  • Support record audit log for getMountTable op (82aa6633ff)

Error and exception handling

  • Make worker error propagate to client (9cee334eb2)
  • Fix worker swallow OOM (aae5a02a5f)
  • Support failover worker while reading (5de0314361)
  • Filter exception that need to be retried in ObjectUnderFileSystem (d5ea085afc)
  • Force metadata sync when data read fails due to out-of-range error (badca18e3f)
  • Catch runtime exception in rpc (4b3fac5dd1)
  • Update worker exception (6982d6c759)

Metrics and monitoring

  • Fix gauges when creating a new rpc server (7a4e35240f)
  • Add an overloaded check according to the JVM pause time (d064cbff66)
  • Add direct mem used metrics (fea89c61e1)
  • Initialize AuditLog writer in WebServer for proxy (1c24dc3425)
  • Add some metrics of threads and docs of worker CacheManager threads (e07e17d510)

Stressbench and microbench

  • Fix misuse of variable in computeMaxThroughput (5e544aa421)
  • Add posix api to StressClientIOBench (f6345919a1)
  • Fix clientIO stressbench throughput calculation (7422dfb209)
  • Make clientIO a multi-node test (94f3703c7f)
  • Support multiple files random and sequential read in StressWorkerBench (8e1a25df2c)
  • Implement Alluxio POSIX API master stressbench test (b4af5969ae)
  • Add microbenchmarks for multiple implementations of BlockStore (5828d56a82)

Deprecations

  • Clean up ignored table unit/integration tests and maven (8875c66ed4)
  • Remove Ufs Extension (9fb093c10f)
  • Remove conf & doc for tiered locality (11c1c7c5bf)
  • Remove Configuration and CLI of Alluxio table (c0571e72a7)

Miscellaneous

  • Add LANG to alluxio-env.sh (e842df2719)
  • Allow dot in chown username or group (9b04c8040f)
  • Support Long type config values (d61aee0992)
  • Ensure the HadoopFS default port is the same across all hadoop fs (25e4301a7b)
  • Bump up maven frontend plugin for m1 arm support (b69bf4df59)

Acknowledgements

Thanks to the community for their invaluable contributions to the Alluxio 2.9 release. It would not have been possible without your help! We would especially like to thank:
Bob Bai (bobbai00), Haoning Sun (Haoning-Sun), Jie Fu (DamonFool), Li Simian (LDawns), LiuJiahao0001, Shuai Wuyue (shuaiwuyue), XiChen (xichen01), Xinli Shang (shangxinli), XuanlinGuan, adol001, bigxiaochu, dangxiaodong (smdxdxd), Tianbao Ding (flaming-archer), Baolong Mao (maobaolong), Lei Qian (qian0817), Zhaoqun Deng (secfree), Yanbin Zhang (singer-bin), Xinyu Deng (voddle), Yangchen Ye (YangchenYe323), and Zhigang Huang (zerorclover)

Enjoy the new release and let us know what you think on our community Slack channel.