Alluxio 2.4.0 Release

We are excited to announce the release of Alluxio 2.4.0! This is the first release on the Alluxio 2.4.X line. It contains a variety of feature enhancements, bug fixes, and performance improvements over the existing Alluxio 2.3.X line.

Organizations leverage Alluxio at enormous scale, both in data size and number of files. With this release, users will be able to manage namespaces with billions of files while breaking away from dependencies on traditional Hadoop components. We have further bolstered support for cloud-native and container-based deployments with simplified system monitoring, DevOps tooling, and integrations for secure credential management.

Highlights

Embedded Journal Scalability

Alluxio 2.4 allows users to manage namespaces with billions of files with the embedded journal.

The embedded journal is a fully contained, distributed state machine which uses the Raft consensus algorithm. This gives users a fault-tolerant and performant medium for journal storage, independent of third-party storage systems and ZooKeeper. Users running only on object stores will be able to deploy Alluxio in high availability mode without incurring a large metadata performance penalty or relying on another distributed storage system. See the docs for more details.

Note: the embedded journal format in 2.4 is not compatible with the embedded journal format in earlier versions. Please read here regarding how to upgrade the journals.

ADLS and Apache Ozone Under Filesystem Integrations

Alluxio 2.4 comes with under filesystem implementations for ADLS v1 and Apache Ozone.

Azure Data Lake Storage is a hyperscale, Hadoop-compatible data lake on Microsoft Azure. Users can now connect Alluxio to ADLS with the ADLS v1 connector to enable compute to access data in ADLS. See the docs for more details.

Apache Ozone is a scalable, redundant, and distributed object store for Hadoop. Users can now connect Alluxio to Ozone with the Ozone connector to enable compute to access data in Ozone. See the docs for more details.

1-command Log Collection

Alluxio 2.4 improves the built-in log collection process so administrators can gather all relevant cluster logs in a single file with one command.

The standardized log collection command makes it much easier for administrators to submit logs for troubleshooting. See the docs for more details.

Cluster Performance Metrics

Alluxio 2.4 allows users to track key cluster performance metrics in the web interface (<AlluxioMaster>/metrics) and/or programmatically.

Alluxio 2.4 includes time series for aggregated cluster throughput and cumulative API calls served per mount point. Cluster throughput is essential for determining the effectiveness of the Alluxio cache. The number of API calls served is a strong metric for quantifying the latency and potential cost savings provided by Alluxio's namespace virtualization.
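As an illustration of working with such metrics programmatically, the sketch below derives a throughput series from samples of a cumulative byte counter. It is a minimal sketch under stated assumptions: the class ThroughputSketch, the Sample type, and the sampling scheme are hypothetical and are not part of the Alluxio API, and how the samples are obtained from the cluster is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of Alluxio. */
final class ThroughputSketch {
  /** One reading of a cumulative counter: when it was taken and the total so far. */
  static final class Sample {
    final long timeMs;
    final long totalBytes;

    Sample(long timeMs, long totalBytes) {
      this.timeMs = timeMs;
      this.totalBytes = totalBytes;
    }
  }

  /** Converts consecutive cumulative samples into bytes-per-second rates. */
  static List<Double> bytesPerSecond(List<Sample> samples) {
    List<Double> rates = new ArrayList<>();
    for (int i = 1; i < samples.size(); i++) {
      Sample prev = samples.get(i - 1);
      Sample cur = samples.get(i);
      long elapsedMs = cur.timeMs - prev.timeMs;
      if (elapsedMs <= 0) {
        continue; // skip duplicate or out-of-order samples
      }
      rates.add((cur.totalBytes - prev.totalBytes) * 1000.0 / elapsedMs);
    }
    return rates;
  }

  public static void main(String[] args) {
    // Made-up sample values, used only to exercise the helper.
    List<Sample> samples = Arrays.asList(
        new Sample(0L, 0L), new Sample(1000L, 1024L), new Sample(3000L, 5120L));
    System.out.println(bytesPerSecond(samples));
  }
}
```

Comparing a rate series like this for bytes served from Alluxio against one for bytes read from the under storage is one way to judge how effective the cache is.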

Java 11 Support

Alluxio 2.4 client and server artifacts support Java 11 in addition to Java 8 environments.

Java 11 can improve the performance and stability of Big Data applications including Alluxio, Presto, and Spark. Both applications relying on the Alluxio client library and the Alluxio servers can run in either a Java 8 or a Java 11 environment. See the docs for more details.
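A deployment script or application may want to confirm which Java runtime it is on before starting. The following is a minimal sketch of such a check, assuming the standard java.specification.version system property ("1.8" on Java 8, "11" on Java 11). The class JavaVersionSketch and its methods are hypothetical names for illustration and are not an Alluxio API.

```java
/** Hypothetical helper for illustration; not part of Alluxio. */
final class JavaVersionSketch {
  /** Maps a specification version string such as "1.8" or "11" to its major number. */
  static int majorVersion(String specVersion) {
    String[] parts = specVersion.split("\\.");
    // Java 8 and earlier report "1.x"; Java 9 and later report the major number directly.
    if (parts[0].equals("1") && parts.length > 1) {
      return Integer.parseInt(parts[1]);
    }
    return Integer.parseInt(parts[0]);
  }

  /** True for the two runtimes these release notes name as supported. */
  static boolean isSupported(int major) {
    return major == 8 || major == 11;
  }

  public static void main(String[] args) {
    int major = majorVersion(System.getProperty("java.specification.version"));
    System.out.println("Java " + major + (isSupported(major) ? " is" : " is not")
        + " one of the runtimes named in the 2.4 release notes");
  }
}
```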

JVM Pause Monitoring

Alluxio 2.4 enables users to better monitor cluster health by logging a warning when a long JVM pause is detected.

JVM pauses can cause unexpected application slowdowns or failures. A JVM pause can have several root causes, such as extensive garbage collection or instability of the underlying machine. Being able to monitor and track JVM pauses experienced by the Alluxio servers greatly helps system administrators in troubleshooting and maintaining cluster health.
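One common way to detect pauses from inside a JVM is to have a background thread sleep for a fixed interval and measure how much longer than requested the sleep actually took. The sketch below shows that technique under stated assumptions: the class name JvmPauseSketch, the 500 ms interval, and the 1000 ms warning threshold are illustrative choices, and this is not presented as Alluxio's actual implementation or configuration.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongConsumer;

/** Illustrative sleep-overshoot pause detector; not Alluxio's implementation. */
final class JvmPauseSketch {
  /** Returns how many milliseconds the sleep overshot its target, never negative. */
  static long extraDelayMs(long startNanos, long endNanos, long sleepMs) {
    long elapsedMs = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
    return Math.max(0L, elapsedMs - sleepMs);
  }

  /** True when the overshoot is large enough to warrant a warning. */
  static boolean shouldWarn(long extraMs, long warnThresholdMs) {
    return extraMs >= warnThresholdMs;
  }

  /** Starts a daemon thread that sleeps in a loop and reports long overshoots. */
  static Thread start(long sleepMs, long warnThresholdMs, LongConsumer onPause) {
    Thread thread = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        long start = System.nanoTime();
        try {
          Thread.sleep(sleepMs);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        long extra = extraDelayMs(start, System.nanoTime(), sleepMs);
        if (shouldWarn(extra, warnThresholdMs)) {
          onPause.accept(extra);
        }
      }
    }, "jvm-pause-sketch");
    thread.setDaemon(true);
    thread.start();
    return thread;
  }

  public static void main(String[] args) throws InterruptedException {
    // Assumed values: check every 500 ms, warn on overshoots of 1000 ms or more.
    Thread monitor = start(500L, 1000L, extra ->
        System.err.println("WARN: detected JVM pause of about " + extra + " ms"));
    Thread.sleep(2000L);
    monitor.interrupt();
  }
}
```

Because the monitoring thread is stalled along with everything else during a pause, the overshoot approximates the pause length regardless of whether garbage collection or the host machine caused it.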

Other Improvements

  • Stabilize Alluxio delegation backup functionality (5e21e8d) (d1dc8a3)
  • Improve error messages and reduce log spamminess (b8cca5b) (b53effe3) (7b6a3ca6) (0756e57) (5487eae) (3daff672) (ffcabeab) (0ffc29c3) (b666fdc)
  • Add job failure history rest API endpoint (dde8c28) (21e3d51)
  • Support HDFS versions 2.8, 2.9, and 3.2 by default for EMR and Dataproc (b0f62f05)
  • Change Alluxio default HDFS version from 2.7 to 3.3 (60148d9)
  • Add metadata operations saved metrics (00a4e7)
  • Support Java-related commands such as jmap and jstack in Alluxio Docker containers (af350de)
  • Support du -g command showing capacity information grouped by worker (1f00153)
  • Support dynamic users in Alluxio Docker FUSE container (56533bc)
  • Add Azure Data Lake Gen 1 UFS (384f944)
  • Add Alluxio Kubernetes operator (cc90e04)
  • Add S3 connection TTL (01b3d5d)
  • Improve HDFS, HMS, and environment validation tools (278b8f) (1f900b9)
  • Increase master serving timeout for slower machines (d3cedb0)
  • Support older Hive metastores in Alluxio structured data service (575141ba)
  • Support Glue column statistics in structured data service (f2cbaf7)
  • Add proxy support for Glue UDB of structured data service (da45e63f)
  • Improve credentials protection (a892118)
  • Simplify UfsStatus handling in loadMetadata (f6f265c)
  • Support customized environment variables for Helm chart (01d1a9ad)
  • Optimize UFS access on missing paths (ecbdbde)
  • Support customized mount point in Docker FUSE container (ce2bcb6)
  • Support RPC cancellation (0236618)
  • Improve worker streaming read/write tracking (0742ea0)

Bug Fixes

  • Fix and improve Alluxio Dataproc and EMR scripts (614824e4) (648090ce) (b346d89) (f7c2acfa) (91f9220)
  • Update sync status after processing the entire sync (e8b5f6d)
  • Fix cp command with wildcard and special characters in path (6ed5c14)
  • Automatically close the worker client when a file read finishes to avoid resource exhaustion (53e9cee)
  • Avoid Active Sync manager connecting to UFS when becoming secondary master (ded47c80)
  • Fix NullPointerException in worker tier promote task (763381a)
  • Fix AbstractWriteHandler abort (3ecf217)
  • Cancel in-progress checkpoints when the thread is shut down (d1b4893)
  • Pass options to DelegatingFileSystem (39556e9)
  • Fix Java opts passing in Docker and Kubernetes environments (7a7ebd5) (9be8860)
  • Fix comma-separated medium types parsing in Helm chart (16daf4b3)
  • Fix block location iteration with RocksDB (0aff9389)
  • Fix Object UFS edge case (fcf0bbcd) (fec5421)
  • Resolve null values in structured data service with Glue (35e8a9)

Acknowledgment

We would like to thank all community members and organizations for their contributions to Alluxio 2.4.0. The release would not have been possible without your support!

In particular, we would like to thank Yang Che from Alibaba for tremendous contributions to improving the Alluxio/Kubernetes integration, Chao Wang and Mickey Zhang from Microsoft for the ADLS integration, Baoloing Mao from Tencent for the Ozone integration, and Yili Luo from Nanjing University for various FUSE-related improvements.
Enjoy the new release! We look forward to hearing your feedback on the community Slack channel.