Announcing Alluxio Data Orchestration Hub

November 3, 2020

Adit Madan

We’re pleased to announce the general availability of Alluxio Data Orchestration Hub, your single pane of glass to orchestrate data for analytics and AI. The data ecosystem is complex with the separation of storage and compute across data centers and cloud providers. With this release we’ve made great strides towards simplifying data access and management across multiple environments.

Data Orchestration Hub, or the Hub, is a management console that makes it easy to manage an analytics cluster and connect it with multiple data sources to unify data lakes. The service provides an easy-to-use, unified management view for configuration and monitoring, along with wizard-based deployment workflows.

Connect Your Data Sources: Connect Alluxio to data storage and catalogs across multiple clouds, single cloud or on-premises using guided wizards.
Monitor Your Alluxio Cluster: View the state of the processes on your Alluxio cluster from a dashboard.
Manage Configuration: Set and distribute configuration for a cluster.

Alluxio Data Orchestration Hub is available immediately for all Alluxio deployment scenarios with compute engines like Presto, Spark and TensorFlow. The Hub is ready to use out of the box with Amazon EMR and Google Dataproc, and it can also be used on other platforms. Please see the documentation for more information on trying out the Hub.

When to Use

Connecting to data sources across regions

The Hub provides self-guided wizards to allow users to connect to data sources and catalogs in the same or remote data centers. A user is guided through the required configuration steps along with validation of the connection.

These wizards apply to multiple scenarios, including hybrid cloud, cross-data center, single cloud and private data center deployments. Manage your compute clusters with Alluxio using these easy-to-use wizards.

*Connect Alluxio to all your data sources across multiple clouds, single cloud or on-premises using self-guided wizards.*
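To make the "guided configuration plus validation" idea concrete, here is a minimal sketch of the kind of completeness check a connection wizard might run before it tries to reach a data source. It is not the Hub's implementation: the `DataSource` class, the `validate` function, and the option names in `REQUIRED_OPTIONS` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical mapping of URI scheme -> option names a wizard would ask for.
# These names are illustrative assumptions, not the Hub's actual schema.
REQUIRED_OPTIONS = {
    "s3": {"access_key", "secret_key"},
    "gs": {"credentials_path"},
    "hdfs": set(),
}


@dataclass
class DataSource:
    """A data source as a user might describe it in a wizard form."""
    uri: str
    options: Dict[str, str] = field(default_factory=dict)


def validate(source: DataSource) -> List[str]:
    """Return a list of problems; an empty list means the form looks complete."""
    parsed = urlparse(source.uri)
    if parsed.scheme not in REQUIRED_OPTIONS:
        return [f"unsupported scheme: {parsed.scheme!r}"]

    problems = []
    if not parsed.netloc:
        # For s3:// and gs:// this is the bucket, for hdfs:// the namenode.
        problems.append("missing bucket or host in URI")
    missing = REQUIRED_OPTIONS[parsed.scheme] - set(source.options)
    for name in sorted(missing):
        problems.append(f"missing option: {name}")
    return problems


if __name__ == "__main__":
    incomplete = DataSource("s3://my-bucket/data", {"access_key": "example"})
    for problem in validate(incomplete):
        print(problem)
```

A check like this only confirms that the form is filled in. Validating the connection itself, as the wizards do, would additionally require contacting the storage system with the supplied credentials.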

Managing an Alluxio cluster

The Hub can be used to view a dashboard to monitor the state of processes on the cluster, as well as update configuration and restart processes. This is especially useful for cloud deployments without access to SSH for configuration and process management.

*Monitor the status of an Alluxio cluster anywhere. You can start or stop cluster components from an intuitive UI.*

What’s Next

To start using Alluxio Data Orchestration Hub, simply launch Alluxio-enabled clusters in your on-premises or cloud deployment. Further changes to the cluster, and monitoring of it, can now be managed using the Hub:

Process Management: Monitor the status of each process in the Alluxio cluster, and start or stop processes.
Connect Data Storage: Connect Alluxio to your data sources, such as HDFS / S3 / GCS, across a hybrid cloud, single cloud or on-premises.
Connect Data Catalog: Configure structured data catalogs for OLAP engines like Presto on Alluxio. Connect to existing catalog definitions to prevent re-definition of table metadata.
Advanced Configuration: Customize your Alluxio cluster with advanced options for setting and distributing configuration from the central console.

If you would like more information on Data Orchestration Hub and the supported toolset, please read the release notes.

Have questions? Come join the Community Slack Channel.

Read the Alluxio 2.4 release product blog to learn more about the expanded features and capabilities to advance analytics and AI in the cloud.
