What’s Alluxio?

Open source data orchestration for analytics and machine learning in any cloud

Data orchestration challenges for today’s data engineer

Analytics or ML in the cloud too slow?

S3 performance for analytics and ML workloads is inconsistent and data egress is expensive.

Hybrid cloud for data too hard to implement?

Making HDFS or object store data accessible to any compute in any cloud is complex.

Want to use object stores for big data workloads?

Object storage performance, particularly for metadata operations, is unpredictable.

Multi-cloud data access too complex?

Orchestrating data from multiple public clouds for big data workloads is complex and expensive.

Alluxio can help

LOCALITY

Accelerate big data frameworks on the public cloud

Get in-memory data access for Spark and Presto for any cloud – AWS, Google Cloud Platform, or Microsoft Azure.

ACCESSIBLE

Run big data workloads in hybrid cloud environments

No matter where it sits – on-prem, in the cloud, or in HDFS – your data is accessible in many different ways.

ELASTIC

Bring big data and AI workloads to any object store

Accelerate your Spark, Presto, and TensorFlow workloads for any object store, in any cloud.

Meet Alluxio

Alluxio enables data orchestration for compute in any cloud. It unifies data silos on-premises and across clouds, providing the data locality, accessibility, and elasticity needed to simplify data orchestration for today’s big data and AI/ML workloads.

Scalable to over a billion files in a single cluster, Alluxio’s distributed architecture is built on three core components:

  • Alluxio Master, which manages file and object metadata
  • Alluxio Worker, which manages the node’s local storage space, stores file and object blocks, and interfaces with the underlying storage systems
  • Alluxio Client, which allows analytics and AI/ML applications to interface with Alluxio
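The division of labor among the three components can be illustrated with a toy model. This is a minimal sketch, not Alluxio's implementation or API: the class names `Master`, `Worker`, and `Client`, their methods, and the in-memory dictionaries are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Worker:
    """Hypothetical worker: holds data blocks in its local space."""
    name: str
    blocks: Dict[int, bytes] = field(default_factory=dict)

    def read_block(self, block_id: int) -> bytes:
        return self.blocks[block_id]


@dataclass
class Master:
    """Hypothetical master: holds metadata only, never the data itself."""
    # path -> ordered list of (block_id, worker_name)
    files: Dict[str, List[Tuple[int, str]]] = field(default_factory=dict)

    def block_locations(self, path: str) -> List[Tuple[int, str]]:
        return self.files[path]


class Client:
    """Hypothetical client: asks the master where blocks live,
    then reads the bytes from the workers directly."""

    def __init__(self, master: Master, workers: Dict[str, Worker]) -> None:
        self.master = master
        self.workers = workers

    def read(self, path: str) -> bytes:
        locations = self.master.block_locations(path)
        return b"".join(
            self.workers[worker_name].read_block(block_id)
            for block_id, worker_name in locations
        )


# Example: one file split into two blocks on two different workers.
workers = {
    "worker-1": Worker("worker-1", {1: b"hello, "}),
    "worker-2": Worker("worker-2", {2: b"world"}),
}
master = Master({"/data/greeting.txt": [(1, "worker-1"), (2, "worker-2")]})
client = Client(master, workers)
print(client.read("/data/greeting.txt"))
```

The point of the sketch is the separation of concerns: metadata lookups go to the master, while the data path runs between client and workers.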

Use Cases and Deployments

Single cloud caching

Get in-memory data access for Spark and Presto on AWS S3, Google Cloud Platform, or Microsoft Azure.

Hadoop simplified for the hybrid cloud

Simplify Hadoop for the hybrid cloud by making on-prem HDFS accessible to any compute in the cloud.

Accelerated workloads for object stores

Accelerate your Spark, Presto, and TensorFlow workloads for object stores on-premises or in the cloud.

Analytics and AI/ML workloads accelerated on any object store

Accelerate your Spark, Presto, and TensorFlow workloads for both on-prem and cloud object stores.

Data orchestration across multiple clouds

Logically unify your geo-distributed data from different clusters, data centers, regions, and countries.

Key Technical Features

Compute-focused

Support for hyperscale workloads
Supports a billion files and thousands of workers and clients, all with high availability.

Flexible APIs
Integrates with compute frameworks such as Spark, Presto, TensorFlow, Hive, and more out of the box through HDFS, S3, Java, RESTful, or POSIX-based APIs.

Intelligent data caching and tiering
Automatically utilizes near-compute storage media for optimal data placement based on data topology and workload.

Storage-focused

Built-in data policies
Provides highly customizable data policies for persistence, cross-storage data migration, and distributed load.

Plug and play under stores
Integrates with under store systems such as HDFS, S3, Azure Blob Storage, Google Cloud Storage, and more through a range of interfaces.

Transparent unified namespace for file system and object stores
Mounts multiple storage systems into a single consolidated namespace for both read and write workloads.
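The idea of a single consolidated namespace can be sketched as a mount table that maps namespace paths onto under-store locations. This is an illustrative sketch only, assuming longest-prefix matching; the `MountTable` class, its methods, and the example URIs are hypothetical and do not represent Alluxio's actual API.

```python
from typing import Dict


class MountTable:
    """Toy mount table: maps namespace prefixes to under-store URIs."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, namespace_path: str, under_store_uri: str) -> None:
        # Normalize trailing slashes; keep "/" itself as the root mount.
        key = namespace_path.rstrip("/") or "/"
        self._mounts[key] = under_store_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        """Translate a namespace path to an under-store URI.
        The longest matching mount prefix wins."""
        best = None
        for prefix in self._mounts:
            is_match = path == prefix or path.startswith(prefix.rstrip("/") + "/")
            if is_match and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no mount point covers {path!r}")
        remainder = path[len(best):].lstrip("/")
        base = self._mounts[best]
        return f"{base}/{remainder}" if remainder else base


# Example: HDFS at the root, an S3 bucket mounted under /s3 (made-up URIs).
table = MountTable()
table.mount("/", "hdfs://namenode:8020/data")
table.mount("/s3", "s3://example-bucket/logs")

print(table.resolve("/warehouse/orders"))
print(table.resolve("/s3/2024/events.parquet"))
```

With a table like this, an application addresses every file by one path scheme, and the mapping to HDFS or an object store stays an administrative detail.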

Enterprise-ready

Security
Provides data protection on the wire and in the cloud with built-in auditing, role-based access control, LDAP, Active Directory, and encrypted communications.

Monitoring and management
Provides a user-friendly web interface and command line tools, allowing users to monitor and manage their cluster.

Enterprise high availability with tiered locality
Includes adaptive replication across regions and zones to maximize performance and availability.

Get Started with Alluxio

Alluxio offers a free Community Edition and an Enterprise Edition.