What’s Alluxio?

Open source data orchestration for analytics and machine learning in any cloud

COMMON PROBLEMS WE SOLVE

Inconsistent performance on S3

S3 performance for analytic workloads is inconsistent and data egress is expensive.

Cloud caching solution

Limited compute capacity on-prem

Making HDFS or object store data accessible to any compute in any cloud is complex.

Zero-copy burst solution

Slow on-prem object store

Object storage performance, particularly for metadata operations, is unpredictable.

Faster workloads on object store solution

Alluxio can help

Locality

Accelerate analytics on the public cloud

Accessibility

Zero-copy workload bursting to the cloud

Elasticity

Speed up your cloud/on-prem object store

Alluxio in action

Alluxio at Development Bank of Singapore

Featured use cases at DBS Bank include “zero-copy” bursting for on-prem compute and object store analytics acceleration.

Meet Alluxio

Alluxio enables data orchestration for compute in any cloud. It unifies data silos on-premises and across clouds, providing the data locality, accessibility, and elasticity needed to simplify data orchestration for today’s big data and AI/ML workloads.

Scalable to over a billion files in a single cluster, Alluxio’s distributed architecture is built on three core components:

  • Alluxio Master, which manages file and object metadata
  • Alluxio Worker, which manages the node’s local storage space and its file and object blocks, and interfaces with the underlying storage systems
  • Alluxio Client, which allows analytics and AI/ML applications to interface with Alluxio
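The division of labor among the three components can be illustrated with a toy model. This is a minimal sketch, not Alluxio's implementation or API: the class names, the round-robin placement, and the tiny block size are all assumptions chosen for illustration. It shows only the idea that the master holds metadata, workers hold blocks, and the client talks to both.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # bytes per block; deliberately tiny (illustrative assumption)


@dataclass
class Worker:
    """Toy worker: stores block data in its local space."""
    blocks: dict = field(default_factory=dict)  # block_id -> bytes

    def put(self, block_id, data):
        self.blocks[block_id] = data

    def get(self, block_id):
        return self.blocks[block_id]


@dataclass
class Master:
    """Toy master: keeps metadata only (which blocks form a file, and where)."""
    files: dict = field(default_factory=dict)  # path -> [(block_id, worker_index)]
    next_block_id: int = 0

    def allocate(self, path, num_blocks, num_workers):
        # Round-robin placement is an assumption made for this sketch.
        locations = []
        for i in range(num_blocks):
            locations.append((self.next_block_id, i % num_workers))
            self.next_block_id += 1
        self.files[path] = locations
        return locations

    def locate(self, path):
        return self.files[path]


class Client:
    """Toy client: asks the master for metadata, then moves data via workers."""

    def __init__(self, master, workers):
        self.master = master
        self.workers = workers

    def write(self, path, data: bytes):
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        locations = self.master.allocate(path, len(chunks), len(self.workers))
        for (block_id, worker_index), chunk in zip(locations, chunks):
            self.workers[worker_index].put(block_id, chunk)

    def read(self, path) -> bytes:
        return b"".join(
            self.workers[worker_index].get(block_id)
            for block_id, worker_index in self.master.locate(path)
        )
```

In this sketch, no file data ever passes through the master, which is the property that lets the metadata service and the data path scale independently.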

Key Technical Features

Compute-focused

Support for hyperscale workloads
Supports a billion files and thousands of workers and clients, all with high availability.

Flexible APIs
Integrates with compute frameworks like Spark, Presto, TensorFlow, Hive, and more out of the box using the HDFS, S3, Java, RESTful, or POSIX-based APIs.

Intelligent data caching and tiering
Automatically utilizes near-compute storage media for optimal data placement based on data topology and workload.

Storage-focused

Built-in data policies
Provides highly customizable data policies for persistence, cross-storage data migration, and distributed load.

Plug and play under stores
Integrates with under store systems like HDFS, S3, Azure Blob Storage, Google Cloud Storage, and more through a range of interfaces.

Transparent unified namespace for file system and object stores
Mounts multiple storage systems into a single consolidated namespace for both read and write workloads.
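The idea of a unified namespace can be sketched as a mount table that maps paths in one logical tree onto different storage systems. This is an illustrative toy, not Alluxio's API: the `MountTable` class, the longest-prefix rule, and the example host and bucket names are all hypothetical.

```python
class MountTable:
    """Toy mount table: maps logical paths to under-store URIs."""

    def __init__(self):
        self._mounts = {}  # logical mount point -> under-store URI prefix

    def mount(self, mount_point, ufs_uri):
        # Normalize so that "/s3/" and "/s3" are the same mount point.
        self._mounts[mount_point.rstrip("/") or "/"] = ufs_uri.rstrip("/")

    def resolve(self, path):
        # Pick the longest mount point that covers the path.
        best = None
        for mount_point in self._mounts:
            prefix = mount_point if mount_point.endswith("/") else mount_point + "/"
            if path == mount_point or path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError(f"no mount covers {path}")
        remainder = path[len(best):].lstrip("/")
        base = self._mounts[best]
        return f"{base}/{remainder}" if remainder else base


# Hypothetical storage locations, for illustration only.
table = MountTable()
table.mount("/", "hdfs://namenode:8020/data")
table.mount("/s3", "s3://example-bucket/lake")

print(table.resolve("/logs/app.log"))
print(table.resolve("/s3/events/a.parquet"))
```

An application sees a single tree rooted at `/`, while each subtree is backed by whichever storage system is mounted there.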

Enterprise-ready

Security
Provides data protection on the wire and in the cloud with built-in auditing, role-based access control, LDAP, Active Directory, and encrypted communications.

Monitoring and management
Provides a user-friendly web interface and command line tools, allowing users to monitor and manage their cluster.

Enterprise high availability with tiered locality
Includes adaptive replication across regions and zones to maximize performance and availability.

Alluxio benchmarks

Alluxio at Ryte

Alluxio helped Ryte decouple S3 latency spikes from user requests without the need for additional hardware. With Presto + Alluxio, Ryte saw an average 4x improvement in performance.

Get Started with Alluxio

Alluxio offers a free Community Edition and an Enterprise Edition.