ALLUXIO FOR DATA ENGINEERS

How to solve common data engineering problems in the cloud

PROBLEM

  • Presto/Spark/Hive queries slow on S3 object stores
  • Inconsistent performance or missed SLAs
  • Metadata operations slow on S3

ALLUXIO SOLUTION

  • The only intelligent, multi-tiered cache available for your app frameworks
  • Consistent high performance
  • 1.5x – 10x performance increases on object stores
  • HDFS-like performance in the cloud
  • Leverages RAM, SSD, and HDD

PROBLEM

  • Cloud egress charges too high

ALLUXIO SOLUTION

  • Because only the relevant, active data sets are moved, egress charges can be reduced by up to 80%

Common Use Cases: How Data Engineers Are Using Alluxio Today

Hybrid Cloud Analytics

Simplify Hadoop for the hybrid cloud by making on-prem HDFS accessible to any compute in the cloud.

Cloud Analytics Caching

Get in-memory data access for Spark and Presto on AWS S3, AWS EMRFS, Google Cloud Platform, or Microsoft Azure.

Watch the on-demand tech talk
Accelerate and Scale Big Data Analytics and ML Pipelines with Disaggregated Compute and Storage

Read the whitepaper
Hybrid Cloud Analytics: Scaling analytics workloads on on-prem to public clouds with Alluxio