Interactive analytics with Presto and Alluxio

Presto with Alluxio is a truly separated compute and storage stack, enabling interactive big data analytics on any file or object store.


Alluxio provides a distributed caching layer that sits between Presto and its data sources to improve I/O performance. By caching data closer to the Presto workers, Alluxio reduces data access latency and relieves pressure on the underlying storage system.
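The effect of a cache placed close to the compute layer can be illustrated with a minimal read-through sketch. This is not Alluxio's or Presto's implementation; the `ReadThroughCache` class, the `fetch_remote` function, and the file names are hypothetical and exist only to show how repeated reads stop reaching the remote store once the data is cached.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Toy LRU read-through cache standing in for a caching layer near compute.

    Illustrative only: names and behavior are assumptions, not Alluxio's API.
    """

    def __init__(self, fetch_remote, capacity=2):
        self.fetch_remote = fetch_remote  # hypothetical callable that reads remote storage
        self.capacity = capacity          # maximum number of cached entries
        self.entries = OrderedDict()      # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self.entries:
            # Cache hit: serve locally and mark the entry as most recently used.
            self.entries.move_to_end(path)
            self.hits += 1
            return self.entries[path]
        # Cache miss: go to remote storage once, then keep a local copy.
        self.misses += 1
        data = self.fetch_remote(path)
        self.entries[path] = data
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return data


remote_reads = []  # records every request that actually reaches remote storage


def fetch_remote(path):
    remote_reads.append(path)
    return f"contents of {path}"


cache = ReadThroughCache(fetch_remote, capacity=2)
for path in ["a.parquet", "b.parquet", "a.parquet", "a.parquet", "b.parquet"]:
    cache.read(path)

print(cache.hits, cache.misses, len(remote_reads))  # prints "3 2 2"
```

In this sketch, five reads result in only two requests to the remote store, which is the mechanism behind both the lower latency and the reduced load on the underlying storage system.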

Caching solutions in Presto

Type of Cache

Cache Location

Storage Media

When to Use

Built-in Presto cache

Metastore Cache

Slow planning time
Slow Hive Metastore
Large tables with hundreds of partitions

List File Cache

Overloaded HDFS NameNode
Overloaded object storage such as S3

Alluxio SDK Cache

Slow or unstable external storage

Built-in Presto cache

Alluxio SDK Cache

Cross-region, multi-cloud, hybrid-cloud
Data sharing with other compute engines

Why Presto + Alluxio

Enhanced Query Performance

Co-located with Presto, Alluxio provides a multi-tiered caching layer that reduces I/O latency, enabling consistently high performance with jobs that run up to 10x faster.

Reduced Infrastructure Costs with Zero Data Copies

Alluxio makes frequently accessed data local to Presto, so there are no copies to manage when reading from remote storage systems such as S3, resulting in lower egress and API request charges.

Unified Data Access

Alluxio connects to a variety of storage systems and clouds so Presto can query data stored anywhere, accelerating queries when reading remote data across datacenters, regions, and clouds.