alluxio engineering Archives | Page 3 of 9

Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration Between Presto & Alluxio

Alluxio Global Online Meetup, May 7, 2020

For many latency-sensitive SQL workloads, Presto is often bottlenecked by retrieving remote data. In this talk, Rohit Jain and James Sun from Facebook and Bin Fan from Alluxio will introduce their teams’ collaboration on adding a local, SSD-backed Alluxio cache inside Presto workers at Facebook to speed up queries that do not meet their latency requirements.

Optimizing Query Performance by Decoupling Presto and Hive Data Warehouse

March 24, 2020

Ideally, Presto would access data independently of how the data was originally stored or managed. Alluxio, as a data orchestration layer, provides this physical data independence, letting Presto interact with the data more efficiently. In addition to caching for I/O acceleration, Alluxio also provides a catalog service that abstracts the metadata in the Hive Metastore, and transformations that expose the data in a compute-optimized way. In this talk, we describe some of the challenges of using Presto with Hive and introduce Alluxio data orchestration as a way to address them.

Tags: alluxio engineering, catalog service, data orchestration, hive, office hour, performance, presto, structured data services

Tags: alluxio engineering, aws s3, compute storage separation, hdfs, hive, office hour, spark

Tag: alluxio engineering

Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration Between Presto & Alluxio

Optimizing Query Performance by Decoupling Presto and Hive Data Warehouse

Serving Structured Data in Alluxio: Example

Serving Structured Data in Alluxio: Concept

What’s new in Alluxio 2.2

Getting Started with EMR Hive on Alluxio in 10 Minutes

Community Office Hour: Accelerating Hive with Alluxio on S3