Running Presto with Alluxio on Amazon EMR

Community Online Office Hour

Many organizations use EMR to run big data analytics in the public cloud. However, reading data from and writing data to S3 directly can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud; in this use case it caches S3 data, providing high, predictable performance and reducing network traffic.

AWS S3 + Alluxio + Presto = ❤️ The Ryte Use Case

Alluxio Open Source Online Meetup

In this presentation, Ryte’s chapter lead engineer, Danny Linden, shows why and how Ryte used Alluxio to solve some challenging technical issues, improve the speed, and reduce the costs of its AWS EMR Hadoop and Presto backend.

Effective Analytical Pipelines on AWS Using EMR, Alluxio, and S3

This article describes the lessons I learned from a previous project, which moved a data pipeline that originally ran on a Hadoop cluster managed by my team to AWS, using EMR and S3. The goal was to leverage the elasticity of EMR and offload the operational work, as well as to make S3 a data lake where different teams can easily share data across projects.

Tech Talk: Accelerating analytics with EMR on your S3 data lake

EMR has become a widely used service for running big data analytics in the public cloud. But slow and inconsistent EMR performance when reading from S3 data lakes creates challenges for organizations.

Alluxio is a data orchestration layer for the cloud that increases the performance of analytics workloads running on AWS EMR with S3 as the storage layer.

Join us for this webinar where we will show you how to set up EMR Spark and Hive with Alluxio so jobs can seamlessly read from and write to your S3 data lake. You’ll see the performance gains with Alluxio in your EMR/S3 stack.
