AWS S3 + Alluxio + Presto = ❤️ The Ryte Use Case

Alluxio Open Source Online Meetup

In this presentation, Ryte’s chapter lead engineer, Danny Linden, shows why and how we used Alluxio to solve some challenging technical issues, improve the speed, and reduce the costs of our AWS EMR Hadoop and Presto backend.

Online Meetup: Cybersecurity and fraud detection at ING Bank using Presto & Alluxio on S3

In this online presentation, we present how ING is leveraging Presto (interactive query), Alluxio (data orchestration & acceleration), S3 (massive storage), and DC/OS (container orchestration) to build and operate our modern Security Analytics & Machine Learning platform. We will share the challenges we encountered and how we solved them.

Tech Talk: Accelerating Spark with Kubernetes

Kubernetes is widely used across enterprises to orchestrate computation. And while Kubernetes helps improve flexibility and portability for computation in public/hybrid cloud environments across infrastructure providers, running data-intensive workloads can be challenging.

Efficiently moving data closer to frameworks such as Spark or Presto is difficult: co-locating data with these frameworks and accessing data from multiple or remote clouds are both hard to do. That’s where Alluxio, an open source data orchestration platform, can help.

Alluxio enables data locality with your Spark and Presto workloads for faster performance and better data accessibility in Kubernetes. It also provides portability across storage providers.

In this on-demand tech talk, we’ll give a quick overview of Alluxio and the use cases it powers for Spark/Presto in Kubernetes. We’ll also show you how to set up Alluxio and Spark/Presto to run in Kubernetes.

AWS + Alluxio: Data Orchestration for Analytics & AI in the cloud

Many organizations have taken advantage of the scalability and cost savings of cloud computing and cloud storage services to meet their data-powered workload demands. In addition, as data is increasingly siloed and lives everywhere, there’s a need for data orchestration to bring the needed data closer to compute. With Alluxio’s data orchestration platform, you can bring back data locality for your compute with in-memory and tiered data access.

Getting Started with the Alluxio-Presto Sandbox

The Alluxio-Presto sandbox is a Docker application featuring installations of MySQL, Hadoop, Hive, Presto, and Alluxio. The sandbox lets you easily dive into an interactive environment where you can explore Alluxio, run queries with Presto, and see the performance benefits of using Alluxio in a big data software stack.

Which kind of EC2 instance is recommended for running Alluxio with applications like Presto or Spark? Does it make a big difference to have EBS disks with IOPS?

Presto and Spark are CPU-bound, so they require CPU-intensive instances. On the other hand, they also need memory, so the R4/R5 instances are what most users end up using for their Presto/Spark workloads. The memory itself will get distributed across Presto/Spark and Alluxio, and typically we see about 60% going to compute, 30% … Continued