presto Archives

Cybersecurity and fraud detection at ING Bank using Presto & Alluxio on S3

Alluxio Global Online Meetup * August 1, 2019

In this event, ING Bank, a leading financial services company, shares how it uses open source technologies such as Presto and Alluxio with S3.

Getting Started with the Alluxio-Presto Sandbox

July 11, 2019 By Zac Blanco

The Alluxio-Presto sandbox is a Docker application featuring installations of MySQL, Hadoop, Hive, Presto, and Alluxio. The sandbox lets you dive into an interactive environment where you can explore Alluxio, run queries with Presto, and see the performance benefits of using Alluxio in a big data software stack.

Which kind of EC2 instance is recommended for running Alluxio with applications like Presto/Spark? Does it make a big difference to have EBS disks with IOPS?

Presto and Spark are CPU-bound, so they require CPU-intensive instances. They also need memory, however, so R4/R5 instances are what most users end up using for their Presto/Spark workloads. The memory is split between Presto/Spark and Alluxio, and typically we see about 60% going to compute, 30% … Continued

Recap: Presto Summit SF 2019

July 1, 2019 By Amelia Wong

Alluxio is a proud sponsor and exhibitor at the Presto Summit in San Francisco.
What’s Presto Summit? It’s the leading Presto conference co-organized by our partner Starburst Data and the Presto Software Foundation.

“Zero-Copy” Hybrid Bursting with no App Changes

June 28, 2019

This whitepaper details how to leverage any public cloud (AWS, Google Cloud Platform, or Microsoft Azure) to scale analytics workloads directly on on-prem data without copying and synchronizing the data into the cloud. We will show an example of what it might look like to run on-demand Starburst Presto, Spark, and Hive with Alluxio in the public cloud using on-prem HDFS.

The paper also includes a real-world case study of a leading hedge fund based in New York City, which deployed large clusters of Google Compute Engine VMs with Spark and Alluxio using on-prem HDFS as the underlying storage tier.

Tags: apache hive, apache spark, aws, case study, hybrid cloud, presto

Alluxio for Presto Datasheet

June 27, 2019

This datasheet introduces the Presto + Alluxio Solution. Alluxio enables caching for Presto as well as hybrid deployments.

Tags: presto, presto caching

Building Fast SQL Analytics with Presto, Alluxio, and S3

Alluxio Community Office Hour * July 30, 2019

Learn how to set up Presto with Alluxio such that Presto jobs can seamlessly read from and write to S3.
Compare the performance of Presto on S3 with that of Presto and Alluxio on S3.

Starburst Presto and Alluxio announce strategic OEM partnership

June 19, 2019 By Dipti Borkar

Announcing the OEM partnership between Alluxio and Starburst Data, the company behind Presto, the fastest-growing SQL query engine in a disaggregated world.

Hadoop overload: Reduce large performance variance in HDFS namenode

Some users experience serious performance issues with HDFS namenode (v2.7) response time. During peak traffic in particular, an HDFS namenode can become overloaded, and some DFS operations (like listing a directory) can take a long time, which affects query response time for Presto and other Hadoop applications. To solve for challenges in high latency … Continued

Tags: presto