This article was initially posted on InfoWorld. Understand the caching mechanisms for the popular distributed SQL engine and how to use them to improve query speed and efficiency. Presto is a popular, open source, distributed SQL engine that enables organizations to run interactive analytic queries on multiple data sources at a large scale. Caching is a typical optimization … Continued
Alluxio is commonly used with Presto and Hive to accelerate queries. Understanding how Presto+Hive+Alluxio work together and the flow from SQL query to low-level file system operations is key to tuning performance. This post will dive into the relationship between Presto, Hive, and Alluxio. We will walk you through how a SQL query executes in … Continued
Shopee is the leading e-commerce platform in Southeast Asia. In this presentation, Luo Li from Shopee will share their Data Infra team’s recent project on acceleration with Presto and storage servitization. He will share the details on how Shopee leverages Alluxio to accelerate Presto queries and provide standardized methods of accessing data through Alluxio-Fuse and Alluxio-S3.
In the Big Data era, calculating the resource usage of a SQL query is often computationally expensive. Can we estimate the resource usage of SQL queries more efficiently, without any computation in a SQL engine kernel? In this session, Chunxu and Beinan introduce how Twitter’s data platform leverages a machine learning-based approach in Presto and BigQuery to estimate query utilization with 90%+ accuracy.
This blog was originally published on the Presto blog: https://prestodb.io/blog/2022/01/28/avoid-data-silos-in-presto-in-meta Authors: Rongrong Zhong (Alluxio); James Sun and Ke Wang (Meta). Raptor is a Presto connector (presto-raptor) that is used to power some critical interactive query workloads in Meta (previously Facebook). Though referred to in the ICDE 2019 paper Presto: SQL on Everything, it remains somewhat mysterious to many Presto users … Continued
In the previous blog, we introduced Uber’s Presto use cases and how we collaborated to implement the Alluxio local cache to overcome various challenges in accelerating Presto queries. The second part discusses the improvements to the local cache metadata.
This article shares how Uber and Alluxio collaborated to design and implement the Presto local cache to reduce HDFS latency.
Shopee is the leading e-commerce platform in Southeast Asia. In this presentation, Tianbao Ding and Haoning Sun from Shopee will share their Data Infra team’s recent project on acceleration with Presto and storage servitization. They will share the details on how Shopee leverages Alluxio to accelerate Presto queries and provide a standardized method of accessing data through Alluxio-Fuse and Alluxio-S3.
In a collaboration between Meta (Facebook), Princeton University, and Alluxio, we have developed “Shadow Cache”, a lightweight Alluxio component that tracks the working set size and the infinite cache hit ratio (the hit ratio a cache of unlimited size would achieve). Shadow cache dynamically tracks the working set size over a past time window and is implemented with a series of bloom filters. Shadow cache is deployed in Meta (Facebook) Presto and is being used to understand system bottlenecks and to help with routing design decisions.
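To make the idea of tracking a working set with a series of bloom filters more concrete, here is a minimal Python sketch. It is not the Alluxio implementation: the class names (`BloomFilter`, `ShadowCacheSketch`), the parameters (`num_segments`, `num_bits`, `num_hashes`), and the explicit `rotate()` call are all assumptions made for illustration. Each filter covers one segment of the window, rotating drops the oldest segment, and the working set size is estimated from the number of set bits in the union of the filters using the standard bloom filter cardinality estimate.

```python
import hashlib
import math
from collections import deque


class BloomFilter:
    """Small bloom filter; a Python int serves as the bit array."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Double hashing: derive num_hashes positions from one SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class ShadowCacheSketch:
    """Hypothetical sketch: one bloom filter per segment of a sliding window."""

    def __init__(self, num_segments=4, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # maxlen makes append() silently discard the oldest filter.
        self.filters = deque(
            [BloomFilter(num_bits, num_hashes) for _ in range(num_segments)],
            maxlen=num_segments,
        )
        self.hits = 0
        self.accesses = 0

    def access(self, key):
        # A "hit" means the key was seen within the window, i.e. a cache
        # of unlimited size would still hold it (up to false positives).
        self.accesses += 1
        if any(key in f for f in self.filters):
            self.hits += 1
        self.filters[-1].add(key)

    def rotate(self):
        # Call once per segment interval to age out the oldest segment.
        self.filters.append(BloomFilter(self.num_bits, self.num_hashes))

    def working_set_size(self):
        # Estimate distinct keys from the set bits in the union of filters:
        # n ~= -(m / k) * ln(1 - X / m)
        merged = 0
        for f in self.filters:
            merged |= f.bits
        set_bits = bin(merged).count("1")
        if set_bits == 0:
            return 0.0
        if set_bits >= self.num_bits:
            return float("inf")
        ratio = set_bits / self.num_bits
        return -(self.num_bits / self.num_hashes) * math.log(1 - ratio)

    def hit_ratio(self):
        # Simplification: counted since creation, not per window.
        return self.hits / self.accesses if self.accesses else 0.0
```

In this sketch the caller would invoke `access(key)` on every read and `rotate()` on a timer. Memory use stays fixed at `num_segments * num_bits` bits regardless of how many keys are seen, which is the main reason to use bloom filters here instead of an exact set of keys.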