Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Proven at scale in a variety of use cases at Comcast, GrubHub, FINRA, LinkedIn, Lyft, Netflix, Slack, and Zalando, Presto has seen unprecedented growth in popularity over the last few years in both on-premises and cloud deployments over object stores, HDFS, NoSQL, and RDBMS data stores.
Over the last few years, organizations have worked towards the separation of storage and compute for a number of benefits in the areas of cost, data duplication, and data latency. The cloud resolves most of these issues but comes at the expense of needing a way to query data on remote storage. Alluxio and Presto are a powerful combination to address the compute problem, which is part of the strategy used by Simbiose Ventures to create a product called StorageQuery, a platform for querying files in cloud storage with SQL.
Are you using SQL engines, such as Presto, to query an existing Hive data warehouse and running into challenges such as an overloaded Hive Metastore with slow and unpredictable access, unoptimized data formats and layouts (for example, too many small files), or a lack of influence over the existing Hive system and other Hive applications?
This article goes through a simple example to illustrate how Structured Data Management, available in the latest Alluxio 2.2.0 release, helps SQL and structured data workloads.
This article introduces Structured Data Management available in the latest Alluxio 2.2.0 release, a new effort to provide further benefits to SQL and structured data workloads using Alluxio.
In this office hour, we will go over an introduction and motivation of Alluxio Structured Data Management, an overview of the different services in Alluxio 2.1, and a demo using Alluxio Structured Data Management with Presto.
This talk describes a stack of open-source projects for serving high-concurrency, low-latency SQL queries using Presto with Alluxio on big data in the cloud. By deploying Alluxio as a data orchestration layer to access cloud object storage (e.g., AWS S3), this architecture greatly enhances the data locality of Presto with distributed and cross-query caching, thus avoiding reading the same data repeatedly from cloud storage.
The Presto Summit continues to bring together the best developers, engineers, data scientists, and executives from the Presto community to share how some of the largest and most innovative companies are using this technology to power their analytics platforms.