Starburst Data and Alluxio
Starburst Data, the Presto company, and Alluxio bring together two open source technologies in a bundled solution that gives users exceptional performance and multi-cloud capabilities for interactive analytic workloads.
Starburst + Alluxio = better together
Starburst Presto with Alluxio is a truly separated compute and storage stack, enabling interactive big data analytics on any file or object store.
Alluxio provides a multi-tiered data caching system for Presto, enabling consistent high performance with jobs that run up to 10x faster
Alluxio keeps important data local to Presto, so there are no copies to manage, which lowers costs
Alluxio connects to a variety of storage systems and clouds so Presto can query data stored anywhere
“Presto has become the SQL engine of choice for enterprises across industries to seamlessly query data wherever it’s stored. Working with Alluxio, we are now able to offer customers a caching solution to further benefit from the power of disaggregated analytics that Starburst Presto provides.”
– Justin Borgman, co-founder and CEO of Starburst
Starburst with Alluxio deployment approaches
In the cloud
Hybrid cloud / cross datacenter
Integrate on-prem data stores like HDFS with Alluxio and Starburst Presto to get high performance in your hybrid cloud environment. Burst Presto into the cloud on demand, when you need it.
Plus, access data wherever it’s located – across regions, sites, or datacenters, in HDFS or object stores – for high performance analytics.
Hybrid cloud analytics
Cross datacenter analytics
Want help getting started on Starburst Presto and Alluxio? Looking for feedback on your project’s architectural design?
Leading online retailer JD.com built an ad hoc SQL query engine that supports 400,000 jobs (15+ PB) daily on a system with more than 15,000 cluster nodes and a total capacity of 210 PB. Two challenges they faced were Presto workers reading remotely from HDFS datanodes and high variance in query performance. With Alluxio and Presto together, JD.com has seen a 10x performance improvement, along with enhanced syncing for better consistency between Alluxio/Presto and HDFS.
See the slides >
Online gaming company NetEase, the operator of popular titles like “World of Warcraft” and “Hearthstone”, needed a data platform to handle 30 TB of raw data collected daily. ETL jobs process that raw data in ODS tables, which makes the data volume even larger. To support high performance ad hoc queries, they turned to Presto and Alluxio to speed up query response times on their massive datasets.
See the benchmarks >