Two Sigma Open Source Meetup
Presto is an open-source distributed SQL engine widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Alluxio is an open-source distributed file system that provides a unified data access layer at in-memory speed. The combination of Presto and Alluxio is becoming more popular at many companies, such as JD and NetEase, to leverage Alluxio … Continued
This presentation focuses on how Alluxio enables the big data analytics stack to be cloud-native. Today’s cloud object storage systems provide more cost-effective and scalable storage, but they also come with different semantics and performance implications than HDFS. Applications like Spark or Presto do not benefit from node-level locality or cross-job caching when retrieving data from cloud object storage. Deploying Alluxio in front of cloud storage solves these problems because data is retrieved once and cached in Alluxio instead of being fetched repeatedly from the underlying cloud or object storage.
Author: Shuang Li (Shuang is a big data engineer at NetEase Games, developing and maintaining OLAP-related solutions in the data warehouse. He works closely on Apache Kylin and Presto as well as HBase. Shuang graduated from South China University of Technology.)
Background
As one of the world’s leading online game companies, NetEase Games is … Continued
Today’s enterprises are decoupling storage and compute as they migrate to the cloud, and that’s where Alluxio comes in. Alluxio is the data orchestration layer between storage and compute, bringing your data closer to your Presto workloads for better performance on top of S3.
See how Presto + Alluxio gives you the performance your compute needs, whether it runs in the cloud or on premises.
On September 13th, we held our first New York City Alluxio Meetup! Work-Bench generously hosted the meetup in Manhattan. This was the first US Alluxio meetup outside of the Bay Area, so it was extremely exciting to meet Alluxio enthusiasts on the East Coast! The meetup focused on users of Alluxio with … Continued
JD.com is China’s largest online retailer and its biggest overall retailer, as well as the country’s biggest internet company by revenue. Currently, JD.com’s BDP platform runs more than 400,000 jobs (15+ PB) daily, on a system with more than 15,000 cluster nodes and a total capacity of 210 PB.
Alluxio has run in JD.com’s production environment on 100 nodes for six months. See how JD.com uses Alluxio to provide support for ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and Alluxio as a pluggable optimization component.
The following is a guest post from our friends at Starburst Data. With more companies using Presto for reporting and analytics, we here at Starburst are seeing more use cases around operational reporting. These types of queries need to return in under a second and usually involve a small subset of the dataset. Presto was designed from the … Continued