For many latency-sensitive SQL workloads, Presto is often bound by retrieving distant data. In this talk, Rohit Jain, James Sun from Facebook and Bin … Continued
On-Demand Videos
In this talk, we will show you how to leverage any public cloud (AWS, Google Cloud Platform, or Microsoft Azure) to scale analytics workloads … Continued
Today, many people run deep learning applications with training data from separate storage such as object storage or remote data centers. This presentation will … Continued
Alluxio (alluxio.io) is an open-source data orchestration system that provides a single namespace federating multiple external distributed storage systems. It is critical for Alluxio … Continued
Ideally, Presto would access data independently from how the data was originally stored or managed. Alluxio, as a data orchestration layer, provides the physical … Continued
Accessing data to run analytic workloads in Spark across data centers and/or clouds can be challenging. Additionally, network I/O can bottleneck Spark jobs that … Continued
Building distributed systems is no small feat. Software testing is just one of many critical practices that engineers who build these systems need to … Continued
Many organizations are leveraging EMR to run big data analytics on the public cloud. However, reading and writing data to S3 directly can result in … Continued
This talk will overview two projects at Electronic Arts (EA) that address the mismatch by data orchestration: One project automatically generates configurations for all … Continued