This talk describes a stack of open-source projects to serve high-concurrency, low-latency SQL queries using Presto with Alluxio on big data in the … Continued
Slides from our latest talks
Google Cloud Dataproc is a widely used, fully managed Spark and Hadoop service to run big data analytics and compute workloads in the cloud. … Continued
If you’re a MapR user, you might have concerns with your existing data stack. Whether it’s the complexity of Hadoop, financial instability and no … Continued
Many Spark users may not be aware of the differences in memory utilization between caching data directly in memory in the Spark JVM versus storing … Continued
Alluxio, an open source data orchestration technology, helps speed up Dataproc workloads by providing a distributed caching layer in the Dataproc cluster. … Continued
JD.com is China’s largest online retailer. It uses Alluxio to provide support for ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and … Continued
The DBS team was tasked to solve their compute capacity problem. They wanted to provide faster insights and analyze data for a range of … Continued
In this talk, HY discusses the key challenges and trends impacting data engineering and explores the concept of Data Orchestration. … Continued