This talk will go over a generic example of a stateful coordination service moving from ZooKeeper to Raft.
Alluxio meetups, conferences, events and more
In this office hour, we will give an introduction to Alluxio Structured Data Management and the motivation behind it, an overview of the different services in Alluxio 2.1, and a demo using Alluxio Structured Data Management with Presto.
One important performance optimization in Apache Spark is to schedule tasks on the nodes whose local HDFS data nodes serve the task input data. However, more users are running Apache Spark natively on Kubernetes, where HDFS is not an option. This office hour describes the concept and dataflow of running the Spark/Alluxio stack in Kubernetes with enhanced data locality, even when the storage service is external or remote.
The Presto Summit continues to bring together the best developers, engineers, data scientists, and executives from the Presto community to share how some of the largest and most innovative companies are using this technology to power their analytics platforms.
Presto is widely used for data science, business analytics, and operations. Presto’s SQL is a main driver for this, as it is ANSI-compliant, easy to ramp up on, and rich in functionality. Given the versatility and flexibility of this software, there is also a huge demand to develop interfaces for other critical data domains like real-time dashboards, stream processing, and large-scale batch computations. We will explore some interesting systems and prototypes to bring Presto to these new domains.
This office hour shares a demo and compares two approaches: caching data directly in memory inside the Spark JVM versus storing data off-heap in an in-memory storage service like Alluxio.
Learn how to set up Google Cloud Dataproc with Alluxio so jobs can seamlessly read from and write to Cloud Storage. See how to run Dataproc Spark against a remote HDFS cluster.
In this tech talk, we’ll discuss why DBS turned to Alluxio’s bursting approach to help solve on-prem compute capacity challenges.
Running Spark with Alluxio is a popular stack, particularly for hybrid environments. In this session, Dipti will briefly introduce Alluxio, share the top 10 tips for performance tuning for real-world workloads, and demo Alluxio with Spark.