Alluxio meetups, conferences, events and more

The latest Alluxio meetups, webinars, conferences and more

Events

Past Events:

Open source data orchestration for a disaggregated analytics stack

Bangalore Presto Meetup *

The rise of compute-intensive workloads and the adoption of the cloud have driven organizations to adopt a decoupled architecture for modern workloads, one in which compute scales independently from storage. While this enables elastic scaling, it introduces new problems: how do you co-locate data with compute, how do you unify data across multiple remote clouds, and how do you keep storage and I/O service costs down, among others.

Speeding Up I/O for Machine Learning

Alluxio Global Online Meetup *

This talk will show how Alluxio can greatly simplify the data preparation phase when working with remote, and possibly multiple, data sources. We will share lessons and benchmarks from Bill Zhao, an engineering lead at Apple, gathered while building a machine learning platform using TensorFlow, NFS, DC/OS, and Alluxio.

Improving Data Locality for Spark Jobs on Kubernetes Using Alluxio

Alluxio Community Office Hour *

One important performance optimization in Apache Spark is to schedule tasks on nodes where HDFS DataNodes serve the task's input data locally. However, more users are running Apache Spark natively on Kubernetes, where HDFS is not an option. This office hour describes the concepts and data flow of running a Spark and Alluxio stack on Kubernetes with enhanced data locality, even when the storage service is external or remote.

Enabling Presto in the Cloud with Alluxio

Presto Summit New York *

The Presto Summit continues to bring together the best developers, engineers, data scientists, and executives from the Presto community to share how some of the largest and most innovative companies are using this technology to power their analytics platforms.

Ultra-fast SQL Analytics using PAS (Presto on Alluxio Stack)

Presto Meetup *

Presto is widely used for data science, business analytics, and operations. Presto’s SQL is a main driver of this adoption, as it is ANSI-compliant, easy to ramp up on, and rich in functionality. Given the versatility and flexibility of this software, there is also strong demand for interfaces to other critical data domains such as real-time dashboards, stream processing, and large-scale batch computations. We will explore some interesting systems and prototypes that bring Presto to these new domains.

Improving Memory Utilization of Spark Jobs Using Alluxio

Alluxio Community Office Hour *

This office hour shares a demo and compares two approaches: caching data directly in memory inside the Spark JVM versus storing data off-heap in an in-memory storage service like Alluxio.