Alluxio meetups, conferences, events and more
The latest Alluxio meetups, webinars, conferences and more

Alluxio Product School | Integrate Alluxio With Your Trino and Spark Workloads, Without Redefining Hive Tables
Past Events:
How to Develop and Operate Cloud-Native Data Platforms and Applications
This talk gives an overview of two projects at Electronic Arts (EA) that address the mismatch through data orchestration. One project automatically generates configurations for all components in a large monitoring system, which reduces the daily average number of alerts from ~1000 to ~20. The other project introduces Alluxio for caching and for unifying the address space across ETL and analytics workloads, which substantially simplifies the architecture, improves performance, and reduces operational overhead.
CNCF Member Webinar: Improving Data Locality for Analytics Jobs on Kubernetes Using Alluxio
In the on-prem days, one key performance optimization for Apache Hadoop or Apache Spark workloads was to run tasks on nodes with local HDFS data. However, while adoption of the cloud and Kubernetes makes scaling compute workloads exceptionally easy, HDFS is often not an option. Accessing data efficiently from cloud-native storage services like AWS S3, or even from on-premises HDFS, becomes harder as data locality is lost.
Accelerating analytics in the cloud with the Starburst Presto + Alluxio stack
Join us for this tech talk, where we’ll introduce the stack of Starburst Presto, Alluxio, and a cloud object store for building a highly concurrent, low-latency analytics platform.
Open source data orchestration for a disaggregated analytics stack
The rise of compute-intensive workloads and the adoption of the cloud have driven organizations to adopt a decoupled architecture for modern workloads, one in which compute scales independently from storage. While this enables elastic scaling, it introduces new problems: how do you co-locate data with compute, how do you unify data across multiple remote clouds, how do you keep storage and I/O service costs down, and many more.
Speeding Up I/O for Machine Learning
This talk will show how Alluxio can greatly simplify the data preparation phase when working with remote and possibly multiple data sources. We will share the lessons and benchmarks from Bill Zhao, an engineering lead at Apple, gathered while building a machine learning platform using TensorFlow, NFS, DC/OS, and Alluxio.
Alluxio Structured Data Management: Optimizing Structured Data with Alluxio
In this office hour, we will cover an introduction to Alluxio Structured Data Management and the motivation behind it, an overview of the different services in Alluxio 2.1, and a demo using Alluxio Structured Data Management with Presto.
Improving Data Locality for Spark Jobs on Kubernetes Using Alluxio
One important performance optimization in Apache Spark is to schedule tasks on nodes where HDFS data nodes serve the task input data locally. However, more users are running Apache Spark natively on Kubernetes, where HDFS is not an option. This office hour describes the concepts and data flow of running the Spark/Alluxio stack in Kubernetes with enhanced data locality, even when the storage service is external or remote.
Accelerating data access with In-Memory datasets
Enabling Presto in the Cloud with Alluxio
The Presto Summit continues to bring together the best developers, engineers, data scientists, and executives from the Presto community to share how some of the largest and most innovative companies are using this technology to power their analytics platforms.