Alluxio Community Newsletter - May 2023

Highlights

Webinar | From Idle to Optimal: Maximize GPU Utilization for Model Training | June 26, 11:00am PT

When training models on ultra-large datasets, one of the biggest challenges is low GPU utilization due to inefficient I/O and data access. In this webinar, Tarik and Beinan will discuss strategies for cost-effective management of ultra-large datasets for AI and analytics.

What you will learn:

The challenges of I/O stalls leading to low GPU utilization for model training
High-performance, high-throughput data access (I/O) strategies
The benefits of using an on-demand data access layer over your storage
How Uber addresses managing ultra-large datasets using high-density storage and caching

Register Now

Millions Saved Annually: Unleashing the Power of Alluxio + HDFS at Uber

Uber’s HDFS team recently published a blog post detailing our joint project to optimize the performance of HDFS DataNodes. The project used the Alluxio SDK cache to manage SSD storage on each DataNode, resulting in improved performance and a better return on investment. Although the SSD cache occupies only 0.6% of the total disk space, it handles 60% of the overall client traffic.

Read More

May On-demand Webinar | Distributed Caching for Generative AI

In this Alluxio-hosted webinar, Shouwei presented the design and implementation of a distributed caching system that addresses the I/O challenges of LLM training and inference. He explored the unique requirements of data access patterns and offered best practices for optimizing the data pipeline through distributed caching in the cloud. Watch the recording for a deeper understanding of how to build scalable, efficient, and robust data infrastructure for LLM training and inference.

Read More

GOOD READS AND TUTORIALS

New eBook! The Presto Optimization Handbook

Get the best practices that have helped industry giants like Meta, Uber, and Walmart improve query performance by 3-10x. You will learn:

How the Presto query engine runs queries under the hood
Identifying bottlenecks that impact query performance in the query lifecycle
Refining Presto for optimal query performance
Seven best practices for maximizing Presto query efficiency, including configuration settings, session properties, and SQL statements
Presto optimizations at Uber scale and Fortune 1 scale

If you’re using Trino (formerly PrestoSQL), check out The Trino Optimization Handbook here.

Read More

Short Video Series | Getting Started with Alluxio on Kubernetes

We have a new series coming out: Getting Started with Alluxio on Kubernetes! In this three-part series, learn about the architecture, deployment, and best practices of Alluxio on Kubernetes with Shawn Sun, Software Engineer at Alluxio.

Watch Part I

Subscribe Now

2023 Intellyx Digital Innovator Award

We’re thrilled to share that Alluxio has been honored with the 2023 Intellyx Digital Innovator Award! 🏆 This recognition highlights our position as one of the most disruptive, interesting, and enterprise-relevant vendors in the digital transformation landscape.

Read More

Past Events On-demand

Tech Talk On-demand | Open Source Summit North America, Vancouver

Our community team recently spoke at Open Source Summit North America in Vancouver, the premier open source developer and community contributor event. Check out their sessions!

Jasmine Wang (Head of Community & Developer Relations) – “Lessons Learned in Building an Interdependent Open Source Team – Team Design, Strategy, Metrics”
Lu Qiu (Machine Learning Engineer & PMC Maintainer) – “How to Eliminate the I/O Bottleneck and Continuously Feed the GPU While Training in the Cloud”

Upcoming Events

Trino Fest | June 14-15

We’re excited to be speaking at Trino Fest: Lakehouse Summer Camp. Trino Fest is a free, virtual, two-day event on June 14 and 15. Beinan Wang and Hope Wang are leading a session on Trino optimization with distributed caching on data lakes. You will learn about innovations in caching for Trino, including affinity scheduling and node-local caching, illustrated with real-world examples.

Data + AI Summit | June 27-29

Alluxio is a sponsor of Data + AI Summit. Visit us at booth #29, where Alluxio will be offering free architecture assessments and cloud savings estimates. Book your free assessment above and a Solutions Engineer will reach out to you to confirm a time slot.

In the talk session “Data Caching Strategies for Analytics and AI”, Beinan Wang, Tech Lead at Alluxio, and Chunxu Tang, Research Scientist at Alluxio, will share their observations on data access patterns in the analytical SQL and AI training domains based on their practical experience with large-scale systems. Register now with code SPCUSrodttusd to get caching recommendations for different use cases.

Got a tech question for the Alluxio Community? Chat with us on Slack!

WHITEPAPERS

“Zero-Copy” Hybrid Bursting with no App Changes

Alluxio Architecture and Data Flow

Evaluating Apache Spark and Alluxio for Data Analytics: Benchmarking Recommendations and Results

Spark with Alluxio Overview – Pair Spark with Alluxio to Modernize Your Data Platform

Presto with Alluxio Overview – Architecture Evolution for Interactive Queries

Accelerating Machine Learning / Deep Learning in the Cloud: Architecture and Benchmark

Be our stargazers on GitHub ⭐

If you like our product, please give it a star on GitHub, and share the goodness!

HOT JOBS

We currently have 30+ opportunities across the globe! Learn more about our job openings on the Customer Success, Sales, Product, and Engineering teams. Interested, or know someone to refer? Check out the full list of opportunities and apply here.

Senior Account Support Engineer (San Mateo, California)

Senior Solutions Engineer (San Mateo, California)

Senior Account Executive (San Mateo, California)

Software Engineering Manager (San Mateo, California)