On-Demand Videos
video
AI/ML Infra Meetup | Open Source Michelangelo: Uber's Predictive to Generative end to end ML Lifecycle management platform

In this talk, Eric Wang, Senior Staff Software Engineer, introduces Uber’s open-source generative end-to-end ML lifecycle management platform: Michelangelo.
video
AI/ML Infra Meetup | Unlock the Future of Generative AI: TorchTitan's Latest Breakthroughs

In this talk, Jiani Wang, Software Engineer on Meta's PyTorch team, gives an overview of TorchTitan and covers its latest advancements.
video
AI/ML Infra Meetup | Bringing Data to GPUs Anywhere + Get Low-Latency on Object Store with Alluxio

In this talk, Bin Fan, VP of Technology at Alluxio, explores how to enable efficient data access across distributed GPU infrastructure, achieving low-latency performance for feature stores and RAG workloads.
video
Building an Open Data Platform with Apache Iceberg
ALLUXIO DAY VIII 2021
December 14, 2021
This talk will introduce Apache Iceberg and its place in a modern and open data platform. It will cover the motivation for creating Iceberg at Netflix, as well as the data architecture that Iceberg makes possible.
Large Scale Analytics Acceleration
video
Iceberg + Alluxio for Fast Data Analytics
ALLUXIO DAY VIII 2021
December 14, 2021
This talk provides an overview of the read-after-write data consistency mechanism in Alluxio. An Alluxio core maintainer and a Presto committer share their recent work on integrating Alluxio with Apache Iceberg, as well as recent work from the Presto community on the Iceberg connector.
Large Scale Analytics Acceleration
video
Best Practice in Accelerating Data Applications with Spark+Alluxio
ALLUXIO DAY VI 2021
October 12, 2021
Apache Spark and Alluxio were both born in UC Berkeley’s AMPLab as research projects. As an open source data orchestration platform, Alluxio integrates seamlessly with different data sources and accelerates access to them, improving the efficiency and fault tolerance of Spark’s big data workloads.
Alluxio has been deployed at large scale, managing petabytes of data in the production environments of companies such as Microsoft, TikTok, Tencent, Singapore Development Bank, and China Unicom.
This talk shares the designs and use cases of integrated Alluxio and Spark solutions, as well as best practices and “what not to do” when designing and implementing Alluxio distributed systems.
video
Apache Hudi: The Path Forward
ALLUXIO DAY VI 2021
October 12, 2021
In this talk, we will provide a complete picture of the Hudi platform components, along with their unique design choices. We will then deep dive into two important areas of active development going forward: table metadata management and caching. Specifically, we will discuss gaps in the data lake ecosystem around these aspects and present strawman design approaches for how Hudi aims to close them.
video
Enabling Presto Caching at Uber with Alluxio
ALLUXIO DAY VI 2021
October 12, 2021
This talk discusses the opportunities and problems when Uber meets Alluxio. Zhongting from Uber will provide an overview of Uber traffic, cloud, distribution, invalidation, and consistent hashing. Beinan from Alluxio will provide a deep dive into metadata and monitoring metrics.
Large Scale Analytics Acceleration
video
Improve Presto Architectural Decisions with Shadow Cache
ALLUXIO DAY VI 2021
October 12, 2021
This talk describes the design of shadow cache, a lightweight component that tracks the working set size of the Alluxio cache. Shadow cache dynamically keeps track of the working set size over a past time window and is implemented with a series of Bloom filters. We’ve deployed the shadow cache in Facebook Presto and leveraged the results to understand system bottlenecks and help with routing design decisions.
Large Scale Analytics Acceleration
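The windowed Bloom filter idea in the shadow cache abstract above can be illustrated with a minimal Python sketch. This is not the Alluxio or Presto implementation. The class names (`BloomFilter`, `ShadowCache`), the parameters, and the choice to count distinct keys rather than bytes are all assumptions made for illustration. Each filter covers one slice of the window, rotating drops the oldest slice, and the working set is estimated from the number of set bits in the union of the live filters using the standard Bloom filter cardinality estimate.

```python
import hashlib
import math
from collections import deque


class BloomFilter:
    """Hypothetical minimal Bloom filter backed by a Python int bit array."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Double hashing: derive num_hashes bit positions from one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos


class ShadowCache:
    """Illustrative sliding-window estimator of distinct keys accessed."""

    def __init__(self, num_slices=4, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # A deque with maxlen silently drops the oldest slice on append.
        self.slices = deque([BloomFilter(num_bits, num_hashes)],
                            maxlen=num_slices)

    def record_access(self, key):
        # Accesses always go into the newest time slice.
        self.slices[-1].add(key)

    def rotate(self):
        # Call once per time slice (for example, from a timer).
        self.slices.append(BloomFilter(self.num_bits, self.num_hashes))

    def working_set_size(self):
        # Union the live filters, then estimate cardinality from set bits:
        # n ~= -(m / k) * ln(1 - X / m)
        union = 0
        for bloom in self.slices:
            union |= bloom.bits
        set_bits = bin(union).count("1")
        if set_bits == 0:
            return 0.0
        if set_bits >= self.num_bits:
            return float("inf")  # filter saturated; estimate is unbounded
        m, k = self.num_bits, self.num_hashes
        return -(m / k) * math.log(1 - set_bits / m)
```

In this sketch, repeated accesses to the same key do not inflate the estimate, and keys that have not been touched for `num_slices` rotations age out of the window. A real deployment would likely weight keys by object size to estimate the working set in bytes.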
video
Accelerate Cloud Training with Alluxio
Alluxio’s capabilities as a data orchestration framework have encouraged users to onboard more of their data-driven applications to an Alluxio-powered data access layer. Driven by strong interest from our open-source community, the Alluxio core team set out to redesign how users leverage data orchestration through the POSIX interface, making it more efficient and transparent. This effort has made significant progress in collaboration with engineers from Microsoft, Alibaba, and Tencent. In particular, we have introduced a new JNI-based FUSE implementation to support POSIX data access, created a more efficient way to integrate Alluxio with the FUSE service, and made many improvements to related data operations, such as a more efficient distributedLoad and optimizations for listing or calculating directories with massive numbers of files, which are common in model training. We will also share our engineering lessons and the roadmap for supporting machine learning applications in future releases.
Model Training Acceleration
video
Speeding up TensorFlow and PyTorch with Alluxio
ALLUXIO WEBINAR
Driven by strong interest from our open source community, the Alluxio core engineering team redesigned how users leverage data orchestration through the POSIX interface, making it more efficient and transparent. This enables much better performance for ML workloads where data is accessed via the POSIX interface.
In this 20-minute community session, you’ll hear from Lu Qiu, one of Alluxio’s lead engineers on the POSIX implementation project.
In this session, you’ll learn:
- How Alluxio’s new JNI-based FUSE implementation supports more efficient POSIX data access
- How improvements to multiple data operations, including distributedLoad and optimizations for listing or calculating directories with massive numbers of files, improve performance in model training
- How these latest enhancements improve performance on TensorFlow and PyTorch training workloads, even with GPU-based training and compute
Model Training Acceleration
Model Distribution
Hybrid Multi-Cloud
video
Speed up large scale ML/DL offline inference job with Alluxio at Microsoft [Chinese]
ALLUXIO DAY V 2021
August 27, 2021
Model Training Acceleration
Model Distribution
video
Alluxio + K8s in a Cloud Native AI environment at BossZP [Chinese]
ALLUXIO DAY V 2021
August 27, 2021
Model Training Acceleration
Model Distribution
video
ML and Query Acceleration at MOMO with Alluxio [Chinese]
ALLUXIO DAY V 2021
August 27, 2021
Large Scale Analytics Acceleration
Model Training Acceleration
video
Speeding up Machine Learning in the Cloud with Alluxio on Kubernetes [Chinese]
ALLUXIO DAY V 2021
August 27, 2021
Model Training Acceleration
Model Distribution