On-Demand Videos
video
AI/ML Infra Meetup | Open Source Michelangelo: Uber's Predictive to Generative end to end ML Lifecycle management platform

In this talk, Eric Wang, Senior Staff Software Engineer, introduces Michelangelo, Uber’s open-source, end-to-end ML lifecycle management platform spanning predictive to generative workloads.
video
AI/ML Infra Meetup | Unlock the Future of Generative AI: TorchTitan's Latest Breakthroughs

In this talk, Jiani Wang, Software Engineer on Meta's PyTorch team, gives an overview of TorchTitan and its latest advancements.
video
AI/ML Infra Meetup | Bringing Data to GPUs Anywhere + Get Low-Latency on Object Store with Alluxio

In this talk, Bin Fan, VP of Technology at Alluxio, explores how to enable efficient data access across distributed GPU infrastructure, achieving low-latency performance for feature stores and RAG workloads.
video
Modernizing Your Data Platform for Analytics and AI in the Hybrid Cloud Era
ALLUXIO WEBINAR
With data lakes expanding from on-prem to the cloud and new object data stores seeing increasing use, data platform teams are challenged to provide consistent, high-throughput access to distributed data sources for analytics and AI/ML applications. In today’s hybrid cloud and multi-cloud era, data-intensive applications such as Presto, Spark, Hive, and TensorFlow suffer from slower response times and greater complexity as data and compute grow further apart.
Join Alluxio’s distributed systems experts as they explore today’s data access challenges and open source data orchestration solutions for modernizing your data platform.
In this tech talk, you’ll learn:
- How data access and throughput challenges are hindering large-scale analytics and AI/ML applications
- How a data orchestration layer can simplify distributed data access and improve performance
- Real-world production use cases and example journeys for architecting a modern data platform
Large Scale Analytics Acceleration
Model Training Acceleration
Cloud Cost Savings
Hybrid Multi-Cloud
Data Platform Modernization
video
Alluxio for Machine Learning Workloads
ALLUXIO DAY IV 2021
June 24, 2021
Driven by strong interest from our open-source community, the Alluxio core team has started redesigning an efficient and transparent way for users to leverage data orchestration through the POSIX interface. We have introduced a new JNI-based FUSE implementation to support POSIX data access, along with many improvements to related data operations, such as a more efficient distributedLoad and optimizations for listing or calculating directories with a massive number of files, which are common in model training.
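To illustrate what POSIX access means for application code, here is a minimal sketch, not taken from the talk. It assumes the data is exposed as an ordinary directory (for example, through a FUSE mount), so standard file APIs are enough. The helper names `summarize_directory`, `read_sample`, and `make_demo_tree` are hypothetical, and a temporary directory stands in for a real mount point so the example runs anywhere.

```python
import os
import tempfile


def summarize_directory(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    file_count = 0
    total_bytes = 0
    stack = [root]
    while stack:
        current = stack.pop()
        # os.scandir yields entries lazily, which helps with very large directories.
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    file_count += 1
                    total_bytes += entry.stat(follow_symlinks=False).st_size
    return file_count, total_bytes


def read_sample(path):
    """Read one training sample with ordinary file I/O."""
    with open(path, "rb") as f:
        return f.read()


def make_demo_tree():
    """Create a small directory tree that stands in for a FUSE mount point."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "train"))
    for i in range(3):
        with open(os.path.join(root, "train", f"sample_{i}.bin"), "wb") as f:
            f.write(b"x" * 10)
    return root


demo_root = make_demo_tree()
demo_summary = summarize_directory(demo_root)
print(demo_summary)
```

In a real deployment, `demo_root` would be replaced by the mount path; the training code itself would not need an Alluxio-specific client.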
Model Training Acceleration
Model Distribution
video
Accelerating analytics workloads with Alluxio data orchestration and Intel® Optane™ persistent memory
ALLUXIO DAY IV 2021
June 24, 2021
Today’s analytics workloads demand real-time access to expansive amounts of data. This session demonstrates how Alluxio’s data orchestration platform, running on Intel Optane persistent memory, accelerates access to this data and uncovers its valuable business insights faster.
Large Scale Analytics Acceleration
video
RaptorX: Building a 10X Faster Presto with hierarchical cache
ALLUXIO DAY IV 2021
June 24, 2021
RaptorX is the internal name of a project that aims to improve query latency significantly beyond what vanilla Presto is capable of. In this session, we introduce the hierarchical cache work, including the Alluxio data cache, the fragment result cache, and more. The cache is the key building block of RaptorX: with its support, we are able to boost query performance by 10X. This new architecture can beat performance-oriented connectors like Raptor, with the added benefit of continuing to work with disaggregated storage.
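The general idea of a hierarchical cache can be sketched as follows. This is an illustrative read-through cache with LRU tiers, not RaptorX's actual implementation; the class names `LRUTier` and `HierarchicalCache` and the `fetch_remote` callback are hypothetical, and cached values are assumed never to be `None`.

```python
from collections import OrderedDict


class LRUTier:
    """A small least-recently-used cache tier with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


class HierarchicalCache:
    """Check tiers from fastest to slowest, then fall back to remote storage."""

    def __init__(self, tiers, fetch_remote):
        self.tiers = tiers                # ordered fastest first
        self.fetch_remote = fetch_remote  # called only when every tier misses
        self.remote_reads = 0

    def read(self, key):
        for level, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                # Promote the entry into all faster tiers for later reads.
                for faster in self.tiers[:level]:
                    faster.put(key, value)
                return value
        # Miss in every tier: read from remote storage and populate all tiers.
        self.remote_reads += 1
        value = self.fetch_remote(key)
        for tier in self.tiers:
            tier.put(key, value)
        return value


cache = HierarchicalCache([LRUTier(1), LRUTier(2)], lambda key: f"data:{key}")
cache.read("a")            # remote read
cache.read("b")            # remote read; "a" is evicted from the first tier
result = cache.read("a")   # served from the second tier, no remote read
```

Each remote read avoided is a round trip to disaggregated storage saved, which is the effect the talk attributes to caching.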
Large Scale Analytics Acceleration
video
Improving Presto performance with Alluxio at TikTok
ALLUXIO DAY IV 2021
June 24, 2021
Integrating Alluxio with popular query engines like Presto on existing Hive data is still not straightforward. Solutions proposed by the community, such as Alluxio Catalog Service or Transparent URI, put unnecessary pressure on Alluxio masters when querying files that should not be cached. This talk covers TikTok’s approach to adopting Alluxio as the cache layer without introducing additional services.
Large Scale Analytics Acceleration
video
Setting Up a Monitoring System for Alluxio with Prometheus and Grafana in 10 Minutes
ALLUXIO DAY IV 2021
June 24, 2021
Alluxio has an excellent metrics system and supports various kinds of metrics sinks, e.g. an embedded JSON sink and a Prometheus sink. Users and developers can easily create a custom sink for Alluxio by implementing the Sink interface.
Alluxio also provides a metrics page in its web UI that displays key information such as bytes throughput and storage space. However, if you want more flexible and universal monitoring, additional work is required.
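As a rough illustration of what a Prometheus-style sink exposes, the sketch below parses a few lines of the Prometheus text exposition format into a dictionary. The metric names in `SAMPLE` are hypothetical and are not real Alluxio metrics, and the parser is deliberately simplified: it skips comment lines and does not handle trailing timestamps.

```python
SAMPLE = """\
# HELP demo_bytes_read_total Hypothetical counter, for illustration only
# TYPE demo_bytes_read_total counter
demo_bytes_read_total{node="worker-1"} 1024
demo_capacity_used_bytes 2.5e+09
"""


def parse_metrics(text):
    """Parse Prometheus text-format samples into {series: float}.

    Simplified: comment lines are ignored and timestamps are not supported.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        series, _, value = line.rpartition(" ")
        samples[series.strip()] = float(value)
    return samples


metrics = parse_metrics(SAMPLE)
for series, value in metrics.items():
    print(series, value)
```

In practice, a Prometheus server scrapes this format directly and Grafana queries Prometheus, so a hand-written parser like this is only useful for quick inspection or tests.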
video
Building a high-performance data lake analytics engine at Alibaba Cloud with Presto+Alluxio
ALLUXIO DAY III 2021
April 27, 2021
Data Lake Analytics (DLA) is a large-scale serverless data federation service on Alibaba Cloud. One of its serverless analytics engines is based on Presto. The DLA Presto engine supports a variety of data sources and is widely used in different application scenarios in the cloud. In this session, we will talk about the system architecture of the DLA Presto engine, as well as its challenges and solutions. In particular, we will introduce the use of the Alluxio local cache to solve performance issues on OSS data sources caused by access delay and OSS bandwidth limitations. We will discuss the principles behind the Alluxio local cache and some improvements we have made.
Large Scale Analytics Acceleration
video
Speed up large-scale ML/DL offline inference job with Alluxio
ALLUXIO DAY III 2021
April 27, 2021
Increasingly powerful compute accelerators and large training datasets have made the storage layer a potential bottleneck in deep learning training and inference.
An offline inference job usually consumes and produces tens of terabytes of data while running for more than 10 hours.
A large-scale job usually causes high IO pressure, increases the job failure rate, and brings many challenges for system stability.
We adopted Alluxio as an intermediate storage tier between the compute tier and cloud storage to optimize the IO throughput of deep learning inference jobs.
For the production workload, performance improved by 18%, and we seldom see job failures caused by storage issues.
Model Training Acceleration
Model Distribution
video
Alluxio-FUSE as a data access layer for Dask
ALLUXIO DAY III 2021
April 27, 2021
At Aspect Analytics we intend to use Dask, a distributed computation library for Python, to deal with MSI data stored as large tensors. In this talk we explore using Alluxio and Alluxio FUSE as a data consolidation and caching layer for some of our bioinformatics workflows.
Model Training Acceleration
video
Alluxio Data Orchestration for Machine Learning
ALLUXIO DAY III 2021
April 27, 2021
Alluxio’s capabilities as a data orchestration framework have encouraged users to onboard more of their data-driven applications to an Alluxio-powered data access layer. Driven by strong interest from our open-source community, the Alluxio core team has started redesigning an efficient and transparent way for users to leverage data orchestration through the POSIX interface. This effort has made significant progress in collaboration with engineers from Microsoft, Alibaba, and Tencent. In particular, we have introduced a new JNI-based FUSE implementation to support POSIX data access and created a more efficient way to integrate Alluxio with the FUSE service, along with many improvements to related data operations, such as a more efficient distributedLoad and optimizations for listing or calculating directories with a massive number of files, which are common in model training. We will also share our engineering lessons and our roadmap for supporting machine learning applications in future releases.
Model Training Acceleration
Model Distribution
Hybrid Multi-Cloud
Cloud Cost Savings
video
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
ALLUXIO DAY III 2021
April 27, 2021
RAPIDS is a set of open source libraries enabling GPU aware scheduling and memory representation for analytics and AI. Spark 3.0 uses RAPIDS for GPU computing to accelerate various jobs including SQL and DataFrame. With compute acceleration from massive parallelism on GPUs, there is a need for accelerating data access and this is what Alluxio enables for compute in any cloud. In this talk, you will learn how to use Alluxio and Spark with RAPIDS Accelerator on NVIDIA GPUs without any application changes.
Model Training Acceleration
Data Platform Modernization
video
Introducing what’s new in Alluxio 2.5
ALLUXIO COMMUNITY OFFICE HOUR
We are thrilled to announce the release of Alluxio 2.5!
Alluxio 2.5 focuses on improving interface support to broaden the set of data-driven applications that can benefit from data orchestration. The POSIX and S3 client interfaces have greatly improved in performance and functionality as a result of widespread usage and demand from AI/ML workloads and system administration needs. Alluxio is rapidly evolving to meet the needs of enterprises that are deploying it as a key component of their AI/ML stacks.
At the same time, Alluxio continues to integrate with the latest cloud and cluster orchestration technologies. In 2.5, Alluxio has new connectors for Google Cloud Storage and Azure Data Lake Storage Gen 2 as well as better operability functionality for Kubernetes environments.
In this Office Hour, we will go over:
- JNI Based POSIX API
- S3 Northbound API
- ADLS Gen 2 Connector
- GCSv2 Connector
Hybrid Multi-Cloud
Large Scale Analytics Acceleration