On-Demand Videos
video
AI/ML Infra Meetup | Open Source Michelangelo: Uber's Predictive to Generative end to end ML Lifecycle management platform

In this talk, Eric Wang, Senior Staff Software Engineer, introduces Uber’s open-source generative end-to-end ML lifecycle management platform: Michelangelo.
video
AI/ML Infra Meetup | Unlock the Future of Generative AI: TorchTitan's Latest Breakthroughs

In this talk, Jiani Wang, Software Engineer on Meta's PyTorch team, gives an overview of TorchTitan and its latest advancements.
video
AI/ML Infra Meetup | Bringing Data to GPUs Anywhere + Get Low-Latency on Object Store with Alluxio

In this talk, Bin Fan, VP of Technology at Alluxio, explores how to enable efficient data access across distributed GPU infrastructure, achieving low-latency performance for feature stores and RAG workloads.
video
Presto: Fast SQL-on-anything across data lakes, DBMS, and NoSQL Data stores
Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Proven at scale in a variety of use cases at Comcast, GrubHub, FINRA, LinkedIn, Lyft, Netflix, Slack, and Zalando, Presto has seen unprecedented growth in popularity over the last few years in both on-premises and cloud deployments over object stores, HDFS, NoSQL, and RDBMS data stores.
Delta Lake, a storage layer originally invented by Databricks and recently open sourced, brings ACID capabilities to big datasets held in Object Storage. While initially designed for Spark, Delta Lake now supports multiple query compute engines including Presto.
In this talk, we discuss how Presto enables query-time correlations between Delta Lake, Snowflake, and Elasticsearch to drive interactive BI analytics across disparate datasets.
Large Scale Analytics Acceleration
video
How Presto & Alluxio leverage our data-platform at Ryte
Presto & Alluxio on AWS: How we build an up-to-date data platform at Ryte.
Large Scale Analytics Acceleration
Data Platform Modernization
video
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
This talk introduces T3Go’s solution for building an enterprise-level data lake based on Apache Hudi and Alluxio, and shows how Alluxio accelerates reads and writes on the data lake when compute and storage are separated.
Large Scale Analytics Acceleration
Hybrid Multi-Cloud
Data Platform Modernization
video
Speeding Up Spark Performance using Alluxio at China Unicom
Unicom’s traditional batch architecture consists mainly of IOE, Hive, and Greenplum systems. As the business has grown, a large number of siloed, decentralized computing applications built for diverse scenarios have emerged. To solve the resulting resource fragmentation, we introduced a unified computing platform with Spark and Alluxio at its core. Alluxio plays an important role in accelerating data processing and ensuring process stability.
Large Scale Analytics Acceleration
video
Securely Enhancing Data Access in Hybrid Cloud with Alluxio
Describe the benefits of Alluxio and the methods by which it enables secure data access in Comcast’s dx hybrid data cloud.
- Review the data access challenges and tradeoffs in hybrid cloud
- Review our hybrid architecture and the important role Alluxio plays
- Provide performance metrics to highlight the benefits
Large Scale Analytics Acceleration
Hybrid Multi-Cloud
Data Platform Modernization
video
Bursting on-premise analytic workloads to Amazon EMR using Alluxio
On-premises data infrastructure is increasingly complex, and cloud adoption is attractive for business agility. Operating a hybrid environment is a way to start benefiting from cloud elasticity quickly without abandoning on-premises infrastructure. In this session, I will discuss the benefits of using Alluxio’s Data Orchestration Platform to dynamically burst Apache Spark and Presto workloads to Amazon EMR for best performance and agility.
Large Scale Analytics Acceleration
Hybrid Multi-Cloud
video
Hybrid Data Lake on Google Cloud with Alluxio and Dataproc
Dataproc is Google’s managed Hadoop and Spark platform. In this talk, we will showcase how to swiftly build a hybrid cloud data platform with Alluxio and Presto and migrate data seamlessly.
Large Scale Analytics Acceleration
Hybrid Multi-Cloud
Data Platform Modernization
video
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Today, many people run deep learning applications with training data from separate storage such as object storage or remote data centers. This presentation will demo the Intel Analytics Zoo + Alluxio stack, an architecture that enables high performance while balancing cost and resource efficiency, without the network becoming an I/O bottleneck.
Intel Analytics Zoo is a unified data analytics and AI platform open-sourced by Intel. It seamlessly unites TensorFlow, Keras, PyTorch, Spark, Flink, and Ray programs into an integrated pipeline, which can transparently scale from a laptop to large clusters to process production big data. Alluxio, as an open-source data orchestration layer, accelerates data loading and processing in Analytics Zoo deep learning applications.
In this talk, we will go over:
- What is Analytics Zoo and how it works
- How to run Analytics Zoo with Alluxio in deep learning applications
- Initial performance benchmark results using the Analytics Zoo + Alluxio stack
Hybrid Multi-Cloud
Data Platform Modernization
Large Scale Analytics Acceleration
Model Training Acceleration
video
Fluid: When Alluxio Meets Kubernetes
Cloud native environments now attract many data-intensive applications, which are deployed and run on them because cloud native platforms and frameworks such as Docker and Kubernetes are efficient to deploy and easy to maintain. However, cloud native frameworks do not natively provide data abstraction support to applications. We therefore built the Fluid project, which co-orchestrates data and containers together. We use Alluxio as the cache runtime inside Fluid to warm up hot data. In this talk, we will introduce the design and effects of the Fluid project.
Large Scale Analytics Acceleration
video
Speeding Up Atlas Deep Learning Platform with Alluxio + Fluid
Unisound focuses on artificial intelligence services for the Internet of Things. It is an artificial intelligence company with completely independent intellectual property rights and the world’s top intelligent voice technology. Atlas is the deep learning platform within Unisound AI Labs, which provides deep learning pipeline support for hundreds of algorithm scientists. This talk shares three real business training scenarios that leverage Alluxio’s distributed caching capabilities and Fluid’s cloud native capabilities to achieve significant training acceleration and resolve platform I/O bottlenecks. We hope that the practice of Alluxio and Fluid on the Atlas platform will benefit more companies and engineers.
Model Training Acceleration
Data Platform Modernization