Alluxio AI Infra Day 2024

AI Infra Day | The AI Infra in the Generative AI Era

AI Infra Day | Accelerate Your Model Training and Serving with Distributed Caching

AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale

AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta

AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Update

AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kubernetes


Blog
Building High-Performance Data Lake Using Apache Hudi and Alluxio at T3Go
T3Go's high-performance data lake, built on Apache Hudi and Alluxio, shortened the time for data ingestion into the lake by up to a factor of 2. Data analysts using Presto, Hudi, and Alluxio together to query data on the lake saw queries run 10 times faster.
Large Scale Analytics Acceleration

Blog
Announcing Alluxio Data Orchestration Hub
We’re pleased to announce the general availability of Alluxio Data Orchestration Hub, your single pane of glass to orchestrate data for analytics and AI. The data ecosystem is complex with the separation of storage and compute across data centers and cloud providers. With this release we’ve made great strides towards simplifying data access and management across multiple environments.
Large Scale Analytics Acceleration

Blog
Data Consistency Model in Alluxio
Unlike HDFS, which provides one-copy update semantics, or AWS S3, which provides eventual consistency, Alluxio has a data consistency model that is a bit more complicated and depends on the configuration. In short, when clients are only reading and writing through Alluxio, the Alluxio file system provides strong consistency. However, when clients are writing data across both Alluxio and under storage, the consistency may depend on the write type and under storage type.

Blog
What's New in Alluxio 2.4
Alluxio 2.4.0 focuses on features critical to large scale, production deployments in Cloud and Hybrid Cloud environments. Enterprises leverage Alluxio at enormous scale in many dimensions, including number of files, total volume of data, requests per second, and number of concurrent clients.


Presentation
PrestoCon 2020: Enabling Ultra-fast Presto in the Cloud with Alluxio
In this presentation, Haoyuan Li shares an overview of PAX (Presto Alluxio Stack), its related industry trends, and how PAX solves challenges and brings value to its hundreds of users in the cloud.

Blog
Adopting Satellite Clusters with Alluxio at Vipshop to Improve Spark Jobs for Targeted Advertising by 30x
As the third largest e-commerce site in China, Vipshop processes large amounts of data collected daily to generate targeted advertisements for its consumers. In this article, Gang Deng from Vipshop describes how to meet SLAs by improving struggling Spark jobs on HDFS by up to 30x, and optimize hot data access with Alluxio to create a reliable and stable computation pipeline for e-commerce targeted advertising.
Large Scale Analytics Acceleration


Blog
Building a Cross-Region Hybrid Cloud Storage Gateway for Machine Learning & AI at WeRide
In this blog, Derek Tan, Executive Director of Infra & Simulation at WeRide, describes how engineers leverage Alluxio as a hybrid cloud data gateway for applications on-premises to access public cloud storage like AWS S3.
Hybrid Multi-Cloud
Model Training Acceleration


Blog
Introducing Alluxio 2.3
Alluxio 2.3.0 focuses on streamlining the user experience in hybrid cloud deployments where Alluxio is deployed with compute in the cloud to access data on-prem. Features such as environment validation tools and concurrent metadata synchronization greatly improve Alluxio’s functionality. Integrations with AWS EMR, Google Dataproc, K8s, and AWS Glue make Alluxio easy to use in a variety of cloud environments. In this article, we will share some of the highlights of the release. For more, please visit our release notes page.

Blog
Accelerating Analytics by 200% with Impala, Alluxio, and HDFS at Tencent
In this article, Honghan Tian describes how engineers in the Data Service Center (DSC) at Tencent PCG (Platform and Content Business Group) leverage Alluxio to optimize analytics performance and minimize operating costs in building Tencent Beacon Growing, a real-time data analytics platform.
Large Scale Analytics Acceleration


Blog
Efficient Model Training in the Cloud with Kubernetes, TensorFlow, and Alluxio
This article presents the collaboration of Alibaba, Alluxio, and Nanjing University in tackling the problem of Deep Learning model training in the cloud. Various performance bottlenecks are analyzed with detailed optimizations of each component in the architecture. Our goal was to reduce the cost and complexity of data access for Deep Learning training in a hybrid environment, which resulted in over 40% reduction in training time and cost.
Hybrid Multi-Cloud
GPU Acceleration
Model Training Acceleration


Blog
Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel's Analytics Zoo open source platform powered by oneAPI
This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. It covers the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results.
Model Training Acceleration
Hybrid Multi-Cloud
Large Scale Analytics Acceleration