Alluxio AI Infra Day 2024

AI Infra Day | The AI Infra in the Generative AI Era

AI Infra Day | Accelerate Your Model Training and Serving with Distributed Caching

AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale

AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta

AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Update

AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kubernetes

Blog
Accelerating Analytics by 200% with Impala, Alluxio, and HDFS at Tencent
In this article, Honghan Tian describes how engineers in the Data Service Center (DSC) at Tencent PCG (Platform and Content Business Group) leverage Alluxio to optimize analytics performance and minimize operating costs in building Tencent Beacon Growing, a real-time data analytics platform.
Large Scale Analytics Acceleration


Blog
Efficient Model Training in the Cloud with Kubernetes, TensorFlow, and Alluxio
This article presents a collaboration among Alibaba, Alluxio, and Nanjing University on deep learning model training in the cloud. It analyzes the performance bottlenecks and details the optimizations made to each component in the architecture. The goal was to reduce the cost and complexity of data access for deep learning training in a hybrid environment, and the work resulted in a reduction of over 40% in training time and cost.
Hybrid Multi-Cloud
GPU Acceleration
Model Training Acceleration


Blog
Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel’s Analytics Zoo open source platform powered by oneAPI
This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. It covers the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results.
Model Training Acceleration
Hybrid Multi-Cloud
Large Scale Analytics Acceleration

Blog
Everything you want to know about how to decouple SQL engines from Hive Data Warehouse
Are you using SQL engines, such as Presto, to query an existing Hive data warehouse and running into challenges such as an overloaded Hive Metastore with slow and unpredictable access, unoptimized data formats and layouts (for example, too many small files), or a lack of influence over the existing Hive system and other Hive applications?
Large Scale Analytics Acceleration


Presentation
CNCF Member Webinar: Improving Data Locality for Analytics Jobs on Kubernetes Using Alluxio
In the on-prem days, one key performance optimization for Apache Hadoop or Apache Spark workloads was to run tasks on nodes with local HDFS data. However, while adoption of the Cloud & Kubernetes makes scaling compute workloads exceptionally easy, HDFS is often not an option. Efficiently accessing data from cloud-native storage services like AWS S3, or even from on-premises HDFS, becomes harder as data locality is lost.
Originating at UC Berkeley's AMPLab, the open source project Alluxio approaches this problem in a new way: it helps move data closer to compute workloads efficiently and on demand, and it unifies data across multiple or remote clouds. This webinar will describe the concept and internal mechanism of using the Spark+Alluxio stack in Kubernetes to enhance data locality even when the storage service is external or remote.
In particular, we will go over:
- Why Spark is able to make a locality-aware schedule when working with Alluxio in a K8s environment using the host network
- Why a pod running Alluxio can share data efficiently with a pod running Spark on the same host using a domain socket and a host path volume
- The roadmap of Alluxio to further improve running analytics jobs like Spark and Presto, including the ongoing closer integration with Presto