Alluxio - Blog

AI/ML Infra Meetup at Uber Seattle: Tackling Scalability Challenges of AI Platforms

Insights from Uber, Snap, and Alluxio on LLM training, fine-tuning, deployment, designing scalable architectures, GPU optimization, and building recommendation systems.

New Features in Alluxio Enterprise AI 3.5

With the new year comes new features in Alluxio Enterprise AI! Just weeks into 2025 and we are already bringing you exciting new features to better manage, scale, and secure your AI data with Alluxio. From advanced cache management and improved write performance to our Python SDK and S3 API enhancements, our latest release of Alluxio Enterprise AI delivers more power and performance to your AI workloads. Without further ado, let’s dig into the details.

Alluxio Enterprise for Data Analytics Scales to New Heights

We are thrilled to announce the general availability of Alluxio Enterprise for Data Analytics 3.2! With data volumes continuing to grow at exponential rates, data platform teams face challenges in maintaining query performance, managing infrastructure costs, and ensuring scalability. This latest version of Alluxio addresses these challenges head-on with groundbreaking improvements in scalability, performance, and cost-efficiency.

Introducing Rapid Alluxio Deployer On AWS: Experience The Benefits Of Alluxio Enterprise AI In A Few Clicks

We’re excited to introduce Rapid Alluxio Deployer (RAD) on AWS, which allows you to experience the performance benefits of Alluxio in less than 30 minutes. RAD is designed with a split-plane architecture, which ensures that your data remains secure within your AWS environment, giving you peace of mind while leveraging Alluxio’s capabilities.

Six Tips To Optimize PyTorch for Faster Model Training

PyTorch is one of the most popular deep learning frameworks in production today. As models become increasingly complex and dataset sizes grow, optimizing model training performance becomes crucial to reduce training times and improve productivity.

AI/ML Infra Meetup Highlights Key Takeaways

Co-hosted by Alluxio and Uber on May 23, 2024, AI/ML Infra Meetup was the community event for developers focused on building AI, ML and data infrastructure at scale. We were thrilled by the overwhelming interest and enthusiasm in our meetup!

Trino and Alluxio: Better Together

This blog post delves into the history behind Trino introducing Alluxio as a replacement for RubiX as a file system cache. It explores the synergy between Trino and Alluxio, assesses which type of cache best suits various needs, and shares real-world examples of Trino and Alluxio adoption.

What's New In Alluxio Enterprise AI 3.2: GPU Acceleration, Python Filesystem API, Write Checkpointing and More

Performance, cache operability, and cost efficiency are key considerations for AI platform teams supporting large-scale model training and distribution. In 2023, we launched Alluxio Enterprise AI to manage AI training and model distribution I/O across diverse environments, whether a single storage system serving diverse computing clusters or a more complex multi-cloud, multi-data center environment.

How Can AI Platforms Adapt to Hybrid or Multi-Cloud Environments

This article was originally published on Spiceworks. https://www.spiceworks.com/tech/artificial-intelligence/guest-article/adapting-ai-platform-to-hybrid-cloud/

This blog discusses the challenges of implementing AI platforms in hybrid and multi-cloud environments and shares examples of organizations that have prioritized security and optimized cost management using the data access layer.

Maximize GPU Utilization for Model Training

GPU utilization, or GPU usage, is the percentage of a GPU's processing power in use at a given time. As GPUs are expensive resources, optimizing their utilization and reducing idle time is essential for enterprise AI infrastructure. This blog explores bottlenecks hindering GPU utilization during model training and provides solutions to maximize GPU utilization.

IWD 2024: Empower Women Developers in the Open-Source Community

This article was originally published on ITBrief. The author is Hope Wang, Developer Advocate, Alluxio.

As we celebrate International Women's Day, it is important to reflect on the progress we have made toward gender equality in the tech industry, particularly in open-source software (OSS). While there is still much work to be done, I am proud to be part of a community actively working to empower women and promote diversity. In this article, I want to share my path to the open-source community and offer advice to women developers interested in contributing to open-source projects.

Accelerating Data Loading in Large-Scale ML Training With Ray and Alluxio

In the rapidly evolving field of artificial intelligence (AI) and machine learning (ML), the efficient handling of large datasets during training is becoming increasingly pivotal. Ray has emerged as a key player, enabling large-scale dataset training through effective data streaming. By breaking down large datasets into manageable chunks and dividing training jobs into smaller tasks, Ray circumvents the need for local storage of the entire dataset on each training machine. However, this innovative approach is not without its challenges.
