Alluxio - Blog

Alluxio Enterprise for Data Analytics Scales to New Heights

We are thrilled to announce the general availability of Alluxio Enterprise for Data Analytics 3.2! With data volumes continuing to grow at exponential rates, data platform teams face challenges in maintaining query performance, managing infrastructure costs, and ensuring scalability. This latest version of Alluxio addresses these challenges head-on with groundbreaking improvements in scalability, performance, and cost-efficiency.

Introducing Rapid Alluxio Deployer On AWS: Experience The Benefits Of Alluxio Enterprise AI In A Few Clicks

We’re excited to introduce Rapid Alluxio Deployer (RAD) on AWS, which allows you to experience the performance benefits of Alluxio in less than 30 minutes. RAD is designed with a split-plane architecture, which ensures that your data remains secure within your AWS environment, giving you peace of mind while leveraging Alluxio’s capabilities.

Six Tips To Optimize PyTorch for Faster Model Training

PyTorch is one of the most popular deep learning frameworks in production today. As models become increasingly complex and dataset sizes grow, optimizing model training performance becomes crucial to reduce training times and improve productivity.

‍

AI/ML Infra Meetup Highlights Key Takeaways

Co-hosted by Alluxio and Uber on May 23, 2024, AI/ML Infra Meetup was the community event for developers focused on building AI, ML and data infrastructure at scale. We were thrilled by the overwhelming interest and enthusiasm in our meetup!

Trino and Alluxio: Better Together

This blog post delves into the history behind Trino introducing Alluxio as a replacement for RubiX as a file system cache. It explores the synergy between Trino and Alluxio, assesses which type of cache best suits various needs, and shares real-world examples of Trino and Alluxio adoption.

‍

Whats New In Alluxio Enterprise AI 3.2: GPU Acceleration, Python Filesystem API, Write Checkpointing and More

Performance, cache operability, and cost efficiency are key considerations for AI platform teams supporting large scale model training and distribution. In 2023, we launched Alluxio Enterprise AI, for managing AI training and model distribution I/O across diverse environments, whether in a single storage with diverse computing clusters or in a more complex multi-cloud, multi-data center environment.

How Can AI Platforms Adapt to Hybrid or Multi-Cloud Environments

This article was originally published on Spiceworks. https://www.spiceworks.com/tech/artificial-intelligence/guest-article/adapting-ai-platform-to-hybrid-cloud/

This blog discusses the challenges of implementing AI platforms in hybrid and multi-cloud environments and shares examples of organizations that have prioritized security and optimized cost management using the data access layer.

Maximize GPU Utilization for Model Training

GPU utilization or GPU usage, is the percentage of GPUs’ processing power being used at a particular time. As GPUs are expensive resources, optimizing their utilization and reducing idle time is essential for enterprise AI infrastructure. This blog explores bottlenecks hindering GPU utilization during model training and provides solutions to maximize GPU utilization.

‍

IWD 2024: Empower Women Developers in the Open-Source Community

This article was originally published on ITBrief. The author is Hope Wang, Developer Advocate, Alluxio.

As we celebrate International Women's Day, it is important to reflect on the progress we have made toward gender equality in the tech industry, particularly in open-source software (OSS). While there is still much work to be done, I am proud to be part of a community actively working to empower women and promote diversity. In this article, I want to share my path to the open-source community and offer advice to women developers interested in contributing to open-source projects.

Accelerating Data Loading in Large-Scale ML Training With Ray and Alluxio

In the rapidly-evolving field of artificial intelligence (AI) and machine learning (ML), the efficient handling of large datasets during training is becoming more and more pivotal. Ray has emerged as a key player, enabling large-scale dataset training through effective data streaming. By breaking down large datasets into manageable chunks and dividing training jobs into smaller tasks, Ray circumvents the need for local storage of the entire dataset on each training machine. However, this innovative approach is not without its challenges.

‍

The Best Content of 2023: Our Favorite Things

2023 is over, so we’ve compiled a collection of 2023’s most popular content according to our readers. In case you missed anything, here’s your chance to catch up on best practices ebooks, technical blogs, hands-on videos, webinars and more.

Enjoy!

Setting the Stage for Alluxio Community to Soar in the Year of the Dragon: 2023 Recap and 2024 Outlook

As we step into 2024, we look back and celebrate an incredible year of 2023 for the Alluxio community.

First and foremost, thank you to all of our contributors and the broader community! Together, we have achieved remarkable milestones. 💖

‍

Your selections don't match any items.

Alluxio Enterprise AI

Alluxio Enterprise Data

Blog

Sign-up for a Live Demo or Book a Meeting with a Solutions Engineer