Alluxio Blog

AI/ML Infra Meetup – Highlights & Key Takeaways

Co-hosted by Alluxio and Uber on May 23, 2024, the AI/ML Infra Meetup was a community event for developers focused on building AI, ML, and data infrastructure at scale. We were thrilled by the overwhelming interest and enthusiasm in our meetup! The event brought together over 100 AI/ML infrastructure engineers and enthusiasts to discuss the latest … Continued

Trino and Alluxio, Better Together

This blog post delves into the history behind Trino introducing Alluxio as a replacement for RubiX as a file system cache. It explores the synergy between Trino and Alluxio, assesses which type of cache best suits various needs, and shares real-world examples of Trino and Alluxio adoption. Trino is an open-source distributed SQL query engine … Continued

What’s New In Alluxio Enterprise AI 3.2: GPU Acceleration, Python Filesystem API, Write Checkpointing and More!

Performance, cache operability, and cost efficiency are key considerations for AI platform teams supporting large-scale model training and distribution. In 2023, we launched Alluxio Enterprise AI for managing AI training and model distribution I/O across diverse environments, whether in a single storage system with diverse computing clusters or in a more complex multi-cloud, multi-data center … Continued

How Can AI Platforms Adapt to Hybrid or Multi-Cloud Environments?

This article was originally published on Spiceworks. https://www.spiceworks.com/tech/artificial-intelligence/guest-article/adapting-ai-platform-to-hybrid-cloud/ This blog discusses the challenges of implementing AI platforms in hybrid and multi-cloud environments and shares examples of organizations that have prioritized security and optimized cost management using the data access layer. In recent years, AI platforms have undergone significant transformations as GenAI and AI continue to … Continued

Maximize GPU Utilization for Model Training

GPU utilization, or GPU usage, is the percentage of GPUs’ processing power being used at a particular time. Because GPUs are expensive resources, optimizing their utilization and reducing idle time is essential for enterprise AI infrastructure. This blog explores bottlenecks hindering GPU utilization during model training and provides solutions to maximize GPU utilization. 1. Why … Continued

IWD 2024: Empower Women Developers in the Open-Source Community

This article was originally published on ITBrief. The author is Hope Wang, Developer Advocate, Alluxio. As we celebrate International Women’s Day, it is important to reflect on the progress we have made toward gender equality in the tech industry, particularly in open-source software (OSS). While there is still much work to be done, I am … Continued

Accelerating Data Loading in Large-Scale ML Training With Ray and Alluxio

In the rapidly evolving field of artificial intelligence (AI) and machine learning (ML), the efficient handling of large datasets during training is becoming increasingly pivotal. Ray has emerged as a key player, enabling large-scale dataset training through effective data streaming. By breaking down large datasets into manageable chunks and dividing training jobs into smaller … Continued

The Best Content of 2023 – Our Favorite Things

2023 is over, so we’ve compiled a collection of 2023’s most popular content according to our readers. In case you missed anything, here’s your chance to catch up on best-practices ebooks, technical blogs, hands-on videos, webinars, and more. Enjoy! ALL THINGS AI: Building High-performance Data Access Layer for Model Training and Model Serving for … Continued

Setting the Stage for Alluxio Community to Soar in the Year of the Dragon: 2023 Recap and 2024 Outlook

As we step into 2024, we look back and celebrate an incredible 2023 for the Alluxio community. First and foremost, thank you to all of our contributors and the broader community! Together, we have achieved remarkable milestones. 💖 📈 Highlights by Numbers: Let’s take a look at Alluxio in 2023 by the numbers. … Continued