Blog

Alluxio's strong Q2 featured Enterprise AI 3.7 launch with sub-millisecond latency (45× faster than S3 Standard), 50%+ customer growth including Salesforce and Geely, and MLPerf Storage v2.0 results showing 99%+ GPU utilization, positioning the company as a leader in maximizing AI infrastructure ROI.

In this blog, Greg Lindstrom, Vice President of ML Trading at Blackout Power Trading, an electricity trading firm in North American power markets, shares how the firm leverages Alluxio to power its offline feature store. This approach delivers multi-join query performance in the double-digit millisecond range while maintaining the cost and durability benefits of Amazon S3 for persistent storage. As a result, the firm achieved a 22x to 37x reduction in large-join query latency for training and a 37x to 83x reduction in large-join query latency for inference.
This blog post delves into the history behind Trino introducing Alluxio as a replacement for RubiX as a file system cache. It explores the synergy between Trino and Alluxio, assesses which type of cache best suits various needs, and shares real-world examples of Trino and Alluxio adoption.
Performance, cache operability, and cost efficiency are key considerations for AI platform teams supporting large-scale model training and distribution. In 2023, we launched Alluxio Enterprise AI for managing AI training and model distribution I/O across diverse environments, whether in a single storage system with diverse computing clusters or in a more complex multi-cloud, multi-data center environment.
This article was originally published on Spiceworks. https://www.spiceworks.com/tech/artificial-intelligence/guest-article/adapting-ai-platform-to-hybrid-cloud/
This blog discusses the challenges of implementing AI platforms in hybrid and multi-cloud environments and shares examples of organizations that have prioritized security and optimized cost management using the data access layer.
GPU utilization, or GPU usage, is the percentage of a GPU’s processing power being used at a particular time. Because GPUs are expensive resources, optimizing their utilization and reducing idle time is essential for enterprise AI infrastructure. This blog explores the bottlenecks that hinder GPU utilization during model training and provides solutions to maximize it.
This article was originally published on ITBrief. The author is Hope Wang, Developer Advocate, Alluxio.
As we celebrate International Women's Day, it is important to reflect on the progress we have made toward gender equality in the tech industry, particularly in open-source software (OSS). While there is still much work to be done, I am proud to be part of a community actively working to empower women and promote diversity. In this article, I want to share my path to the open-source community and offer advice to women developers interested in contributing to open-source projects.
In the rapidly evolving field of artificial intelligence (AI) and machine learning (ML), the efficient handling of large datasets during training is becoming increasingly pivotal. Ray has emerged as a key player, enabling large-scale dataset training through effective data streaming. By breaking down large datasets into manageable chunks and dividing training jobs into smaller tasks, Ray circumvents the need for local storage of the entire dataset on each training machine. However, this innovative approach is not without its challenges.

2023 is over, so we’ve compiled a collection of 2023’s most popular content according to our readers. In case you missed anything, here’s your chance to catch up on best practices ebooks, technical blogs, hands-on videos, webinars and more.
Enjoy!
As we step into 2024, we look back and celebrate an incredible 2023 for the Alluxio community.
First and foremost, thank you to all of our contributors and the broader community! Together, we have achieved remarkable milestones. 💖
In this blog, we discuss the importance of data locality for efficient machine learning on the cloud. We examine the pros and cons of existing solutions and the tradeoff between reducing costs and maximizing performance through data locality. We then highlight the new-generation Alluxio design and implementation, detailing how it brings value to model training and deployment. Finally, we share lessons learned from benchmarks and real-world case studies.

This article was initially posted on Datanami.
The paradigm shift ushered in by Artificial Intelligence (AI) in today’s business and technological landscapes is nothing short of revolutionary. AI’s potential to transform traditional business models, optimize operations, and catalyze innovation is vast. But navigating its complexities can be daunting. Organizations must understand and adhere to some foundational principles to ensure AI initiatives lead to sustainable success. Let’s delve deeper into these ten evergreen principles:
In this blog, we discuss the data access challenges in AI and why commonly used NAS/NFS may not be a good option for your organization.
Alluxio, the data platform company for all data-driven workloads, hosted the community event “AI Infra Day” on October 25, 2023. This virtual event brought together technology leaders working on AI infrastructure from Uber, Meta, and Intel to delve into the intricate aspects of building scalable, performant, and cost-effective AI platforms.