Alluxio Blog

Millions Saved Annually: Unleashing the Power of Alluxio + HDFS at Uber

In October 2022, Uber’s Presto team described in a blog post how it uses the Alluxio SDK cache to boost Presto query performance and cost efficiency. This achievement is a major milestone in the collaboration between Alluxio and Uber. Thus far, the Uber Presto team has implemented the Alluxio SDK cache in three production clusters spanning over … Continued

Announcing Our First AI 🤖 PMC Member: CacheGPT

We are thrilled to announce that CacheGPT, a state-of-the-art natural language generation model, has joined the Alluxio Project Management Committee (PMC) as our newest member! CacheGPT has been an active contributor to Alluxio since the beginning of this year. It reviews pull requests and draft documentation using only emojis! See our new emoji-enriched documentation here! … Continued

Alipay: Optimizing Alluxio for Efficient Large-Scale Training on Billions of Files

Chuanying Chen, Senior Software Engineer at Ant Group, provides a deep dive into the practices of optimizing Alluxio for reliable, scalable, and high-performance large-scale training on billions of files. 1. Background Ant Group, formerly known as Ant Financial, is an affiliate company of the Chinese conglomerate Alibaba Group. The group owns the world’s largest mobile … Continued

“Data Access as a Service” at Shopee: Using Alluxio to Accelerate Interactive Queries and Enhance Developer Experience with Flexible APIs

Shopee is the leading e-commerce platform in Southeast Asia. In this blog post, Tianbao Ding and Haoning Sun from Shopee’s data infrastructure team share their project on query acceleration and “Data Access as a Service.” They describe how Shopee leverages Alluxio to improve Trino query performance by ~55% and how Alluxio enhances developer experience by providing … Continued