Building High-performance Data Access Layer for Model Training and Model Serving for LLM

Bringing a large language model from its initial training to deployment requires numerous systems and components. At Zhihu, we grappled with a multi-cloud, cross-region AI platform, requiring an efficient solution to facilitate the rapid training and delivery of models for production use cases. This led us to adopt Alluxio, the high-performance data access layer for … Continued

Alipay: Optimizing Alluxio for Efficient Large-Scale Training on Billions of Files

Chuanying Chen, Senior Software Engineer at Ant Group, provides a deep dive into the practices of optimizing Alluxio for reliable, scalable, and high-performance large-scale training on billions of files. 1. Background Ant Group, formerly known as Ant Financial, is an affiliate company of the Chinese conglomerate Alibaba Group. The group owns the world’s largest mobile … Continued

“Data Access as a Service” at Shopee: Using Alluxio to Accelerate Interactive Queries and Enhance Developer Experience with Flexible APIs

Shopee is the leading e-commerce platform in Southeast Asia. In this blog, Tianbao Ding and Haoning Sun from Shopee’s data infrastructure team share their project on query acceleration and “Data Access as a Service.” They describe how Shopee leverages Alluxio to improve Trino query performance by ~55% and how Alluxio enhances developer experience by providing … Continued

How Trino and Alluxio Power Analytics at Razorpay

This blog was originally published in Razorpay Engineering Blog: https://engineering.razorpay.com/how-trino-and-alluxio-power-analytics-at-razorpay-803d3386daaf Razorpay is a large fintech company in India. Razorpay provides a payment solution that offers a fast, affordable, and secure way to accept and disburse payments online. On the engineering side, the availability and scalability of analytics infrastructure are crucial to providing seamless experiences to … Continued

Unifying Cross-region Access in the Cloud at Expedia Group — The Path Toward Data Mesh in the Brand World

This article shares the data platform practice at Expedia to federate cross-region data lakes spanning multiple geographic regions in the cloud. 1. Background Expedia Group (NASDAQ: EXPE) is an American online travel shopping company for consumer and small business travel. Expedia powers travel for everyone, everywhere through our global platform, with industry-leading technology solutions to … Continued

When AI Meets Alluxio at Bilibili | Building an Efficient AI Platform for Data Preprocessing and Model Training

Lei Li, AI Platform Lead, and Zifan Ni, Senior Software Engineer from Bilibili, share how they applied Alluxio to their AI platform to increase training efficiency, as well as best practices including technical architecture and specific tuning tips Overview About Bilibili Bilibili (NASDAQ: BILI) is a leading video community with a mission to enrich the … Continued