Resources

Blog

Building A High Performance Data Access Layer for Model Training and Model Distribution for LLM at Zhihu

Bringing a large language model from initial training to deployment requires numerous systems and components. At Zhihu, we run a multi-cloud, cross-region AI platform and needed an efficient way to train models quickly and deliver them to production. This led us to adopt Alluxio as a high-performance data access layer for LLMs. This blog provides an in-depth look at Zhihu’s challenges, journey, and solution for LLM training and deployment. By adopting Alluxio, we have improved model training performance by 2 to 3 times and can now deploy updated models every minute rather than every few hours or days. In addition, our GPU utilization has doubled, infrastructure and operation costs have been halved, and we have established a resilient, efficient infrastructure capable of meeting our growing AI demands.

Blog

Millions Saved Annually: Unleashing the Power of Alluxio HDFS at Uber

On Demand Videos

Alluxio Product School Webinar – Distributed Caching for Generative AI: Optimizing LLM Data Pipeline

Ebook

The Presto Optimization Handbook

Best Practices and Tuning Tips (with SQL code, configuration settings, session properties, examples, and real-world case studies!)

On Demand Videos

Alluxio Product School Webinar – Hands-on Lab: Get Started with Alluxio on Kubernetes

Ebook

The Trino Optimization Handbook

Best Practices and Tuning Tips (with SQL code, configuration settings, session properties, examples, and real-world case studies!)

Blog

Saving Cloud Costs in 2023: Top Five Strategies to Reduce AWS Cloud Data Transfer Fees

Blog

Announcing Our First AI PMC Member: CacheGPT

On Demand Videos

Alluxio Product School Webinar – Boosting Trino Performance: Expert Tips for Tuning and Optimization

Case Study

Shopee

Query Acceleration & Data Access as a Service

Ebook

The Ultimate Guide to Saving Data Egress Costs in the Cloud

Best Practices for Cloud Cost Optimization

Blog

Alipay: Optimizing Alluxio for Efficient Large-Scale Training on Billions of Files

On Demand Videos

Alluxio Product School Webinar – Transparent URI

Blog

Cross Cluster Synchronization in Alluxio, Part 1: Scenarios and Background

This blog series covers the design and implementation of the Cross Cluster Synchronization mechanism in Alluxio. This mechanism keeps metadata consistent when running multiple Alluxio clusters. Part 1 discusses the scenarios and background.

Blog

Cross Cluster Synchronization in Alluxio, Part 3: Discussions and Conclusion

Following Part 1 and Part 2, this final blog of the series discusses some design decisions and details, as well as future work.

Blog

Cross Cluster Synchronization in Alluxio, Part 2: Mechanism

This is Part 2 of the blog series on the design and implementation of the Cross Cluster Synchronization mechanism in Alluxio. The previous blog discussed the scenarios, the background, and how metadata sync is done with a single Alluxio cluster. This blog describes how that metadata sync is extended to provide metadata consistency in a multi-cluster scenario.
