Top Tips and Tricks for PyTorch Model Training Performance Tuning [2023]

Get the latest and greatest tips to accelerate your PyTorch model training for machine learning and deep learning. PyTorch, an open-source machine learning framework, has become the de facto choice for many organizations to develop and deploy deep learning models. Model training is the most compute-intensive phase of the machine learning pipeline. It requires continuous … Continued

Trino Optimization With Distributed Caching on Data Lakes: Trino Fest 2023 Session Recap

Originally published on trino.io: https://trino.io/blog/2023/07/21/trino-fest-2023-alluxio-recap.html By 2025, there will be 100 zettabytes stored in the cloud. That’s 100,000,000,000,000,000,000,000 bytes – a huge, eye-popping number. But only about 10% of that data is actually used on a regular basis. At Uber, for example, only 1% of their disk space is used for 50% of the data they access … Continued

Millions Saved Annually: Unleashing the Power of Alluxio + HDFS at Uber

In October 2022, Uber’s Presto team described in a blog post how it used the Alluxio SDK cache to boost Presto query performance and cost efficiency. This achievement is a major milestone in the collaboration between Alluxio and Uber. Thus far, the Uber Presto team has implemented the Alluxio SDK cache in three production clusters spanning over … Continued

Announcing Our First AI 🤖 PMC Member: CacheGPT

We are thrilled to announce that CacheGPT, a state-of-the-art natural language generation model, has joined the Alluxio Project Management Committee (PMC) as our newest member! CacheGPT has been an active contributor to Alluxio since the beginning of this year. It reviews pull requests and drafts documentation using only emojis! See our new emoji-enriched documentation here! … Continued