Developer and Engineering Archives

Six Tips To Optimize PyTorch for Faster Model Training

August 1, 2024 By Hope Wang

Originally published at The New Stack: https://thenewstack.io/this-is-how-to-optimize-pytorch-for-faster-model-training/

PyTorch is one of the most popular deep learning frameworks in production today. As models become increasingly complex and dataset sizes grow, optimizing model training performance becomes crucial to reduce training times and improve productivity. In this article, I’ll share the latest performance tuning tips to accelerate the training … Continued

Accelerating Data Loading in Large-Scale ML Training With Ray and Alluxio

January 23, 2024 By Lu Qiu, Chunxu Tang and Beinan Wang

In the rapidly evolving field of artificial intelligence (AI) and machine learning (ML), the efficient handling of large datasets during training is becoming increasingly pivotal. Ray has emerged as a key player, enabling large-scale dataset training through effective data streaming. By breaking down large datasets into manageable chunks and dividing training jobs into smaller … Continued

Setting the Stage for Alluxio Community to Soar in the Year of the Dragon: 2023 Recap and 2024 Outlook

January 9, 2024 By Hope Wang, Chanchan Mao, Bin Fan, Shouwei Chen, Tango Tian, Tianyu Wang, Shun Lv and Allan Sha

As we step into 2024, we look back and celebrate an incredible 2023 for the Alluxio community. First and foremost, thank you to all of our contributors and the broader community! Together, we have achieved remarkable milestones. 💖 📈 Highlights by Numbers: Let’s take a look at Alluxio in 2023 by the numbers. … Continued

A Journey Towards Data Locality on Cloud for Machine Learning and AI

December 18, 2023 By Lu Qiu and Shawn Sun

In this blog, we discuss the importance of data locality for efficient machine learning on the cloud. We examine the pros and cons of existing solutions and the tradeoff between reducing costs and maximizing performance through data locality. We then highlight the new-generation Alluxio design and implementation, detailing how it brings value to model training … Continued

Consistent Hashing in Alluxio DORA

October 31, 2023 By Jiaming Mai

Consistent hashing is a technique that allows hash rings to be expanded or shrunk dynamically with minimal disruption. Alluxio’s DORA (Decentralized Object Repository Architecture) uses consistent hashing for load balancing when scaling nodes. To achieve fast performance, strict consistency, and load balancing, we analyze, evaluate, and select the most suitable consistent … Continued
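
For readers new to the idea, here is a minimal sketch of a generic hash ring with virtual nodes, written in Python. It is not Alluxio DORA's implementation, and it does not represent whichever algorithm the post ends up selecting. The class name HashRing, the worker names, and the vnodes parameter are all hypothetical and chosen for illustration. The sketch shows the property described above: when a node joins, the only keys that change owner are the ones that move to the new node.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 used only for spread)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Illustrative consistent hash ring; a sketch, not Alluxio's code."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual points placed on the ring per node
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue       # skip the (very unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._points:
            raise LookupError("the ring has no nodes")
        # First ring position clockwise from the key, wrapping past the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Usage: map some hypothetical file paths, then grow the ring by one node.
ring = HashRing(["worker-1", "worker-2", "worker-3"])
keys = [f"/data/file-{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("worker-4")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed owner after adding worker-4")
```

With a plain modulo scheme (hash(key) % node_count), most keys would change owner when the node count changes. On the ring, only the keys that fall just before the new node's points are reassigned.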

Introducing DORA: The Next-generation Alluxio Architecture

October 18, 2023 By Beinan Wang, Bin Fan, Bowen Ding, Jiaming Mai, Hua Huang, Lu Qiu, Jianjian Xie, Shawn Sun, Lucy Ge, Chunxu Tang, Kai Zhang and Hope Wang

Today, we are thrilled to launch the Alluxio Enterprise AI product. One of the key innovations is the introduction of the next-generation architecture DORA – a Decentralized Object Repository Architecture. This blog talks about our development of the DORA architecture, including our motivation, design decisions, and implementation. 1. Moving from Data Analytics to the AI … Continued

A Deep Dive into Caching in Presto

October 11, 2023 By Hope Wang and Beinan Wang

This article was initially posted on InfoWorld. Understand the caching mechanisms for the popular distributed SQL engine and how to use them to improve query speed and efficiency. Presto is a popular, open source, distributed SQL engine that enables organizations to run interactive analytic queries on multiple data sources at a large scale. Caching is a typical optimization … Continued

A Deep Dive into the Call Chain Relationship Between Presto, Hive, and Alluxio

September 11, 2023 By Jiaming Mai

Alluxio is commonly used with Presto and Hive to accelerate queries. Understanding how Presto+Hive+Alluxio work together and the flow from SQL query to low-level file system operations is key to tuning performance. This post will dive into the relationship between Presto, Hive, and Alluxio. We will walk you through how a SQL query executes in … Continued

Data Caching Strategies for Data Analytics and AI: Data+AI Summit 2023 Session Recap

July 13, 2023 By Chunxu Tang, Beinan Wang and Hope Wang

Data caching is essential to the modern data stack, allowing organizations to access data quickly and efficiently for analytics and AI. On June 28, 2023, we presented Data Caching Strategies for Data Analytics and AI at Data+AI Summit 2023. We are excited to bring you a recap of that presentation through this blog post. We … Continued
