machine learning Archives

Accelerating Data Loading in Large-Scale ML Training With Ray and Alluxio

January 23, 2024 By Lu Qiu, Chunxu Tang and Beinan Wang

In the rapidly evolving field of artificial intelligence (AI) and machine learning (ML), efficient handling of large datasets during training is increasingly pivotal. Ray has emerged as a key player, enabling large-scale dataset training through effective data streaming. By breaking down large datasets into manageable chunks and dividing training jobs into smaller … Continued

Top Tips and Tricks for PyTorch Model Training Performance Tuning [2023]

July 22, 2023 By Hope Wang, Beinan Wang and Chunxu Tang

Get the latest and greatest tips to accelerate your PyTorch model training for machine learning and deep learning. PyTorch, an open-source machine learning framework, has become the de facto choice for many organizations to develop and deploy deep learning models. Model training is the most compute-intensive phase of the machine learning pipeline. It requires continuous … Continued

Building High-performance Data Access Layer for Model Training and Model Serving for LLM

June 14, 2023 By Mengyu Hu (Zhihu) and Chengkun Jia (Zhihu)

Bringing a large language model from its initial training to deployment requires numerous systems and components. At Zhihu, we grappled with a multi-cloud, cross-region AI platform, requiring an efficient solution to facilitate the rapid training and delivery of models for production use cases. This led us to adopt Alluxio, the high-performance data access layer for … Continued

ML-Based SQL Query Resource Usage Prediction

September 15, 2022

In the Big Data era, calculating the resource usage of a SQL query is usually computationally expensive. Can we estimate the resource usage of SQL queries more efficiently, without any computation in a SQL engine kernel? In this session, Chunxu and Beinan introduce how Twitter’s data platform leverages a machine learning-based approach in Presto and BigQuery to estimate query resource utilization with 90%+ accuracy.

Tags: alluxio day, big data, machine learning, presto, sql, twitter

Deconstructing a Machine Learning Pipeline with Virtual Data Lake

Alluxio Product School, August 25, 2022

As more and more companies turn to AI / ML / DL to unlock insight, AI has become a mythical word that adds unnecessary barriers for new adopters. It is often regarded as a luxury reserved for big tech companies. This should not be the case.

Recommendations to Level Up Your Machine Learning Platform

April 12, 2022 By Bin Fan

With machine learning (ML) and artificial intelligence (AI) applications becoming more business-critical, organizations are racing to advance their AI/ML capabilities. Realizing the full potential of AI/ML requires the right underlying machine learning platform.

Orchestrating Data for Machine Learning Pipelines

April 8, 2022 By Bin Fan

This article discusses a new solution for orchestrating data in end-to-end machine learning pipelines that addresses the above questions. I will outline common challenges and pitfalls, then propose a new technique, data orchestration, to optimize the data pipeline for machine learning.

A Year with Alluxio Community 2021

January 20, 2022 By Bin Fan and Jasmine Wang

2021 marked accelerated growth for the Alluxio Open Source Project. We could not be more grateful for what the community has achieved together over the past year. This blog offers a year-long summary of our community growth.
