Kubernetes Archives

A Journey Towards Data Locality on Cloud for Machine Learning and AI

December 18, 2023 By Lu Qiu and Shawn Sun

In this blog, we discuss the importance of data locality for efficient machine learning on the cloud. We examine the pros and cons of existing solutions and the tradeoff between reducing costs and maximizing performance through data locality. We then highlight the new-generation Alluxio design and implementation, detailing how it brings value to model training … Continued

Alluxio Kubernetes Operator Tutorial: Simplifying Deploying and Managing Alluxio Clusters

August 14, 2023 By Shawn Sun, Beinan Wang and Hope Wang

This blog provides a tutorial on using the Kubernetes operator to simplify deploying and managing Alluxio clusters on Kubernetes. The Alluxio Kubernetes operator makes it easier to deploy and manage Alluxio and its datasets on Kubernetes. With the operator, Alluxio clusters can be deployed and managed seamlessly like any other native Kubernetes application. The operator handles … Continued

Alluxio Product School Webinar – Hands-on Lab: Get Started with Alluxio on Kubernetes

April 25, 2023

Shawn Sun, a software engineer at Alluxio, shares how to get started with Alluxio on Kubernetes in April’s Product School Webinar. To simplify DevOps for a stack that pairs Alluxio with a query engine, Alluxio provides two ways to deploy on Kubernetes: Helm and the operator. They significantly simplify the deployment, configuration, and life cycle management of … Continued

Tags: data, k8s, kubernetes, storage

What’s Next for Data Analytics, AI, and Cloud in 2023?

December 27, 2022 By Bin Fan

Originally published on vmblog.com: https://vmblog.com/archive/2022/12/27/alluxio-2023-predictions-what-s-next-for-data-analytics-ai-and-cloud-in-2023.aspx

As we enter 2023, the world of analytics, AI, and cloud is entering an exciting new phase, with a wide range of innovations and developments set to reshape the landscape. Below are some trends that will have the most impact in the coming year. Trend 1: Cloud cost optimization is … Continued

Alluxio on Kubernetes – Powering training through Container Storage Interface plugin

April 28, 2022

Shawn Sun from Alluxio will present the journey of using Alluxio as the storage system for Kubernetes through the Container Storage Interface (CSI) plugin and the Alluxio CSI driver. This talk will cover the challenges we face with the traditional setup for AI/ML training jobs and how the Alluxio CSI driver addresses them. It will also cover a recent change to the driver that made it more robust.

Tags: ai, alluxio day, CSI driver, kubernetes, ml, storage

Accelerate Auto Data Tagging with Alluxio and Spark in Hybrid Cloud – A Practice in WeRide

March 14, 2022 By Feifei Cai and Hao Zhu

This blog shares the practice of using Alluxio and Spark to accelerate the auto data tagging system in WeRide, an autonomous driving technology company.

Thousand-Node Alluxio Cluster Powers Game AI Platform – A Production Case Study from Tencent

January 26, 2022 By Bing Zheng, Baolong Mao and Zhizheng Pan

To provide the best experience for model training, Tencent has implemented a 1000-node Alluxio cluster and designed a scalable, robust, and performant architecture to speed up Ceph storage for game AI training. This blog gives you insight into how Alluxio has been implemented and optimized at Tencent.

What’s New in Alluxio 2.7: Enhanced Scalability, Stability and Major Improvements in AI/ML Training Efficiency

November 16, 2021 By Adit Madan and Hope Wang

With this release, Alluxio has strengthened its position as a de facto data unification and acceleration solution in data analytics and machine learning pipelines. The solution is optimized to support Spark, Presto, TensorFlow, and PyTorch, and is available on multiple cloud platforms such as AWS, GCP, and Azure, as well as on Kubernetes in private data centers or public clouds.

Speeding Up the Atlas Supercomputing Platform with Fluid + Alluxio

November 8, 2021 By Dongdong Lv and Qingsong Liu

Unisound is an artificial intelligence company focusing on Internet of Things services. Unisound’s AI technology stack spans perception and expression capabilities for signals, voice, images, and text, as well as cognitive technologies such as knowledge, understanding, analysis, and decision-making, with the goal of building a multimodal AI system. Atlas is the supercomputing platform that supports all kinds of AI applications, including model training and inference.
