Alluxio - Blog

Speed up Large-Scale ML/DL Offline Inference Jobs with Alluxio at Microsoft Bing

Machine Learning Model Training with Alluxio: Part 1 - Solution Overview

In this blog, we provide an overview of Alluxio's AI/ML model training solution. For more details about the reference architecture and benchmarking results, please refer to the full-length whitepaper.

Top Data Predictions for 2022

As more organizations advance their data revolution strategy and run more diverse workloads on a wider variety of platforms across clouds and hybrid clouds, 2022 will see even more advances in AI, machine learning, and analytic workloads, and in the technologies and services that support them.

Metadata Synchronization in Alluxio: Design, Implementation, and Optimization

Metadata synchronization (sync) is a core feature in Alluxio that keeps files and directories consistent with their source of truth in under storage systems, making it simple for users to reason about the data retrieved from Alluxio. Understanding the internal process is also important for tuning performance. This article describes the design and implementation of metadata synchronization in Alluxio.

What's New in Alluxio 2.7: Enhanced Scalability, Stability, and Major Improvements in AI/ML Training Efficiency

Presto with Alluxio Overview: Architecture Evolution for Interactive Queries

Alluxio is the data orchestration platform to unify data silos across heterogeneous environments. The following blog will discuss the architecture combining Spark with Alluxio.

Speeding Up the Atlas Supercomputing Platform with Fluid and Alluxio

Unisound is an artificial intelligence company focusing on Internet of Things services. Unisound’s AI technology stack includes perception and expression capabilities for signals, voices, images, and text, as well as cognitive technologies such as knowledge, understanding, analysis, and decision-making, working towards a multi-modal AI system. Atlas is the supercomputing platform supporting all kinds of AI applications, including model training and inference.

Alluxio Use Cases Overview: Unify Silos with Data Orchestration

This blog is the first in a series introducing Alluxio as the data platform to unify data silos across heterogeneous environments. The next blog will include insights from PrestoDB committer Beinan Wang to uncover the value for analytics use cases, specifically with PrestoDB as the compute engine.

What's New in Alluxio 2.6: Better Performance for AI/ML Workloads plus Increased Operating Metrics Visibility

Alluxio 2.6 significantly improves the performance of data-intensive AI/ML workloads across any storage, and also improves the general maintainability and visibility of Alluxio clusters, especially for large-scale deployments. We have taken the feedback and contributions from the community and introduced features that simplify deployment, add new data management capabilities, optimize performance, and provide enhanced visibility into system behavior.

What's New in Alluxio 2.5

Alluxio 2.5 focuses on improving interface support to broaden the set of data-driven applications that can benefit from data orchestration. The POSIX and S3 client interfaces have greatly improved in performance and functionality as a result of widespread usage and demand from AI/ML workloads and system administration needs. Alluxio is rapidly evolving to meet the needs of enterprises that are deploying it as a key component of their AI/ML stacks.

Accelerating Analytics and AI with Alluxio and NVIDIA GPUs

Data processing is increasingly making use of NVIDIA computing for massive parallelism. Advancements in accelerated compute mean that access to storage must also be quicker, whether in analytics, artificial intelligence (AI), or machine learning (ML) pipelines.

Bursting Your On-Premises Data Lake Analytics and AI Workloads on AWS

This post outlines a solution for building a hybrid data lake with Alluxio to leverage analytics and AI on Amazon Web Services (AWS) alongside a multi-petabyte on-premises data lake. Alluxio’s solution is called “zero-copy” hybrid cloud, indicating a cloud migration approach without first copying data to Amazon Simple Storage Service (Amazon S3).
