data orchestration Archives

From Cache to Cash: Introducing NFT for Data Orchestration

April 1, 2022 By Bin Fan and Hope Wang

Today, we are excited to announce the launch of non-fungible tokens (NFTs) as a new feature in our leading data orchestration platform.

Alluxio Day 10

Community Virtual Event * March 3, 2022

Join us for the 10th Alluxio Day virtual community event featuring speakers from Uber, BiliBili, and Alluxio.

Alluxio and Apache Ranger Best Practices

February 2, 2022 By Greg Palmer

As data stewards and security teams provide broader access to their organization’s data lake environments, having a centralized way to manage fine-grained access policies becomes increasingly important. Alluxio can use Apache Ranger’s centralized access policies in two ways: 1) directly controlling access to virtual paths in the Alluxio virtual file system or 2) enforcing existing access policies for the HDFS under stores.

Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds

January 27, 2022

Data platform teams are increasingly challenged with accessing multiple data stores that are separated from compute engines, such as Spark, Presto, TensorFlow, or PyTorch. Whether your data is distributed across multiple datacenters, multiple clouds, or both, a successful heterogeneous data platform requires efficient data access. Alluxio enables you to embrace the separation of storage from compute and use Alluxio data orchestration to simplify adoption of the data lake and data mesh paradigms for analytics and AI/ML workloads.

Tags: ai, analytics, cloud, compute, data orchestration, data platform, data stores, ml, storage

A Year with Alluxio Community 2021

January 20, 2022 By Bin Fan and Jasmine Wang

2021 marked accelerated growth for the Alluxio Open Source Project. We could not be more grateful for what the community has achieved together in this past year. This blog post gives a glimpse of our community's growth over the year.

Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds

Alluxio Product School * January 27, 2022

Whether your data is distributed across multiple datacenters, multiple clouds, or both, a successful heterogeneous data platform requires efficient data access. Alluxio enables you to embrace the separation of storage from compute and use Alluxio data orchestration to simplify adoption of the data lake and data mesh paradigms for analytics and AI/ML workloads.

Accelerating Machine Learning / Deep Learning in the Cloud: Architecture and Benchmark

December 7, 2021

This whitepaper introduces how to speed up end-to-end distributed training in the cloud using Alluxio to accelerate data access. With the help of Alluxio, loading data from cloud storage, training, and caching data can be done in a transparent and distributed way as part of the training process. This whitepaper also demonstrates how to set up and benchmark the end-to-end performance of the training process, along with a comparison with other popular approaches.

Tags: benchmark, cache, cloud, data orchestration, deep learning, distributed training, machine learning, performance, storage
