From Cache to Cash: Introducing NFT for Data Orchestration
Today, we are excited to announce the launch of Non-fungible token (NFT) as a new feature in our leading data orchestration platform.
Today, we are excited to announce the launch of Non-fungible token (NFT) as a new feature in our leading data orchestration platform.
With the collaboration between Meta (Facebook), Princeton University, and Alluxio, we have developed “Shadow Cache” – a lightweight Alluxio component to track the working set size and infinite cache hit ratio. Shadow cache can keep track of the working set size over the past window dynamically and is implemented by a series of bloom filters. Shadow cache is deployed in Meta (Facebook) Presto and is being leveraged to understand the system bottleneck and help with routing design decisions.
This whitepaper introduces how to speed up end-to-end distributed training in the cloud using Alluxio to accelerate data access. With the help of Alluxio, loading data from cloud storage, training and caching data can be done in a transparent and distributed way as a part of the training process. This whitepaper also demonstrates how to set up and benchmark the end-to-end performance of the training process, along with a comparison of other popular approaches.
Tags: benchmark, cache, cloud, data orchestration, deep learning, distributed training, machine learning, performance, storage
This talk describes the design of shadow cache, a lightweight component to track the working set size of Alluxio cache. Shadow cache can keep track of the working set size over the past window dynamically, and is implemented by a series of bloom filters. We’ve deployed the shadow cache in Facebook Presto and leverage the result to understand the system bottleneck and help with routing design decisions.
Tags: alluxio day, architecture, cache, facebook, presto, shadow cache
For many latency-sensitive SQL workloads, Presto is often bound by retrieving distant data. In this talk, Rohit Jain from Facebook will introduce their teams’ collaboration with Alluxio on adding a local on-SSD Alluxio cache inside Presto workers at Facebook to improve queries with unsatisfied latency.
Tags: cache, data orchestration, data orchestration summit, facebook, presto, sql workloads
In this blog, Derek Tan, Executive Director of Infra & Simulation at WeRide, describes how engineers leverage Alluxio as a hybrid cloud data gateway for applications on-premises to access public cloud storage like AWS S3.