This talk provides an overview of the read-after-write data consistency mechanism in the Alluxio system.
This talk will introduce Apache Iceberg and its place in a modern and open data platform. It will cover the motivation for creating Iceberg at Netflix, as well as the data architecture that Iceberg makes possible.
WeRide provides an overview of its Alluxio + Spark use case, which has been deployed and is running in production to accelerate automatic data tagging in autonomous driving development.
This talk describes the design of shadow cache, a lightweight component that tracks the working set size of the Alluxio cache. Shadow cache dynamically keeps track of the working set size over a past time window, and is implemented as a series of Bloom filters. We’ve deployed the shadow cache in Facebook Presto and use the results to understand system bottlenecks and inform routing design decisions.
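To make the idea of tracking a working set with a series of Bloom filters more concrete, here is a minimal sketch in Python. It is not the Alluxio implementation: the class names (`BloomFilter`, `ShadowCache`), the parameters (`num_slices`, `num_bits`, `num_hashes`), and the choice to count distinct keys rather than bytes are all assumptions made for illustration. Each filter covers one slice of the time window. Advancing the window drops the oldest filter, and the working set is estimated from the number of bits set in the union of the remaining filters using the standard Bloom filter cardinality estimate.

```python
import hashlib
import math
from collections import deque


class BloomFilter:
    """Small Bloom filter backed by a Python int used as a bit array (illustrative)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Double hashing: derive num_hashes bit positions from one SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd stride
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos


class ShadowCache:
    """Hypothetical sliding-window working set tracker built from Bloom filters."""

    def __init__(self, num_slices=4, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # maxlen makes the deque discard the oldest filter automatically.
        self.filters = deque([BloomFilter(num_bits, num_hashes)], maxlen=num_slices)

    def record_access(self, key):
        # Accesses always go into the newest time slice.
        self.filters[-1].add(key)

    def advance(self):
        # Start a new time slice; the oldest one falls out of the window.
        self.filters.append(BloomFilter(self.num_bits, self.num_hashes))

    def estimated_working_set(self):
        # Union all slices, then estimate distinct keys from the set-bit count:
        # n ~= -(m / k) * ln(1 - X / m)
        merged = 0
        for f in self.filters:
            merged |= f.bits
        set_bits = bin(merged).count("1")
        if set_bits >= self.num_bits:
            return float("inf")  # filter saturated; estimate is unbounded
        return -(self.num_bits / self.num_hashes) * math.log(1 - set_bits / self.num_bits)
```

In this sketch, calling `record_access` on every cache read and `advance` on a timer gives an approximate count of distinct keys touched within the last `num_slices` slices. A byte-weighted variant would need extra bookkeeping that is not shown here.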
This talk discusses the opportunities and problems when Uber meets Alluxio. Zhongting from Uber will provide an overview of Uber traffic, cloud, distribution, invalidation, and consistent hashing. Beinan from Alluxio will provide a deep dive of metadata and monitoring metrics.
deep dive into two important areas of active development going forward – table metadata management and caching.
This talk shares the designs and use cases of integrated Alluxio and Spark solutions, as well as best practices and “what not to do” when designing and implementing Alluxio distributed systems.
The Alluxio core engineering team re-designed things to come up with a more efficient and transparent way for users to leverage data orchestration through the POSIX interface. This enables much better performance for ML workloads where data is accessed via the POSIX interface.
ALLUXIO DAY V 2021 August 27, 2021 Speaker: