Apache Hudi: The Path Forward
A deep dive into two important areas of active development going forward – table metadata management and caching.
Tags: alluxio day, apache hudi, caching, data lake, metadata management
This talk shares the designs and use cases of integrated Alluxio and Spark solutions, as well as best practices and “what not to do” when designing and implementing Alluxio distributed systems.
Tags: alluxio day, big data, data orchestration, distributed systems, spark
Join us for our 6th Alluxio Day community virtual event featuring speakers from
Join us for our 4th Alluxio Day community virtual event featuring speakers from Facebook, Princeton, Apache Hudi, Zendesk, and Uber.
ALLUXIO DAY V 2021 August 27, 2021 Speaker: Peijie Zhou
Tags: alluxio day, boss
ALLUXIO DAY V 2021 August 27, 2021 Speakers: Binyang Li (Software Engineer from Bing, focus on AI infrastructure) Qianxi Zhang (Research Software Engineer from MSRA, focus on next generation storage system)
Tags: alluxio day, microsoft
ALLUXIO DAY V 2021 August 27, 2021 Speaker: Lu Qiu. Lu has been involved in open source software for many years and is currently a software engineer at Alluxio. Lu develops easier ways to integrate Alluxio in public cloud environments. Lu is mainly responsible for leader election, journal management, metrics management, and big data preparation for …
Tags: alluxio day, data orchestration
ALLUXIO DAY V 2021 August 27, 2021 Speaker: Yunlong Kong
Tags: alluxio day, momo
Alluxio has an excellent metrics system and supports various kinds of metrics sinks, e.g. an embedded JSON sink and the Prometheus sink. Users and developers can easily create a custom sink for Alluxio by implementing the Sink interface.
Tags: alluxio day, grafana, metrics, prometheus, tencent
Nowadays it is not straightforward to integrate Alluxio with popular query engines like Presto on existing Hive data. Solutions proposed by the community, like Alluxio Catalog Service or Transparent URI, bring unnecessary pressure on Alluxio masters when querying files that should not be cached.
Tags: alluxio day, cache layer, hive, presto, tiktok