Building a Distributed File System For The Cloud-Native Era

November 7, 2022

Bin Fan

VP of Technology

Alluxio

Big Data Bellevue Meetup

May 19, 2022

Data engineering in modern enterprises has become increasingly complex and resource-intensive, for two main reasons: (1) organizational data is often distributed across data centers, cloud regions, or even cloud providers, and (2) the big data stack has grown rapidly more complex over the past few years, with an explosion of big-data analytics and machine-learning engines (MapReduce, Hive, Spark, Presto, TensorFlow, and PyTorch, to name a few).

To address these challenges, it is critical to provide a single logical namespace that federates different storage services, on-prem or cloud-native, abstracting away data heterogeneity while providing data locality to improve computation performance. Bin Fan will share his observations and lessons learned from designing, architecting, and implementing such a system, the open-source Alluxio project, since 2015.
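As a rough illustration of what a single logical namespace over several storage services can look like, here is a minimal sketch of a mount table that resolves logical paths to underlying storage URIs by longest-prefix match. The `MountTable` class, its methods, and the example URIs are hypothetical and chosen only for illustration; this is not Alluxio's API or implementation.

```python
from typing import Dict


class MountTable:
    """Hypothetical mount table: logical path prefix -> underlying storage URI."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, logical_prefix: str, storage_uri: str) -> None:
        # Normalize so that "/logs/" and "/logs" name the same mount point.
        self._mounts[self._normalize(logical_prefix)] = storage_uri.rstrip("/")

    def resolve(self, logical_path: str) -> str:
        path = self._normalize(logical_path)
        # Try the longest prefixes first so the most specific mount wins.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if prefix == "/":
                remainder = "" if path == "/" else path
            elif path == prefix or path.startswith(prefix + "/"):
                remainder = path[len(prefix):]
            else:
                continue
            return self._mounts[prefix] + remainder
        raise KeyError(f"no mount point covers {logical_path}")

    @staticmethod
    def _normalize(path: str) -> str:
        stripped = path.rstrip("/")
        return stripped if stripped else "/"


# Example URIs are made up for this sketch.
table = MountTable()
table.mount("/", "hdfs://namenode:8020/warehouse")
table.mount("/logs", "s3://example-bucket/logs")
resolved_log = table.resolve("/logs/2022/05/19.json")
resolved_table = table.resolve("/tables/users")
```

In this sketch, applications only ever see paths such as `/logs/...`, while the table decides which storage system actually holds the data.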

Alluxio originated at the UC Berkeley AMPLab (where it was called Tachyon) and was initially proposed as a daemon service that let Spark share RDDs across jobs for performance and fault tolerance. Today, it has become a general-purpose, high-performance, and highly available distributed file system that provides a generic data service and abstracts away complexity in data and I/O. Many companies and organizations, such as Uber, Meta, Tencent, TikTok, and Shopee, use Alluxio in production as a building block of their data platforms to create a data abstraction and access layer. We will talk about the journey of this open-source project, especially its design challenges in tiered metadata storage (based on RocksDB), an embedded replicated state machine (based on Raft) for HA, and the evolution of its RPC framework (based on gRPC).
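To make the idea of tiered metadata storage more concrete, the following is a minimal sketch, under stated assumptions, of a two-tier store: a bounded in-memory tier with least-recently-used eviction in front of a larger backing tier. The `TieredMetadataStore` class is hypothetical, and a plain dictionary stands in for an on-disk store such as RocksDB; the talk's actual design is not described in this abstract and may differ.

```python
from collections import OrderedDict
from typing import Dict, List, Optional


class TieredMetadataStore:
    """Hypothetical two-tier store: small hot tier in memory, large cold tier behind it."""

    def __init__(self, hot_capacity: int) -> None:
        if hot_capacity < 1:
            raise ValueError("hot_capacity must be at least 1")
        self._capacity = hot_capacity
        self._hot = OrderedDict()          # path -> metadata, oldest entry first
        self._cold: Dict[str, dict] = {}   # assumption: stand-in for an on-disk store

    def put(self, path: str, metadata: dict) -> None:
        # Keep each path in exactly one tier: drop any older cold copy.
        self._cold.pop(path, None)
        self._hot[path] = metadata
        self._hot.move_to_end(path)
        self._evict_if_needed()

    def get(self, path: str) -> Optional[dict]:
        if path in self._hot:
            self._hot.move_to_end(path)    # mark as most recently used
            return self._hot[path]
        metadata = self._cold.get(path)
        if metadata is not None:
            self.put(path, metadata)       # promote to the hot tier on access
        return metadata

    def _evict_if_needed(self) -> None:
        # Spill least-recently-used entries to the cold tier when over capacity.
        while len(self._hot) > self._capacity:
            old_path, old_metadata = self._hot.popitem(last=False)
            self._cold[old_path] = old_metadata

    def hot_paths(self) -> List[str]:
        return list(self._hot)

    def cold_paths(self) -> List[str]:
        return sorted(self._cold)


store = TieredMetadataStore(hot_capacity=2)
store.put("/a", {"size": 1})
store.put("/b", {"size": 2})
store.put("/c", {"size": 3})   # "/a" is spilled to the cold tier
found = store.get("/a")        # "/a" is promoted, "/b" is spilled
```

The point of the sketch is the trade-off: frequently accessed metadata stays in memory, while the total metadata size is no longer bounded by memory.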

Meetup Group

Big Data Bellevue: https://www.meetup.com/big-data-bellevue-bdb/

Video:

Presentation Slides:

Building a Distributed File System for the Cloud-Native Era from Alluxio, Inc.

Videos

Optimizing Tiered Storage for Low-Latency Real-Time Analytics at AI Scale

Real-time OLAP databases are optimized for speed and often rely on tightly coupled storage-compute architectures using disks or SSDs. Decoupled architectures, which use cloud object storage, introduce an unavoidable tradeoff: cost efficiency at the expense of performance. This makes them unsuitable for databases that need to provide low-latency, real-time analytics, especially the new wave of LLM-powered dashboards, retrieval-augmented generation (RAG), and vector-embedding searches that thrive only when fresh data is milliseconds away. Can we achieve both cost efficiency and performance?

In this talk, we’ll explore the engineering challenges of extending Apache Pinot—a real-time OLAP system—onto cloud object storage while still maintaining sub-second P99 latencies.

We’ll dive into how we built an abstraction in Apache Pinot to make it agnostic to the location of data. We’ll explain how we can query data directly from the cloud (without needing to download the entire dataset, as with lazy-loading) while achieving sub-second latencies. We’ll cover the data fetch and optimization strategies we implemented, such as pipelining fetch and compute, prefetching, selective block fetches, index pinning, and more. We'll also share our latest work on integrating with open table formats like Iceberg, and how we will continue to achieve fast analytics directly on Parquet files by applying the same techniques we use for tiered storage.

July 15, 2025

Introduction to Apache Iceberg™ & Tableflow

The data lake is a fantastic, low-cost place to put data at rest for offline analytics, but we've built it under the terms of a terrible bargain: all that cheap storage at scale was a great thing, but we gave up schema management and transactions along the way. Apache Iceberg has emerged as king of the Open Table Formats to fix this very problem.

Built on the foundation of Parquet files, Iceberg adds a simple yet flexible metadata layer and integration with standard data catalogs to provide robust schema support and ACID transactions to the once ungoverned data lake. In this talk, we'll build Iceberg up from the basics, see how the read and write path work, and explore how it supports streaming data sources like Apache Kafka™. Then we'll see how Confluent's Tableflow brings Kafka together with open table formats like Iceberg and Delta Lake to make operational data in Kafka topics instantly visible to the data lake without the usual ETL—unifying the operational/analytical divide that has been with us for decades.

July 15, 2025

Meet in the Middle: Solving the Low-Latency Challenge for Agentic AI

Storing data as Parquet files on S3 is increasingly used not just as a data lake but also as a lightweight feature store for ML training/inference or a document store for RAG. However, querying petabyte- to exabyte-scale data lakes directly from cloud object storage remains notoriously slow (e.g., latencies ranging from hundreds of milliseconds to several seconds on AWS S3).

In this talk, we show how architecture co-design, system-level optimizations, and workload-aware engineering can deliver over 1000× performance improvements for these workloads—without changing file formats, rewriting data paths, or provisioning expensive hardware.

We introduce a high-performance, low-latency S3 proxy layer powered by Alluxio, deployed atop hyperscale data lakes. This proxy delivers sub-millisecond Time-to-First-Byte (TTFB)—on par with Amazon S3 Express—while preserving compatibility with standard S3 APIs. In real-world benchmarks, a 50-node Alluxio cluster sustains over 1 million S3 queries per second, offering 50× the throughput of S3 Express for a single account, with no compromise in latency.

Beyond accelerating access to Parquet files byte-to-byte, we also offload partial Parquet processing from query engines into Alluxio via a pluggable interface. This eliminates the need for costly index scans and file parsing, enabling point queries with 0.3 microseconds latency and up to 3,000 QPS per instance (measured using a single thread), a 100× improvement over traditional query paths.

July 15, 2025
