Introduction to Apache Iceberg™ & Tableflow
July 15, 2025

The data lake is a fantastic, low-cost place to put data at rest for offline analytics, but we built it under the terms of a terrible bargain: in exchange for cheap storage at scale, we gave up schema management and transactions. Apache Iceberg has emerged as king of the open table formats to fix this very problem.
Built on the foundation of Parquet files, Iceberg adds a simple yet flexible metadata layer and integration with standard data catalogs to bring robust schema support and ACID transactions to the once-ungoverned data lake. In this talk, we'll build Iceberg up from the basics, see how the read and write paths work, and explore how it supports streaming data sources like Apache Kafka™. Then we'll see how Confluent's Tableflow brings Kafka together with open table formats like Iceberg and Delta Lake to make operational data in Kafka topics instantly visible to the data lake without the usual ETL, bridging the operational/analytical divide that has been with us for decades.

Videos
AI/ML Infra Meetup | AI at scale: Architecting Scalable, Deployable and Resilient Infrastructure

Pratik Mishra shared insights on architecting scalable, deployable, and resilient AI infrastructure. His discussion of fault tolerance, checkpoint optimization, and the democratization of AI compute through AMD's open ecosystem spoke directly to the challenges teams face in production ML deployments.
September 30, 2025
AI/ML Infra Meetup | Alluxio + S3: A Tiered Architecture for Latency-Critical, Semantically-Rich Workloads

In this talk, Bin Fan, VP of Technology at Alluxio, presents on building tiered architectures that bring sub-millisecond latency to S3-based workloads. The comparison showing Alluxio's 45x performance improvement over S3 Standard and 5x over S3 Express One Zone demonstrated the critical role the performance and caching layer plays in modern AI infrastructure.
September 30, 2025
AI/ML Infra Meetup | Achieving Double-Digit Millisecond Offline Feature Stores with Alluxio

In this talk, Greg Lindstrom shared how Blackout Power Trading achieved double-digit millisecond offline feature store performance using Alluxio, a game-changer for real-time power trading, where every millisecond counts. The 60x latency reduction for inference queries was particularly impressive.
September 30, 2025