Real-Time Analytics: Going Beyond Stream Processing With Apache Pinot

September 15, 2022

Karin Wolok

Head of Developer Community

StarTree

ALLUXIO DAY XV 2022

Streaming systems form the backbone of the modern data pipeline, because stream processing provides insights on events as they arrive. But what if we want to go further and run analytical queries on this real-time data? That’s where Apache Pinot comes in.

OLAP databases used for analytical workloads traditionally ran queries on yesterday’s data, with query latencies in the tens of seconds. The emergence of real-time analytics has changed this: the expectation now is that we can run thousands of queries per second on fresh data, with latencies typically seen on OLTP databases.

Apache Pinot is a real-time distributed OLAP datastore used to deliver scalable, low-latency real-time analytics. It can ingest data from streaming sources such as Kafka as well as from batch data sources (S3, HDFS, Azure Data Lake, Google Cloud Storage), and it provides a range of indexing techniques that can be used to maximize query performance.
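
To give a rough feel for why indexing matters for filter-heavy analytical queries, here is a toy sketch of an inverted index in Python. It is only an illustration of the general idea, not Pinot's implementation or API: the sample rows, the column names (`country`, `clicks`), and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical event rows, standing in for records ingested from a stream.
rows = [
    {"country": "US", "clicks": 3},
    {"country": "DE", "clicks": 5},
    {"country": "US", "clicks": 7},
    {"country": "FR", "clicks": 2},
]


def build_inverted_index(rows, column):
    """Map each distinct value of `column` to the list of row ids holding it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[column]].append(row_id)
    return index


def sum_with_scan(rows, column, value, metric):
    """Baseline: test every row against the filter."""
    return sum(r[metric] for r in rows if r[column] == value)


def sum_with_index(rows, index, value, metric):
    """Indexed: visit only the rows the index lists for `value`."""
    return sum(rows[i][metric] for i in index.get(value, []))


country_index = build_inverted_index(rows, "country")
us_clicks = sum_with_index(rows, country_index, "US", "clicks")
```

Both functions return the same total, but the indexed version touches only the matching rows, which is the kind of saving that matters when a table holds billions of rows and must answer thousands of queries per second.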

Come to this talk to learn how you can add real-time analytics capability to your data pipeline.

Presentation Slides:

Real-Time Analytics: Going Beyond Stream Processing With Apache Pinot from Alluxio, Inc.
