On-Demand Video

10x Faster Trino Queries on Your Data Platform

As Trino users increasingly rely on cloud object storage, query speed and cloud cost have become major challenges. Separating compute from storage adds latency to queries, because scanning data across the storage and compute tiers becomes I/O bound. At the same time, cloud API costs for GET/LIST operations and cross-region data transfer add up quickly.

The newly introduced Trino file system cache by Alluxio aims to overcome these challenges. In this session, Jianjian will dive into Trino data caching strategies, share the latest test results, and discuss the multi-level caching architecture. This architecture makes Trino 10x faster for data lakes of any scale, from GB to EB.

What you will learn:

  • Challenges relating to the speed and costs of running Trino in the cloud
  • An overview of the new Trino file system cache feature, including its latest development status and test results
  • A multi-level cache framework for maximized speed, including Trino file system cache and Alluxio distributed cache
  • Real-world cases, including a large online payment firm and a top ridesharing company
  • The future roadmap of Trino file system cache and Trino-Alluxio integration

Jianjian Xie is a Staff Software Engineer at Alluxio and an open-source contributor to Alluxio and Trino. He currently focuses on distributed cache systems for interactive data analytics and machine learning. Before joining Alluxio, he was a research engineer at Indeed.