This blog post discusses the synergy between Trino and Alluxio, and how to deploy Alluxio as the caching layer for Trino. You will learn
- Why should you choose Alluxio as a cache for Trino
- How do Trino and Alluxio work together
- How to configure Alluxio to point to S3 storage like MinIO
- How to query Alluxio with query write-through from Trino
Trino is an open-source distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources. Trino was designed to handle data warehousing, ETL, and interactive analytics by large amounts of data and producing reports. These workloads are often classified as Online Analytical Processing (OLAP). Alluxio is an open-source data orchestration platform for large-scale analytics and AI. Alluxio sits between compute frameworks such as Trino and Apache Spark and various storage systems like Amazon S3, Google Cloud Storage, HDFS, and MinIO. Alluxio provides distributed caching, unified namespace and tenant isolation, simplifying data access across data centers, regions, or clouds.
This blog post discusses the synergy between Trino and Alluxio, and how to deploy Alluxio as the caching layer for Trino.
1.1 Why Do We Need Caching for Trino
A small fraction of the petabytes of data you store is generating business value at any given time. Repeatedly scanning the same data and transferring it over the network consumes time, compute cycles, and resources. This issue is compounded when pulling data from disparate Trino clusters across regions or clouds. In these circumstances, caching solutions can significantly reduce the latency and cost of your queries.
1.2 The Limitation of Built-in Object File Caching in Trino
Trino has a built-in caching engine, Rubix, in its Hive connector. While this system is convenient as it comes with Trino, it is limited to the Hive connector and has not been maintained since 2020. It also lacks security features and support for additional compute engines.
1.3 Trino on Alluxio
Alluxio connects Trino to various storage systems, providing APIs and a unified namespace for data-driven applications. Alluxio allows Trino to access data regardless of the data source and transparently cache frequently accessed data (e.g., tables commonly used) into Alluxio distributed storage.
Alluxio provides a number of benefits as a distributed cache for Trino. Some of these benefits include:
- Sharing cache data between Trino workers. By deploying a standalone Alluxio cluster, cache data can be shared among all the Trino workers, regardless of whether they belong to different Trino clusters. This allows for efficient use of cached capacity.
- Sharing cache data across Trino clusters and other compute engines. Having a shared cache will speed up the read and write operations will be faster when using Spark and Trino for ETL processing and SQL queries. A common use case is when the data written by Spark is read by Trino, utilizing the shared Alluxio cache can speed up these operations.
- Resilient caching. Scaling down Trino without losing cached data in Alluxio can be achieved by utilizing Alluxio’s data replication mechanism. This allows customers to use spot instances for Trino, potentially reducing costs, while ensuring that cached data is always available and the risk of data loss is minimized, even in case of failures.
- Reduce I/O access latency. By co-locating Alluxio workers with Trino workers, data locality is improved and I/O access latency is reduced when storage systems are remote, or the network is slow or congested. This is because the data is stored and processed within the same physical location, allowing for faster and more efficient access to the cached data.
- Unified data management. Alluxio’s unified API and global namespace can simplify data management for Trino workloads by providing a single interface to access data stored in different systems, regions, or clouds. This can reduce complexity and improve data accessibility, enabling Trino to access data stored in various locations as if it were stored in a single place.
- Metadata consistency. Alluxio utilizes a metadata-sync mechanism to keep the data stored in cache up to date, ensuring that the latest data is always easily accessible. This feature eliminates the need to manually update the cache and enables Trino to access the most current data, improving the performance and accuracy of the workloads.
Overall, Alluxio provides a comprehensive data caching solution for Trino workloads.
2. Real-world Use Cases
2.1 Unifying Cross-region Data Lake at Expedia
Expedia needed to have the ability to query cross-region S3 while simplifying the interface to their local data sources. They implemented Alluxio to federate cross-region data lakes in AWS. Alluxio unifies geo-distributed data silos without replication, enabling consistent and high performance with ~50% reduced costs.
Read more here: Unify Data Lakes Across Multi-Regions in the Cloud.
2.2 Using Alluxio to Power Analytics Queries at Razorpay
Razorpay is a large fintech company in India. Razorpay provides a payment solution that offers a fast, affordable, and secure way to accept and disburse payments online. On the engineering side, the availability and scalability of analytics infrastructure are crucial to providing seamless experiences to internal and external users. Razorpay’s Trino + Alluxio clusters handle 650 daily active users and 100k queries/day with 60 sec P90 and 130 sec P95 query latencies and a 97% success rate.
Check out their blog post here: How Trino and Alluxio Power Analytics at Razorpay.
3. Running Trino on Alluxio Step by Step
We’ve created a demo that demonstrates how to configure Alluxio to use write-through caching with MinIO. This is achieved by using the Iceberg connector and making a single change to the location property on the table from the Trino perspective.
In this demo, Alluxio is run on separate servers, although it’s recommended to run it on the same nodes as Trino. This means that all the configurations for Alluxio will be located on the servers where Alluxio runs, while Trino’s configuration remains unaffected. The advantage of running Alluxio externally is that it won’t compete for resources with Trino, but the disadvantage is that data will need to be transferred over the network when reading from Alluxio. It is crucial for performance that Trino and Alluxio are on the same network.
Watch a full video demonstration of this tutorial below:
To follow this demo, copy the code located in the trino-getting-started repo.
Trino is configured identically to a standard Iceberg configuration. Since Alluxio is running external to Trino, the only configuration needed is at query time and not at startup.
The configuration for Alluxio can all be set using the
alluxio-site.properties file. To keep all configurations colocated on the
docker-compose.yml, we are setting them using Java properties via the
ALLUXIO_JAVA_OPTS environment variable. This tutorial also refers to the master node as the leader and the workers as followers.
The leader exposes ports
19999, the latter being the port for the web UI.
The follower exposes ports
30000, and sets up a shared memory used by Alluxio to store data. This is set to
1G via the
shm_size property and is referenced from the
alluxio.master.hostname=alluxio-leader # Minio configs alluxio.underfs.s3.endpoint=http://minio:9000 alluxio.underfs.s3.disable.dns.buckets=true alluxio.underfs.s3.inherit.acl=false aws.accessKeyId=minio aws.secretKey=minio123 # Demo-only configs alluxio.security.authorization.permission.enabled=false
alluxio.master.hostname needs to be on all nodes, leaders and followers. The majority of shared configs points Alluxio to the
underfs which is MinIO in this case.
alluxio.security.authorization.permission.enabled is set to false to keep the docker setup simple, this is not recommended to do in a production or CI/CD environment.
First, you want to start the services. Make sure that you are in the
trino-getting-started/iceberg/trino-alluxio-iceberg-minio directory. Now run the following command:
docker-compose up -d
You should expect to see the following output. Docker may also have to download the Docker images before you see the “Created/Started” messages, so there could be extra output:
Once this is complete, you can log into the Trino coordinator node. We will do this by using the
exec command and run the
trino CLI executable as the command we run on that container. Notice the container id is
trino-alluxio-iceberg-minio-trino-coordinator-1 so the command you will run is:
docker container exec -it trino-alluxio-iceberg-minio-trino-coordinator-1 trino
When you start this step, you should see the
trino cursor once the startup is complete. It should look like this when it is done:
To best understand how this configuration works, let’s create an Iceberg table using a CTAS (CREATE TABLE AS) query that pushes data from one of the TPC connectors into Iceberg that points to MinIO. The TPC connectors generate data on the fly so that we can run simple tests like this.
First, run a command to show the catalogs to see the
iceberg catalogs since these are what we will use in the CTAS query.
You should see that the iceberg catalog is registered.
Upon startup, the following command is executed on an intiailization container that includes the
mc CLI for MinIO. This creates a bucket in MinIO called
/alluxio which gives us a location to write our data to and we can tell Trino where to find it.
/bin/sh -c " until (/usr/bin/mc config host add minio http://minio:9000 minio minio123) do echo '...waiting...' && sleep 1; done; /usr/bin/mc rm -r --force minio/alluxio; /usr/bin/mc mb minio/alluxio; /usr/bin/mc policy set public minio/alluxio; exit 0; "
Note: This bucket will act as the mount point for Alluxio. So the schema directory
alluxio://lakehouse/ in Alluxio will map to
Let’s move to creating our
SCHEMA that points us to the bucket in MinIO and then run our CTAS query. Back in the terminal create the
SCHEMA. This will be the first call to the metastore to save the location of the schema location in the Alluxio namespace. Notice, we will need to specify the hostname
alluxio-leader and port
19998 since we did not set Alluxio as the default filesystem. Take this into consideration if you want Alluxio caching to be the default usage and transparent to users managing DDL statements.
CREATE SCHEMA iceberg.lakehouse WITH (location = 'alluxio://alluxio-leader:19998/lakehouse/');
Now that we have a SCHEMA that references the bucket where we store our tables in Alluxio which syncs to MinIO, we now can create our first table.
Optional: To view your queries run, log into the Trino UI and log in using any username (it doesn’t matter since no security is set up).
Move the customer data from the tiny generated tpch data into MinIO uing a CTAS query. Run the following query and if you like, watch it running on the Trino UI:
CREATE TABLE iceberg.lakehouse.customer WITH ( format = 'ORC', location = 'alluxio://alluxio-leader:19998/lakehouse/customer/' ) AS SELECT * FROM tpch.tiny.customer;
Go to the Alluxio UI and the MinIO UI, and browse the Alluxio and MinIO files and you will now see a
lakehouse directory that contains a
customer directory that contains the data written by Trino to Alluxio and Alluxio writing it to MinIO.
Now there is a table under Alluxio and MinIO, you can query this data by checking the following.
SELECT * FROM iceberg.lakehouse.customer LIMIT 10;
How are we sure that Trino is actually reading from Alluxio and not MinIO? Let’s delete the data in MinIO and run the query again just to be sure. Once you delete this data, you should still see data return.
Once you complete this tutorial, the resources used for this excercise can be released by runnning the following command:
4. Learn More
To learn more about Alluxio, join 10k+ members in the Alluxio community slack channel to ask any questions and provide your feedback: https://alluxio.io/slack.
To learn more about the Trino community, go to https://trino.io/community.html.