“Zero-Copy” Hybrid Bursting With No App Changes
FEATURED USE CASE
Is your compute capacity limited? Bursting to the cloud using data on-prem can bring the compute flexibility you need. Intelligently burst processing to cloud data services like EMR and Dataproc with Alluxio Data Orchestration.
What is zero-copy burst?
Hybrid cloud means using compute resources that are not local to your data. Your data may sit in large on-prem data lakes or silos while you want to use compute capacity in the cloud. Zero-copy burst lets you burst to the cloud by bringing that remote data closer to cloud compute, without maintaining a separate copy, for these benefits:
Time to Production
Expand your cloud footprint with significantly lower lag
Reduce overload on existing infrastructure by moving ephemeral workloads to the cloud
One step closer to the cloud
Use zero-copy burst as the intermediate step before migrating fully to the cloud
Spending too much time maintaining data copies?
Bursting your on-prem workloads to the cloud can mean slow performance, duplicate data to manage, and application changes:
Using S3 via HDFS leads to low performance due to network latency
Copying data via DistCp from on-prem to cloud means maintaining duplicate data
Using other storage systems like S3 means expensive application changes
Intelligently burst processing to the cloud with Alluxio
Alluxio’s data orchestration platform leaves your data on-prem and intelligently bursts processing to cloud data services like EMR and Dataproc.
Intelligently burst HDFS workloads to the cloud
Deploy Alluxio + compute on-prem and S3 in the cloud
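To make the "data stays on-prem" idea concrete, here is a minimal sketch of how a logical path seen by a cloud application could be resolved to the on-prem location that actually holds the data. It is an illustration of the concept only, not Alluxio's implementation or API: the `Mount` class, the `resolve` function, the host name `onprem-namenode`, and all paths are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Mount:
    """Hypothetical mount entry: a logical prefix backed by a remote store."""
    logical_prefix: str  # path the cloud application sees
    backing_uri: str     # where the data actually lives (it stays on-prem)


def resolve(path: str, mounts: list[Mount]) -> str:
    """Translate a logical path to its backing URI using the longest matching prefix."""
    best = None
    for m in mounts:
        prefix = m.logical_prefix.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best.logical_prefix.rstrip("/")):
                best = m
    if best is None:
        raise KeyError(f"no mount covers {path}")
    suffix = path[len(best.logical_prefix.rstrip("/")):]
    return best.backing_uri.rstrip("/") + suffix


# Assumed example values; the host name and paths are placeholders.
MOUNTS = [Mount("/warehouse", "hdfs://onprem-namenode:8020/data/warehouse")]
resolved = resolve("/warehouse/sales/2020.parquet", MOUNTS)
print(resolved)
```

The application only ever names the logical path; the mapping to the on-prem store lives in one place, which is why the application itself does not need to change.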
Want help getting started on zero-copy hybrid bursting? Schedule a meeting with one of our solution engineers.
Configuring Alluxio + HDFS in the public cloud
You can zero-copy burst your workloads to AWS, GCP, and Azure with Alluxio. Because Alluxio brings the data to the analytics and machine learning applications, they perform as if the data were co-located in the cloud. In addition, computation is offloaded from the on-prem data stores, which minimizes the additional I/O overhead on them.
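One way to picture why repeated reads can behave like local reads is a read-through cache placed next to the compute. The sketch below is a simplified model under stated assumptions, not Alluxio's actual caching logic: `RemoteStore` stands in for the on-prem data store, `ReadThroughCache` for the layer running in the cloud, and the file path and contents are made up.

```python
class RemoteStore:
    """Stand-in for an on-prem store reached over a slow link (hypothetical)."""

    def __init__(self, files):
        self._files = dict(files)
        self.reads = 0  # counts how often the remote store is actually hit

    def read(self, path):
        self.reads += 1
        return self._files[path]


class ReadThroughCache:
    """Serves reads from a local copy, fetching from the remote store only on a miss."""

    def __init__(self, remote):
        self._remote = remote
        self._cache = {}

    def read(self, path):
        if path not in self._cache:
            # Cache miss: one trip to the remote store, then keep the bytes nearby.
            self._cache[path] = self._remote.read(path)
        return self._cache[path]


remote = RemoteStore({"/warehouse/sales.csv": b"id,amount\n1,10\n"})
cache = ReadThroughCache(remote)

first = cache.read("/warehouse/sales.csv")   # fetched from the remote store
second = cache.read("/warehouse/sales.csv")  # served from the cache
```

In this model only the first read crosses the network; later reads of the same file never touch the on-prem store, which is the I/O offloading effect described above.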