aws Archives | Alluxio

Unify Data Lakes Across Multiple Geographic Regions in the Cloud

October 14, 2022 by Expedia Group

Expedia Group has implemented Alluxio to federate cross-region data lakes in AWS. Alluxio unifies geo-distributed data silos without replication, delivering consistent, high performance at roughly 50% lower cost. This case study highlights: Expedia’s modernized data platform with a central data lake and the challenges it faces; why the company chose Alluxio to unify cross-region data lake … Continued

Tags: aws, data lake, data mesh, data silos, Trino

Unifying Cross-region Access in the Cloud at Expedia Group — The Path Toward Data Mesh in the Brand World

July 29, 2022 By Jian Li (Senior Software Engineer @ Expedia Group)

This article describes how the data platform at Expedia federates data lakes spanning multiple geographic regions in the cloud.

1. Background

Expedia Group (NASDAQ: EXPE) is an American online travel shopping company for consumer and small business travel. Expedia powers travel for everyone, everywhere through our global platform, with industry-leading technology solutions to … Continued

Integrating Open Source Alluxio in AWS EKS with Terraform

April 1, 2021

This presentation covers best practices and techniques for setting up a cluster with open source Alluxio on AWS EKS for one of our clients, and for making it scalable, reliable, and secure by adapting it to Kubernetes RBAC.

Tags: aws, data orchestration, eks, kubernetes, meetup, terraform

Integrating Open Source Alluxio in AWS EKS with Terraform

Global Online Meetup * March 30, 2021

Building a high-performance platform on AWS to support real-time gaming services using Presto, Alluxio, and S3

December 13, 2020

Electronic Arts (EA) is a leading company in the gaming industry, providing over a thousand games to serve billions of users worldwide. The EA Data & AI Department builds hundreds of platforms to manage petabytes of data generated by games and users every day. These platforms cover a wide range of data analytics workloads, from real-time data ingestion to ETL pipelines. Formatted data produced by our department is widely used by executives, producers, product managers, game engineers, and designers for marketing and monetization, game design, customer engagement, player retention, and end-user experience.

Tags: aws, data orchestration, data orchestration summit, electronic arts, presto, s3

Building a high-performance platform on AWS to support real-time gaming services using Presto and Alluxio

August 4, 2020 By Teng Wang (Electronic Arts), Du Li (Electronic Arts), Yu Jin (Electronic Arts) and Sundeep Narravula (Electronic Arts)

This blog explores an innovative platform with Presto as the computing engine and Alluxio as a data orchestration layer between Presto and S3 storage, to support online services that require instantaneous response in the gaming industry. The preliminary results show that Presto with Alluxio significantly outperforms S3 in all cases. Alluxio with metadata caching shows up to a 5.9x performance gain when handling large numbers of small files.

Bursting Spark or Presto Jobs to AWS using Alluxio

June 23, 2020

In this office hour, we demonstrate how a “zero-copy burst” solution helps to speed up Spark and Presto queries in the public cloud while eliminating the process of manually copying and synchronizing data from the on-premise data lake to cloud storage. This approach allows compute frameworks to decouple from on-premise data sources and scale efficiently by leveraging Alluxio and public cloud resources such as AWS.

Tags: aws, cloud storage, compute, hdfs, hybrid cloud, office hour, performance, presto, spark, zero copy bursting

Bursting Spark or Presto Jobs to AWS using Alluxio

Community Online Office Hour * June 23, 2020

Tag: aws