hdfs Archives | Alluxio

“Data Access as a Service” at Shopee: Using Alluxio to Accelerate Interactive Queries and Enhance Developer Experience with Flexible APIs

January 30, 2023 By Tianbao Ding (Shopee) and Haoning Sun (Shopee)

Shopee is the leading e-commerce platform in Southeast Asia. In this blog, Tianbao Ding and Haoning Sun from Shopee’s data infrastructure team share their project on query acceleration and “Data Access as a Service.” They describe how Shopee leverages Alluxio to improve Trino query performance by ~55% and how Alluxio enhances developer experience by providing … Continued

Achieving Hybrid and Multi-Cloud Architecture With Application Portability

October 14, 2022 By Fortune 50 Technology Company

A Fortune 50 technology company that serves over 1 billion users successfully implemented Alluxio to achieve a hybrid cloud strategy, become multi-cloud ready, cut costs, and boost agility. This case study highlights this tech giant’s cloud journey to modernize its data platform and the challenges it faces, and why this company chose Alluxio to achieve hybrid … Continued

Tags: architecture, case study, hdfs, hybrid cloud, multi-cloud, s3

Achieving Hybrid and Multi-Cloud Architecture With Application Portability

August 11, 2022 By Fortune 50 Technology Company

A Fortune 50 technology company has successfully implemented Alluxio to achieve a hybrid-cloud strategy, become multi-cloud ready, cut costs, and boost agility.

Tags: architecture, case study, hdfs, hybrid cloud, multi-cloud, s3

Alluxio and Apache Ranger Best Practices

May 26, 2022

As data stewards and security teams provide broader access to their organization’s data lake environments, having a centralized way to manage fine-grained access policies becomes increasingly important. Learn how Alluxio can use Apache Ranger’s centralized access policies in two ways.

Tags: apache ranger, hdfs, product school

Speed Up Uber’s Presto with Alluxio | A collaboration between Uber and Alluxio – part 1

May 24, 2022 By Chen Liang and Beinan Wang

This article shares how Uber and Alluxio collaborated to design and implement a Presto local cache to reduce HDFS latency.

What’s New in Alluxio 2.8: Enhanced S3 API Functionality, Enterprise-grade Security and Data Migration With Better Usability and Low Cost

May 4, 2022 By Adit Madan and Hope Wang

Alluxio 2.8 focuses on the S3 API, enterprise-grade security, and scalability and observability in data migration. The enhanced S3 API makes managing Alluxio easier than ever. Features such as encryption at rest and policy-driven data management further strengthen Alluxio’s support for enterprise customers.

Building High-Performance Data Lake Using Apache Hudi and Alluxio at T3Go

November 20, 2020 By Trevor Zhang (T3Go), Vino Yang (T3Go), Jasmine Wang and Bin Fan

How T3Go’s high-performance data lake, built on Apache Hudi and Alluxio, shortened the time for data ingestion into the lake by up to a factor of 2. Data analysts using Presto, Hudi, and Alluxio together to query data on the lake saw queries speed up by a factor of 10.

Data Consistency Model in Alluxio

October 30, 2020 By Baolong Mao, Jasmine Wang and Bin Fan

When applications are only reading and writing through Alluxio, the Alluxio file system provides strong consistency. However, when clients are writing data across both Alluxio and under storage, the consistency depends on the Alluxio write type and under storage type. This article discusses what to expect in each scenario.

Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-prem storage

September 29, 2020

In this talk, we describe an architecture for migrating analytics workloads incrementally to any public cloud (AWS, Google Cloud Platform, or Microsoft Azure) while operating directly on on-prem data, without copying the data to cloud storage.

Tags: cloud, data analytics, data lake, hdfs, hybrid, on-prem, presto, spark, storage
