Author: Jasmine Wang at Alluxio

Hopping into the Year of Rabbit with Alluxio Community

January 26, 2023 By Bin Fan, Jasmine Wang, Hope Wang and Chanchan Mao

As we close out the Year of Tiger and welcome the Year of Rabbit, we are filled with gratitude for the support and contributions of the members of the Alluxio Open Source Community. Thanks to your dedication and trust, the Alluxio Open Source project and community have continued to thrive and grow in ways we never … Continued

A Year with Alluxio Community 2021

January 20, 2022 By Bin Fan and Jasmine Wang

2021 marked accelerated growth for the Alluxio Open Source Project. We could not be more grateful for what the community has achieved together in the past year. This blog post gives a year-in-review summary of our community growth.

Building High-Performance Data Lake Using Apache Hudi and Alluxio at T3Go

November 20, 2020 By Trevor Zhang (T3Go), Vino Yang (T3Go), Jasmine Wang and Bin Fan

How T3Go’s high-performance data lake using Apache Hudi and Alluxio shortened the time for data ingestion into the lake by up to a factor of 2. Data analysts using Presto, Hudi, and Alluxio in conjunction to query data on the lake saw queries speed up by a factor of 10.

Data Consistency Model in Alluxio

October 30, 2020 By Baolong Mao, Jasmine Wang and Bin Fan

When applications are only reading and writing through Alluxio, the Alluxio file system provides strong consistency. However, when clients are writing data across both Alluxio and under storage, the consistency depends on the Alluxio write type and under storage type. This article discusses what to expect in each scenario.

Adopting Satellite Clusters with Alluxio at Vipshop to Improve Spark Jobs for Targeted Advertising by 30x

July 25, 2020 By Gang Deng (Vipshop) and Jasmine Wang

As the third-largest e-commerce site in China, Vipshop processes large amounts of data collected daily to generate targeted advertisements for its consumers. In this article, Gang Deng from Vipshop describes how to meet SLAs by improving struggling Spark jobs on HDFS by up to 30x, and optimize hot data access with Alluxio to create … Continued

Building a Cross-Region Hybrid Cloud Storage Gateway for Machine Learning & AI at WeRide

July 8, 2020 By Derek Tan (WeRide) and Jasmine Wang

In this blog, Derek Tan, Executive Director of Infra & Simulation at WeRide, describes how engineers leverage Alluxio as a hybrid cloud data gateway for applications on-premises to access public cloud storage like AWS S3.