Author: Bin Feng at Alluxio

What’s new in Alluxio 2.4

October 21, 2020 By Lu Qiu and Bin Feng

Alluxio 2.4.0 focuses on features critical to large-scale production deployments in cloud and hybrid cloud environments. Enterprises run Alluxio at enormous scale along many dimensions, including number of files, total volume of data, requests per second, and number of concurrent clients.

Moving From Apache Thrift to gRPC: A Perspective From Alluxio

April 13, 2019 By Gokturk Gezer and Bin Feng

As part of the Alluxio 2.0 release, we have moved our RPC framework from Apache Thrift to gRPC. In this article, we will talk about the reasons behind this change as well as some lessons we learned along the way.
In Alluxio 1.x, RPC communication between clients and servers was built mostly on top of Apache Thrift. Thrift enabled us to define the Alluxio service interface in simple IDL files and implement client bindings using native Java interfaces generated by the Thrift compiler. However, we faced several challenges as we continued developing new features and improvements for Alluxio.

Top 10 Tips for Making the Spark + Alluxio Stack Blazing Fast

December 28, 2018 By Bin Fan, Bin Feng, Gene Pang, Madan Kumar and Reid Chan (Momo)

The Apache Spark + Alluxio stack is becoming quite popular, particularly for unifying data access across S3 and HDFS. In addition, compute and storage are increasingly being separated, which increases query latency. Alluxio is leveraged as compute-side virtual storage to improve performance. But to get the best performance, as with any technology stack, you need to follow best practices. This article provides the top 10 tips for performance tuning on real-world workloads when running Spark on Alluxio, with data locality giving the most bang for the buck.

Bin Feng

Software Engineer, Alluxio