Presto on Alluxio: How Netease Games leveraged Alluxio to boost ad hoc SQL on HDFS

Author: Shuang Li (Shuang is a big data engineer at Netease Games, developing and maintaining OLAP-related solutions in the data warehouse. He works closely on Apache Kylin and Presto as well as HBase. Shuang graduated from South China University of Technology.) Background As one of the world’s leading online game companies, Netease Games is … Continued

Deploying Big Data Workloads on Object Storage Without Performance Penalty

Introduction As the amount of data being collected and analyzed by enterprises continues to grow unabated, more attention is being placed on managing the cost of storing the data relative to performance. Hadoop provides a scalable and fast way of storing and analyzing data; however, the cost of storing data in Hadoop is typically higher … Continued

Using Alluxio as a Fault-tolerant Pluggable Optimization Component of’s Computation Frameworks is China’s largest online retailer and its biggest overall retailer, as well as the country’s biggest internet company by revenue. Currently,’s BDP platform runs more than 400,000 jobs (15+ PB) daily, on a system with more than 15,000 cluster nodes and a total capacity of 210 PB.

Alluxio has run in the company’s production environment on 100 nodes for six months. See how the company uses Alluxio to provide support for ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and Alluxio as a pluggable optimization component.

Tencent Case Study: Delivering Customized News to Over 100 Million Users per Month with Alluxio

This post is guest-authored by our friends at Tencent: Can He. Tencent is one of the largest technology companies in the world and a leader in multiple sectors such as social networking, gaming, e-commerce, mobile and web portal. Tencent News, one of Tencent’s many offerings, strives to create a … Continued

Using Alluxio to Improve the Performance and Consistency of HDFS Clusters

Introduction Alluxio is the world’s first memory-speed virtual distributed storage system that bridges applications and underlying storage systems, providing unified data access orders of magnitude faster than existing solutions. The Hadoop Distributed File System (HDFS) is a distributed file system for storing large volumes of data. HDFS popularized the paradigm of bringing computation to data … Continued