Introduction To Alluxio (formerly Tachyon) and How It Brings Up To 300x Performance Improvement To Qunar’s Streaming Processing

STRATA DATA CONFERENCE LONDON 2017

Alluxio, the first memory-speed virtual distributed storage system in the world, unifies data from various under storage systems and presents a global namespace to various computation frameworks. Data access can be several orders of magnitude faster because of Alluxio’s memory-centric architecture. In addition, Alluxio’s tiered storage, unified namespace, flexible file API, web UI, and command-line tools improve usability across different application scenarios. The Alluxio open source project is one of the fastest-growing big data projects, with more than 600 contributors from more than 100 companies across the world.

Qunar is the number-one Chinese-language online travel information provider and search engine for web-based and mobile users. Currently, Qunar’s streaming platform processes around 6 billion system log entries (~4.5 TB) daily. Many jobs running on the platform are business critical and therefore impose strict requirements on both stability and latency. For example, real-time user recommendations are generated mainly from log analysis of a user’s click behavior and search patterns. The faster the analysis iterates, the more accurate the feedback that Qunar can deliver to its users. Low latency and high stability are therefore the top priorities of the system.

Xueyan Li and Yupeng Fu explore how Alluxio has improved the performance of stream processing workloads at Qunar, with gains averaging 300x at service peak time.
