Real-time data analytics is becoming increasingly important for Internet companies like Qunar, the leading travel search engine in China. Alluxio, a memory speed virtual distributed storage system, plays an important role in the big data ecosystem, and brings great performance improvements to applications. In this article, we present how Alluxio is used as the storage management layer in Qunar’s stream processing platform, and how we use Alluxio to improve performance. At Qunar, we have been running Alluxio in production for over 9 months, and have observed 15x performance improvement on average, and 300x improvement at peak service times. At Qunar, the streaming platform processes around 6 billion system log entries (4.5 TB) daily. Many jobs running on the platform are business critical, and therefore impose strict requirements on both stability and low latency. For example, our real-time user recommendations are primarily generated based on the log analysis of user’s click and search behavior. Faster analysis delivers more accurate feedback to the users. Therefore low latency and high stability are the top priorities of our system.
At Qunar, we have been running Alluxio in production for over 9 months, resulting in 15x speedup on average, and 300x speedup at peak service times. In addition, Alluxio’s unified namespace enables different applications and frameworks to easily interact with our data from different storage systems.