Alluxio in MOMO: Accelerating Ad Hoc Analysis
From our friends at MOMO
MOMO, a leading pan-entertainment social platform in China, has deployed Alluxio to accelerate ad hoc query analytics. While evaluating where Alluxio fit best in their infrastructure, they ran several performance tests to understand how ad hoc queries behaved in different scenarios. These tests give real-world insight into the performance benefits Alluxio provides. The MOMO findings include:
- With Alluxio, performance improved 3-5x over the existing setup
- Even when initially reading ‘cold’ data, Alluxio delivered superior performance in most cases
- Alluxio can effectively scale out to improve performance as requirements grow
View full whitepaper here.
JD Uses Alluxio for Ad Hoc and Real-Time Stream Computing
JD.com is China’s largest online retailer and its biggest overall retailer, as well as the country’s biggest internet company by revenue. Currently, JD.com’s BDP platform runs more than 400,000 jobs (15+ PB) daily, on a system with more than 15,000 nodes and a total capacity of 210 PB.
Alluxio has run in JD.com’s production environment on 100 nodes for six months. This presentation explains how JD.com uses Alluxio to support ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and Alluxio as a pluggable optimization component. For example, one framework, JDPresto, has seen a 10x performance improvement on average. This work has also extended Alluxio and enhanced the syncing between Alluxio and HDFS for consistency.
Alluxio in TalkingData
From our friends at TalkingData
TalkingData is China’s largest data broker, reaching more than 600 million smart devices on a monthly basis. TalkingData processes over 20 terabytes of data and more than one billion session requests every day. TalkingData products are powered by its massive proprietary data set and provide services to over 120,000 mobile applications and 100,000 application developers. TalkingData serves a wide range of clients in both Internet and traditional industries, including leading enterprises in the financial services, real estate, retail, travel, and government sectors.
We use Alluxio as a single platform to manage all the data across disparate data sources on premises and in the cloud. Alluxio removes the complexity of our environment by abstracting the different data sources and providing a unified interface. Applications simply interact with Alluxio, and Alluxio manages data access to the different storage systems on behalf of the applications.
See full blog here.
The Practice of Alluxio in the Near Real-Time Data Platform at Vipshop
Vipshop is a leading online retailer in China that processes and analyzes petabytes of data to answer complex questions such as how users are behaving, why a purchase was made, and which ads are most effective. With Alluxio, Vipshop can access, store, and manage data across disparate storage systems on premises and in the cloud.