Powered By Alluxio
Alibaba Cloud is the largest cloud computing company in China. It integrates Alluxio with its OSS (Object Storage Service) and leverages Alluxio as a fast data-access layer on top of OSS.
Arimo leverages Alluxio’s in-memory capability, improving time-to-results for deep learning models by up to 60%.
Baidu uses Alluxio to run fast SQL queries over globally distributed databases. With petabyte-scale data distributed across multiple data centers, Alluxio accelerates remote data access and keeps frequently used “hot” data local to the compute nodes.
Baidu’s Pingo Data Platform uses Alluxio to implement the access control and management layer for file system access. With Alluxio, Pingo users can mount external data sources such as HDFS, S3, Baidu Object Store, or even a local file system into the platform.
Barclays describes how they iteratively process raw data directly from the central data warehouse into Spark and how Alluxio is their key enabling technology.
Bazaarvoice stores massive amounts of data on AWS S3 and leverages Alluxio in production to speed up its big data analytics. In this architecture, Alluxio enables data locality and data caching and addresses the semantic differences of AWS S3 storage, achieving a 5-10x speedup for Hive queries running on S3.
China Unicom is the world’s fourth-largest mobile service provider by subscriber base. It uses Alluxio in production to accelerate HDFS access for its SparkSQL analytics and has seen a 6-10x performance improvement.
Comcast brings Alluxio into its framework stack for operationalizing predictive ML models to improve customer experience and to eliminate bottlenecks in the process from model inception to deployment and monitoring. Alluxio provides the universal data plane in the stack on top of various under-stores (e.g., S3, HDFS, RDBMS).
Cray has fused supercomputing with an open, standards-based framework to deliver an industry first: the Cray® Urika®-GX agile analytics platform. Alluxio provides a unified view of enterprise data, allowing compute frameworks to access stored data at memory speed, and co-locates compute and data while virtualizing across different storage systems.