Powered By Alluxio

Alibaba Cloud

Alibaba Cloud is the largest cloud computing company in China. It integrates Alluxio with its OSS (Open Storage Service) and uses Alluxio as a fast data-access layer on top of OSS.

Arimo

Arimo leverages Alluxio’s in-memory capability, improving time-to-results for deep learning models by up to 60%.

Baidu

Baidu uses Alluxio to run fast SQL queries over globally distributed databases. Its petabyte-scale data is distributed across multiple data centers; Alluxio accelerates remote data access and keeps frequently used “hot” data local to the compute nodes.

Barclays

Barclays describes how they iteratively process raw data directly from the central data warehouse into Spark and how Alluxio is their key enabling technology.

Bazaarvoice

Bazaarvoice stores massive amounts of data on AWS S3 and leverages Alluxio in production to speed up its big data analytics. In this architecture, Alluxio enables data locality and data caching and addresses the semantic differences of AWS S3 storage, achieving a 5-10x speedup for Hive queries running on S3.

China Unicom

China Unicom is the world’s fourth-largest mobile service provider by subscriber base. It uses Alluxio in production to accelerate HDFS access for its SparkSQL analytics and has seen a 6-10x performance improvement.

Comcast Business

Comcast brings Alluxio into its framework stack for operationalizing predictive ML models that improve customer experience, eliminating bottlenecks in the process from model inception to deployment and monitoring. Alluxio provides the universal data plane in the stack on top of various under-stores (e.g., S3, HDFS, RDBMS).

Cray

Cray has fused supercomputing with an open, standards-based framework to deliver an industry first: the Cray® Urika®-GX agile analytics platform. Alluxio provides a unified view of enterprise data, letting compute frameworks access stored data at memory speed, and co-locates compute and data while virtualizing across different storage systems.

Ctrip.com

Ctrip is a leading Chinese provider of travel services, including accommodation reservation and transportation ticketing. It uses Alluxio to boost the performance of Spark SQL workloads and alleviate pressure on the HDFS NameNode. In addition, Alluxio is deployed as the single entry point that unifies two HDFS clusters.

Didi

Didi Chuxing is a major Chinese ride-sharing, AI, and autonomous technology company. It leverages Alluxio for several purposes inside its data analytics platform: (1) accelerating data access from remote data centers, (2) integrating data from several different data sources across data centers, and (3) sharing data across jobs and compute frameworks.

eSentire

eSentire leads the industry in Managed Detection and Response services. It uses Alluxio together with Spark Streaming/SQL and Cassandra to create an analytics architecture with mission-critical response times to fight cybercrime.

Esri

Esri leverages Alluxio in its mapping and spatial analytics software to read and write geospatial data to a wide range of distributed data stores, such as Amazon S3, HDFS, or OpenStack Swift, including data stores that are not natively supported by the ArcGIS platform.

Guardant Health

Guardant Health is the world leader in comprehensive liquid biopsy. With Alluxio, Minio, and Spark, Guardant Health has built a performant, robust, and scalable system for large-scale data processing in a cloud-native manner.

Huatai
Huatai Securities

Huatai deploys Alluxio Enterprise as the storage layer that unifies data from disparate sources at memory speed, providing high performance and a predictable SLA for leveraging even petabytes of data.

Huawei

Huawei teamed up with Alluxio to release a big data storage acceleration solution that integrates Huawei’s FusionStorage with Alluxio’s memory-speed virtual distributed storage system. The solution aims to deliver unified data management, improved analysis efficiency, and faster application performance, and to popularize big data for processes including storage, analysis, and archiving.

IBM
IBM Research

IBM deploys Alluxio over Swift and SoftLayer to build a flexible and efficient big data analytics platform.

Intel

Intel uses Alluxio in several scenarios: to share data across different applications and computing frameworks, to reduce applications’ memory consumption and GC overhead, and to act as a local storage manager that caches remote data.

JD.com

JD.com is China’s largest online retailer. It uses Alluxio to support ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and Alluxio as a pluggable optimization component. One of its computing frameworks, JDPresto, has gained a 10x performance improvement on average by deploying Alluxio.

Kyligence

Kyligence is a big data intelligence company that offers solutions for big data analytics. Alluxio enables effective data management across different storage systems through its transparent naming and mounting API. With Alluxio, the Kyligence Analytics Platform strikes a good balance between performance, cost, and management effort in the cloud.

Lenovo

Lenovo is the number one manufacturer of personal computers and one of the largest smartphone vendors in the world. With Alluxio, it can now seamlessly and securely access data from data centers worldwide without labor-intensive and error-prone ETL, making that data available at in-memory speeds to analytics running in a single data center.

Lianjia

Lianjia is the leading online-to-offline real estate agency service in China. Lianjia built an OLAP platform using SparkSQL on top of Alluxio to accelerate ad hoc SQL queries on large amounts of data.

Lucidworks

Lucidworks leverages Alluxio in the cloud to accelerate remote Solr data access and cloud recovery.

Microsoft

Microsoft AI leverages Alluxio to seamlessly bridge compute-intensive workloads, including TensorFlow jobs, with Azure Blob storage.

Momo

MOMO is a leading mobile pan-entertainment social platform in China. It leverages Alluxio with Spark SQL to speed up ad hoc analysis.

Myntra

Myntra is a leading Indian fashion e-commerce marketplace company. With Alluxio in its data processing pipeline, Myntra improved customer experience (CX) by getting actionable business intelligence from its data faster. The Myntra team has also contributed to Alluxio open source by documenting how to use Alluxio with Azure Blob storage for other users.

NetEase
NetEase Games

NetEase uses Alluxio to improve the performance of interactive queries on Presto. Alluxio is deployed together with the Presto workers and accelerates data access from remote HDFS clusters.

Nvidia

Nvidia leverages Alluxio as part of its GPU-accelerated data analytics framework to manage different storage systems and provide quick and easy access to information within various data lakes.

Oracle

Oracle’s Big Data File System (BDFS) is based on Alluxio and is designed to accelerate data access for data pipelines, with features that significantly improve the runtime performance of Spark applications. BDFS accelerates data access to and from Oracle Cloud Infrastructure Object Storage Classic by providing an active caching layer.

PerceptIn

PerceptIn designed and implemented a cloud architecture using Alluxio that manages enormous amounts of incoming data in different storage systems with high throughput and low latency.

Qiniu

Qiniu Cloud Atlab has built AVA, a training platform for deep learning that uses Alluxio to effectively integrate GPU computing resources and storage resources from KODO (the object storage offered by Qiniu Cloud). Alluxio sped up training tasks that read large numbers of sample files, such as videos and images, by 50%.

Qunar.com

Qunar leverages Alluxio in production to boost the performance of real-time data analytics, resulting in a 15x speedup on average. In addition, it uses Alluxio’s unified namespace to enable different applications and frameworks to easily interact with data from different storage systems.

Samsung

Samsung uses Alluxio with the different storage media available in its systems, including NVMe SSDs, while delivering performance in line with the speed of the underlying storage media.

Samsung SDS

Samsung SDS built its big data analysis platform “Brightics” to leverage Alluxio for managing data in the Hadoop ecosystem for user analysis and visualization tools.

Sogou

Sogou is one of the largest search engines in China. Alluxio is deployed in its production big data platform, which has more than 1000 nodes, to help improve the reliability of the Spark shuffle service and improve Hive performance.

Suning.com

Suning is one of the largest non-government retailers in China. It uses Alluxio to unify storage systems and manage multiple HDFS clusters.

TalkingData

TalkingData is China’s largest data broker, covering more than 600M smart devices on a monthly basis. It leverages Alluxio as a single platform to manage all data across disparate data sources on-prem and in the cloud, removing complexity by masking the different data sources and providing a unified interface.

Tencent

Tencent is a leader in social networking, gaming, e-commerce, mobile, and web portals. Tencent News leverages Alluxio with Apache Spark to create a scalable, robust, and performant architecture that provides the best experience to its more than 100 million monthly active users.

Two Sigma

Two Sigma is the fifth-largest hedge fund in the world. It uses Alluxio to accelerate data access from the remote HDFS cluster for Spark nodes provisioned in the cloud, making model training for algorithmic trading more than 10x faster.

Vipshop

Vipshop is a leading online retailer in China that processes and analyzes petabytes of data to answer complex questions like how users are behaving, why a purchase was made, and what ads are most effective. With Alluxio, Vipshop can access, store, and manage data across disparate storage systems on-prem and in the cloud.

Wells Fargo

Wells Fargo uses Alluxio to accelerate Spark workloads in its data preparation and exploration pipeline, saving dozens of minutes of load time per iteration. With Alluxio, data is loaded once and can be served from memory on subsequent accesses, saving hours in workload processing time.