Alluxio - Blog

Welcome to Alluxio.io

Notice anything new about our website? That’s right: we are excited to launch our new website, Alluxio.io! As we continue to focus on our open source community, one important item on our minds was rebuilding our website to provide a better user experience. To that end, you’ll see lots of changes in the Alluxio web experience.

Recap Spark+AI Summit 2019

Alluxio is a proud sponsor and exhibitor of Spark+AI Summit in San Francisco. What’s Spark+AI Summit? It’s the world’s largest conference focused on Apache Spark, Alluxio’s older cousin: an open source project from the same lab (UC Berkeley’s AMPLab, now RISELab).

Two Ways to Keep Files in Sync Between Alluxio and HDFS

Alluxio provides a distributed data access layer that lets applications like Spark or Presto access different underlying file systems (UFS) through a single API in a unified file system namespace. If users interact with files in the UFS only through Alluxio, Alluxio knows about every change the client makes to the UFS and keeps the Alluxio namespace in sync with the UFS namespace.

Moving From Apache Thrift to gRPC: A Perspective From Alluxio

As part of the Alluxio 2.0 release, we have moved our RPC framework from Apache Thrift to gRPC. In this article, we will talk about the reasons behind this change as well as some lessons we learned along the way. In Alluxio 1.x, the RPC communication between clients and servers is built mostly on top of Apache Thrift. Thrift enabled us to define the Alluxio service interfaces in simple IDL files and implement client bindings using native Java interfaces generated by the Thrift compiler. However, we faced several challenges as we continued developing new features and improvements for Alluxio.

Two Sigma Meetup Recap: Achieving Compute and Storage Independence for Data-driven Workloads

China Unicom Uses Alluxio and Spark to Build New Computing Platform to Serve Mobile Users

China Unicom is one of the five largest telecom operators in the world. China Unicom’s booming business in 4G and 5G networks has to serve an exploding base of hundreds of millions of smartphone users. This unprecedented growth brought enormous challenges and new requirements to the data processing infrastructure. The previous generation of its data processing system was based on IBM midrange computers, Oracle databases, and EMC storage devices. This architecture could not scale to process the amount of data generated by the rapidly expanding number of mobile users. Even after Hadoop and the Greenplum database were deployed, it was still difficult to cover critical business scenarios with their varying, massive data processing requirements.

Store 1 Billion Files in Alluxio 2.0

Unified Data Access In Virtual Reality

In a recent blog post, we discussed the ideation, design, and new features in the Alluxio 2.0 preview. Today we are thrilled to announce another revolutionary project that the Alluxio engineering team has been hard at work on for the past year: the Alluxio Virtual Reality (VR) client.

Founder Blog: Alluxio Chapter 2.0

Getting Started with Spark Caching using Alluxio in 5 Minutes

Apache Spark has brought significant innovation to Big Data computing, but its results are even more extraordinary when paired with Alluxio. Alluxio provides Spark with a reliable data sharing layer, enabling Spark to excel at performing application logic while Alluxio handles storage. Bazaarvoice uses the combination of Spark and Alluxio to provide a real-time big data platform that can not only handle the intake of 1.5 billion page views during peak events like Black Friday, but also provide real-time analytics against that data. At this scale, the gain in speed is an enabler for new workloads. We’ve established a clean and simple way to integrate Alluxio and Spark.

Enabling Data Location Awareness for Optimized Performance and Lower Cost With Alluxio Tiered Locality

Caching frequently used data in memory is not a new computing technique. However, it is a concept that Alluxio has taken to the next level with the ability to aggregate data from multiple storage systems in a unified pool of memory.

Announcing Alluxio 2.0: Preview enabling hyper-scale data workloads in the cloud

We are thrilled to announce the availability of the Alluxio 2.0 Preview Release, the largest open source release, with the most new features and improvements, since the creation of the project. It is now available for download. While Alluxio already enabled data locality and data accessibility for many big data workloads in the cloud, there was still innovation needed in key areas.

