Alluxio Blog

Recap: AWS Summit New York

July 22, 2019 By Amelia Wong

Alluxio is a proud sponsor and exhibitor at the AWS Summit in New York. If you weren’t able to attend, here are the highlights.

The Practice of Alluxio in Ctrip Real-Time Computing Platform

July 19, 2019 By Jianhua Guo

Today, real-time computation platforms are becoming increasingly important in many organizations. In this article, we describe how ctrip.com applies Alluxio to accelerate Spark SQL real-time jobs and keep those jobs consistent during downtime of our internal data lake (HDFS). In addition, we leverage Alluxio as a caching layer to dramatically reduce the workload pressure on our HDFS NameNode.

Getting Started with the Alluxio-Presto Sandbox

July 11, 2019 By Zac Blanco

The Alluxio-Presto sandbox is a Docker application featuring installations of MySQL, Hadoop, Hive, Presto, and Alluxio. The sandbox lets you easily dive into an interactive environment where you can explore Alluxio, run queries with Presto, and see the performance benefits of using Alluxio in a big data software stack.

Orchestrating Data for the Cloud World with Alluxio 2.0

July 11, 2019 By Haoyuan Li

Today, I’m thrilled to announce the GA of Alluxio 2.0.0, Alluxio’s biggest release to date (see our Release Notes & Release Blog) with over 900 commits.

Turn cloud storage or HDFS into your local file system for faster AI model training with TensorFlow

July 3, 2019 By Lu Qiu and Bin Fan

This article presents a different approach to making distributed file systems like HDFS or cloud storage systems look like a local file system to data processing frameworks: the Alluxio POSIX API. To explain the approach better, we use the TensorFlow + Alluxio + AWS S3 stack as an example.

Recap: Presto Summit SF 2019

July 1, 2019 By Amelia Wong

Alluxio is a proud sponsor and exhibitor at the Presto Summit in San Francisco.
What’s Presto Summit? It’s the leading Presto conference co-organized by our partner Starburst Data and the Presto Software Foundation.

Hybrid Environments for Data Analytics is a Possibility

June 21, 2019 By Madan Kumar and Adit Madan

As the data ecosystem becomes massively complex and increasingly disaggregated, data analysts and end users have trouble adapting to and working with hybrid environments. The proliferation of compute applications along with storage mediums leads to a hybrid model that we are just not accustomed to.
With this disaggregated system, data engineers now come across a multitude of problems that they must overcome in order to get meaningful insights.