ALLUXIO COMMUNITY NEWSLETTER

OCTOBER 2019

The Data Orchestration Summit is less than two weeks away. It’s an amazing opportunity to learn about the key challenges and solutions for building modern data and AI platforms. Hear how others are tackling the toughest data engineering problems, discover interesting use cases, and compare different approaches and technologies. Only a few seats remain for the Presto and Alluxio hands-on lab that we are running with the creators of Presto. See the schedule for more details, and save your seat now!

GREAT READS

Data Orchestration: What Is it, Why Is it Important?

Read this Q&A to learn how a data orchestration platform brings your data closer to compute across clusters, regions, clouds, and countries.

Effective Analytical Pipelines on AWS Using EMR, Alluxio, and S3

This article shares lessons learned from moving a data pipeline that originally ran on a Hadoop cluster to AWS using EMR and S3.

Introducing Wormhole: Dockerized Presto & Alluxio setups for blazing fast analytics

This blog post introduces Wormhole, an open source, Dockerized solution for deploying Presto and Alluxio clusters for blazing fast analytics on a file system.

TUTORIAL

Presto + Alluxio + Hive Metastore on your laptop in 10 min

This article describes how Baidu created a secure, modular, and extensible distributed file system service in project Pingo, based on Alluxio. You will learn how to use Alluxio to implement a unified distributed file system service, and how to add extensions on top of Alluxio, including customized authentication schemes and user-defined functions (UDFs) on Alluxio files.

Get Started with EMR Spark on Alluxio in 5 min [Chinese]

Learn how to run Spark jobs against on-premises storage or even a different cloud provider’s storage.

Getting Started with EMR Hive on Alluxio in 10 Minutes

How to set up an EMR cluster with Alluxio as a distributed caching layer for Hive, and run sample queries to access data in S3 through Alluxio.

Configuring Alluxio in the cloud with on-prem HDFS

How to configure Alluxio with a single master in a cluster and use HDFS as under storage.

ALLUXIO EVENTS

Data Orchestration Summit + Presto and Alluxio Training

November 7 – Computer History Museum

Tech Talk | How the Development Bank of Singapore solves on-prem compute capacity challenges with cloud bursting

Learn why DBS turned to Alluxio’s bursting approach to solve challenges with their data stack.

Online Meetup: Powering Data Science and AI with Apache Spark, Alluxio, and IBM

Hear why leading companies are moving toward a decoupled compute and storage architecture, along with the associated challenges and requirements.

ODSC West | Simplified data preparation for machine learning in hybrid and multi clouds

This session is designed for data scientists and data engineers who work with remote, and possibly multiple, data sources in hybrid or multi-cloud environments. Learn how to use Alluxio to greatly simplify data preparation in these environments.

KubeCon

Stop by booth #193 and win yourself a drone!

We hope to see you at the first Data Orchestration Summit, November 7th.

Join our Slack channel!
Get your questions answered by the experts in our community Slack channel.
