ALLUXIO COMMUNITY NEWSLETTER
OCTOBER 2019
The Data Orchestration Summit is less than two weeks away. It’s an amazing opportunity to learn about the key challenges and solutions for building modern data and AI platforms. Hear how others are tackling the toughest data engineering problems, discover interesting use cases, and compare different approaches and technologies. Only a few seats remain for the Presto and Alluxio hands-on lab that we are running with the creators of Presto. See the schedule for more details, and save your seat now!
GREAT READS
Data Orchestration: What Is it, Why Is it Important?
Read this Q&A to learn more about a data orchestration platform that brings your data closer to compute across clusters, regions, clouds, and countries.
Effective Analytical Pipelines on AWS Using EMR, Alluxio, and S3
This article shares lessons learned from moving a data pipeline that originally ran on a Hadoop cluster to AWS using EMR and S3.
Introducing Wormhole: Dockerized Presto & Alluxio setups for blazing fast analytics
This blog post introduces Wormhole, an open source Dockerized solution for deploying Presto & Alluxio clusters for blazing fast analytics on a file system.
TUTORIAL
Presto + Alluxio + Hive Metastore on your laptop in 10 min
This article describes how Baidu built a secure, modular, and extensible distributed file system service on Alluxio in its Pingo project. You will learn how to use Alluxio to implement a unified distributed file system service and how to add extensions on top of Alluxio, including customized authentication schemes and UDFs (user-defined functions) on Alluxio files.
Get Started with EMR Spark on Alluxio in 5 min [Chinese]
Learn how to run Spark jobs against on-premises storage or even a different cloud provider’s storage.
Getting Started with EMR Hive on Alluxio in 10 Minutes
How to set up an EMR cluster with Alluxio as a distributed caching layer for Hive and run sample queries that access data in S3 through Alluxio.
Configuring Alluxio in the cloud with on-prem HDFS
How to configure Alluxio with a single master in a cluster and use HDFS as under storage.
ALLUXIO EVENTS
Data Orchestration Summit + Presto and Alluxio Training
November 7 – Computer History Museum
Learn why DBS turned to Alluxio’s bursting approach to solve challenges with their data stack.
Online Meetup: Powering Data Science and AI with Apache Spark, Alluxio, and IBM
Hear why leading companies are moving toward a decoupled compute and storage architecture, along with the associated challenges and requirements.
ODSC West | Simplified data preparation for machine learning in hybrid and multi clouds
This session is designed for data scientists and data engineers who work with remote, and possibly multiple, data sources in hybrid or multi-cloud environments. Learn how to use Alluxio to simplify data preparation in these environments.
Stop by booth #193 and win yourself a drone!
We hope to see you at the first Data Orchestration Summit, November 7th.
Join our Slack channel!
Get your questions answered by the experts in our Slack community channel.