hadoop Archives

Online Meetup: AWS S3 + Alluxio + Presto = ❤️ The Ryte Use Case

October 10, 2019

This online meetup shows why and how Ryte uses Alluxio to solve challenging technical issues, improve speed, and reduce the costs of its AWS EMR Hadoop and Presto backend.

Tags: aws, aws s3, emr, hadoop, presto

From limited Hadoop compute capacity to increased data scientist efficiency

Alluxio Tech Talk * October 16, 2019

This tech talk will share approaches to burst data to the cloud, along with how Alluxio can enable “zero-copy” bursting of Spark workloads to cloud data services like EMR and Dataproc. Learn how DBS Bank uses Alluxio to address limited on-prem compute capacity.

AWS S3 + Alluxio + Presto = ❤️ The Ryte Use Case

Alluxio Open Source Online Meetup * October 9, 2019

In this presentation, Ryte’s chapter lead engineer, Danny Linden, shows why and how Ryte uses Alluxio to solve challenging technical issues, improve speed, and reduce the costs of its AWS EMR Hadoop and Presto backend.

Effective Analytical Pipelines on AWS Using EMR, Alluxio, and S3

September 27, 2019 By Yunling Cai

This article describes the lessons I learned from a previous project that moved a data pipeline, originally running on a Hadoop cluster managed by my team, to AWS using EMR and S3. The goal was to leverage the elasticity of EMR to offload operational work and to make S3 a data lake where different teams can easily share data across projects.

Building a Large-scale Interactive SQL Query Engine using Presto and Alluxio in JD.com

September 24, 2019 By Baolong Mao

This article describes how JD built an interactive OLAP platform by combining two open-source technologies: Presto and Alluxio.

Bay Area Meetup: Alluxio 2.0 Deep Dive and Near Real-time Analytics with Spark

July 23, 2019

This meetup presents an overview of the motivations and design decisions behind the major changes in the Alluxio 2.0 release, followed by a talk on real-time data processing for sales attribution analysis with Alluxio, Spark, and Hive at VIPShop.

Tags: alluxio engineering, apache hadoop, apache spark, compute, compute storage separation, data, data orchestration, hadoop, hdfs, meetup, scale, spark, storage

The Practice of Alluxio in Ctrip Real-Time Computing Platform

July 19, 2019 By Jianhua Guo

Today, real-time computation platforms are becoming increasingly important in many organizations. In this article, we describe how ctrip.com applies Alluxio to accelerate Spark SQL real-time jobs and maintain the jobs’ consistency during downtime of our internal data lake (HDFS). In addition, we leverage Alluxio as a caching layer to dramatically reduce the workload pressure on our HDFS NameNode.

Recap: Presto Summit SF 2019

July 1, 2019 By Amelia Wong

Alluxio is a proud sponsor and exhibitor at the Presto Summit in San Francisco.
What’s Presto Summit? It’s the leading Presto conference co-organized by our partner Starburst Data and the Presto Software Foundation.

Decoupling Compute and Storage for Data Workloads

May 31, 2019 By Carlos Queiroz, DBS

Carlos Queiroz of DBS presents how to decouple compute and storage for data workloads using Alluxio.

Tags: compute storage separation, hadoop, meetup
