on-prem object storage Archives

Apache Spark Pipelines in the Cloud with Alluxio

Spark Summit Europe 2017 * October 25, 2017

In this talk, we discuss how Alluxio can be deployed and used with a Spark data processing pipeline in the cloud. We show how pipeline stages can share data through Alluxio memory for improved performance, and how Alluxio improves completion times and reduces performance variability for Spark pipelines in the cloud.

Accelerating Spark Workloads in a Mesos Environment with Alluxio

MesosCon Europe 2017 * October 27, 2017

Alluxio, a memory-speed virtual distributed storage system, can be deployed on Mesos to connect any compute framework, such as Apache Spark, to storage systems via a unified namespace. Alluxio enables applications to interact with any data at memory speed. It can eliminate the pains of ETL and data duplication, and enable new workloads across all data. Gene will discuss how Mesos, Spark, and Alluxio can be combined into an optimal architecture for enterprises.

Powering Robotics Clouds with Alluxio

Strata San Jose * March 7, 2018

The rise of robotics applications demands new cloud architectures that deliver high throughput and low latency. Bin Fan and Shaoshan Liu explain how PerceptIn designed and implemented a cloud architecture to support video streaming and online object recognition tasks and demonstrate how Alluxio supports these emerging cloud architectures.

Alluxio+Presto: An Architecture for Fast SQL in the Cloud

Bay Area Meetup * December 4, 2018

Cloud object storage systems provide different semantics and performance implications compared to HDFS. Applications like Presto cannot benefit from node-level locality or cross-job caching when reading from the cloud. Deploying Alluxio with Presto to access cloud storage addresses these problems because data is retrieved once and cached in Alluxio instead of being fetched repeatedly from the underlying cloud or object storage. Bin will present an architecture that combines Presto with Alluxio, along with use cases from major internet companies like JD.com and NetEase.com and the lessons they learned operating this architecture at scale.
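
The caching behavior described above can be illustrated with a minimal read-through cache sketch. This is not Alluxio's or Presto's API: the `ObjectStore` and `ReadThroughCache` classes and the object key are hypothetical stand-ins, used only to show why repeated reads of the same data stop reaching the remote store once a caching layer sits in between.

```python
class ObjectStore:
    """Hypothetical stand-in for a remote cloud object store; counts remote reads."""

    def __init__(self, objects):
        self._objects = dict(objects)
        self.remote_reads = 0

    def get(self, key):
        self.remote_reads += 1
        return self._objects[key]


class ReadThroughCache:
    """Hypothetical caching layer placed between a query engine and the store."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def read(self, key):
        # Go to the remote store only on a miss; later reads are served locally.
        if key not in self._cache:
            self._cache[key] = self._store.get(key)
        return self._cache[key]


store = ObjectStore({"warehouse/orders.parquet": b"order-bytes"})
cache = ReadThroughCache(store)

# Two separate reads of the same object, as two jobs might issue them.
first = cache.read("warehouse/orders.parquet")
second = cache.read("warehouse/orders.parquet")
```

In this sketch the second read never touches `ObjectStore`, which is the cross-job caching benefit the abstract refers to. A real deployment would also have to handle eviction, consistency with the underlying store, and data placement across nodes.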

Unified Big Data Analytics – Any stack, Any Cloud

Boston Meetup * January 22, 2019

This presentation focuses on how Alluxio helps the big data analytics stack become cloud-native. Increasingly popular cloud object storage systems provide more cost-effective and scalable storage, but they also have different semantics and performance implications compared to HDFS. Applications like Spark or Presto do not benefit from node-level locality or cross-job caching when retrieving data from cloud object storage. Deploying Alluxio to access cloud storage addresses these problems because data is retrieved once and cached in Alluxio instead of being fetched repeatedly from the underlying cloud or object storage.

Alluxio in MOMO, JD.com, TalkingData, and Vipshop [Chinese]

August 24, 2018

Learn more about how Alluxio is used at MOMO, JD.com, and TalkingData.

Tags: alluxio engineering, analytics, caching, cloud object storage, cloud storage, compute, compute storage separation, data, on-prem object storage, performance, storage

Enable Fast Big Data Analytics on Ceph with Alluxio

March 20, 2017 by Adit Madan

Ceph Days 2017 – Adit Madan presents on enabling fast big data analytics on Ceph with Alluxio.

Tags: alluxio engineering, analytics, big data, ceph, cloud object storage, cloud storage, conference, data, on-prem object storage, storage
