big data Archives

Running Presto with Alluxio on Amazon EMR

February 12, 2020

Many organizations use EMR to run big data analytics in the public cloud. However, reading and writing data directly to S3 can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud; in this use case it caches S3 data, delivering high and predictable performance while reducing network traffic.

Tags: aws s3, big data, cloud, compute storage separation, emr, office hour, presto, storage

The Practice of Presto & Alluxio in E-Commerce Big Data Platform

November 15, 2019

JD.com is China’s largest online retailer. It uses Alluxio to support ad hoc and real-time stream computing, accessing data through Alluxio-compatible HDFS URLs and treating Alluxio as a pluggable optimization component.

Tags: alluxio, big data, performance, presto, use case

Building data lineage; Running Spark with Alluxio; Data Mesh

Big Data Application Meetup, November 21, 2019

Running Spark with Alluxio is a popular stack, particularly in hybrid environments. In this session, Dipti will briefly introduce Alluxio, share the top 10 performance-tuning tips for real-world workloads, and demo Alluxio with Spark.

Enabling big data & AI workloads on the object store at DBS

October 14, 2019

Vitaliy and Dipti dive into how DBS Bank built a modern big data analytics stack, leveraging an object store as persistent storage even for data-intensive workloads, and how it uses Alluxio to orchestrate data locality and data access for Spark workloads.

Tags: aws, big data, conference, hybrid cloud bursting, object stores, unified namespace

Alluxio – Data Orchestration for Analytics and AI in the Cloud

October 9, 2019

In this talk, we present trends and challenges in the data ecosystem in the cloud era; data engineering in the cloud with data orchestration; and use cases for running tech stacks (Presto or TensorFlow) with Alluxio on S3.

Tags: aws s3, big data, cloud, data orchestration, hdfs, meetup, presto, spark, storage, tensorflow

Tag: big data

Tachyon: A Reliable Memory Centric Storage For Big Data Analytics