compute storage separation Archives

High Performance Data Lake with Apache Hudi and Alluxio at T3Go

December 13, 2020

This talk introduces T3Go’s solution for building an enterprise-level data lake based on Apache Hudi and Alluxio, and shows how Alluxio accelerates reads and writes on the data lake when compute and storage are separated.

Tags: apache hudi, compute storage separation, data lake, data orchestration, data orchestration summit

Accelerating Data Computation on Ceph Objects using Alluxio

Alluxio Global Online Meetup, November 10, 2020

In this talk, we present how, with Alluxio, computation and storage ecosystems can interact better by benefiting from the “bringing the data close to the code” approach. Rather than fully disaggregating computation and storage, data locality can improve computation performance.

StorageQuery: federated querying on object stores, powered by Alluxio and Presto

August 25, 2020

Alluxio and Presto are a powerful combination for addressing the compute problem. They are part of the strategy Simbiose Ventures used to create StorageQuery, a platform for querying files in cloud storage with SQL.

Tags: cloud storage, compute storage separation, meetup, object stores, presto, shannondb, sql, storagequery, under filesystem

Running Presto with Alluxio on Amazon EMR

February 12, 2020

Many organizations use EMR to run big data analytics on the public cloud. However, reading from and writing to S3 directly can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud; in this use case it caches S3 data, providing high and predictable performance as well as reduced network traffic.

Tags: aws s3, big data, cloud, compute storage separation, emr, office hour, presto, storage

Workshop: Presto on Alluxio Hands-On Lab

November 12, 2019

Get started with Presto and Alluxio: gain hands-on experience launching an EC2 instance, exploring the Alluxio filesystem and cluster status, and running queries with Presto on Alluxio.

Tags: alluxio, compute storage separation, conference, data orchestration, data orchestration summit, presto

Online Meetup: Powering Data Science and AI with Apache Spark, Alluxio, and IBM

October 29, 2019

Learn why leading companies are moving toward a decoupled compute and storage architecture, and what challenges and requirements come with it. Hear how Spark and Alluxio together can address these challenges.

Tags: analytics, compute storage separation, hdfs, meetup, performance, spark, use case

Community Office Hour: Accelerating Hive with Alluxio on S3

October 3, 2019

Learn more about Bazaarvoice’s use case leveraging Apache Spark, Hive, and Alluxio on S3, along with how to set up Hive with Alluxio so that Hive jobs can seamlessly read from and write to S3.

Tags: alluxio engineering, aws s3, compute storage separation, hdfs, hive, office hour, spark
