aws s3 Archives

Reducing Large S3 API Costs Using Alluxio

July 30, 2020 By Juraj Pohanka (datasapiens), Koen Michiels (datasapiens) and Sam Gilbert (datasapiens)

This article describes how engineers at datasapiens reduced S3 API costs by 200x by implementing Alluxio as a data orchestration layer between S3 and Presto.

Building a Cross-Region Hybrid Cloud Storage Gateway for Machine Learning & AI at WeRide

July 8, 2020 By Derek Tan (WeRide) and Jasmine Wang

In this blog, Derek Tan, Executive Director of Infra & Simulation at WeRide, describes how engineers leverage Alluxio as a hybrid cloud data gateway for applications on-premises to access public cloud storage like AWS S3.

How to Build a new Under Filesystem in Alluxio: Apache Ozone as an Example

July 7, 2020

In Alluxio, an Under File System is the plugin that connects Alluxio to a file system or object store, so users can mount different storage systems such as AWS S3 or HDFS into the Alluxio namespace. The Under File System framework is designed to be modular, so that users can easily extend it with their own Under File System implementation and connect to a new or customized storage system.

Tags: apache ozone, aws s3, hdfs, meetup, object stores, storage, under filesystem

How to Build a new Under Filesystem in Alluxio: Apache Ozone as an Example

Alluxio Global Online Meetup, June 30, 2020

Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel’s Analytics Zoo open source platform powered by oneAPI

April 27, 2020 By Bin Fan

This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. It covers the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results.

Running Presto with Alluxio on Amazon EMR

February 12, 2020

Many organizations are leveraging EMR to run big data analytics on the public cloud. However, reading and writing data to S3 directly can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud, and in this use case it caches S3 data, ensuring high and predictable performance as well as reduced network traffic.

Tags: aws s3, big data, cloud, compute storage separation, emr, office hour, presto, storage

Introducing Wormhole: Dockerized Presto & Alluxio setups for blazing fast analytics

October 29, 2019 By Ashwin Sinha

This is a guest blog by Ashwin Sinha with an original blog source. This blog introduces Wormhole, an open source Dockerized solution for deploying Presto & Alluxio clusters for blazing fast analytics on file systems (we use S3, GCS, OSS). When it comes to analytics, people are generally hands-on in writing SQL queries and love to analyse data that resides in a warehouse (e.g. a MySQL database). But as data grows, these …

How does the WANdisco Hybrid Data Lake Solution in AWS compare to zero-copy bursting to the cloud?

How do WANdisco and Alluxio hybrid solutions stack up? Learn more.

Online Meetup: AWS S3 + Alluxio + Presto = ❤️ The Ryte Use Case

October 10, 2019

This online meetup shows why and how we used Alluxio to solve some challenging technical issues, improve the speed, and reduce the costs of our AWS EMR Hadoop & Presto backend to an awesome level.

Tags: aws, aws s3, emr, hadoop, presto
