ALLUXIO COMMUNITY NEWSLETTER

JULY 2020


Latest Hybrid Cloud Tutorials – Burst to Public Cloud

Upcoming Events


Reducing Large S3 API Costs using Alluxio at Datasapiens
Aug 4 | Alluxio Global Online Meetup

Datasapiens focuses on data analytics and provides an end-to-end service that manages the data pipeline and automates the process of generating data insights. Datasapiens CEO and co-founder Koen Michiels and CTO Juraj Pohanka join Bin from Alluxio to share how engineers at Datasapiens brought down S3 API costs by 200x by implementing Alluxio as a data orchestration layer between S3 and Presto.

Enabling Hybrid Cloud Analytics and AI with Data Orchestration
Aug 5 | IoT Webinar

In this talk, Alluxio’s Technical Product Manager Adit Madan and Intel’s Global CTO Parviz Peiravi offer an overview of the Alluxio data orchestration layer, which provides a unified data access layer for hybrid and multi-cloud deployments and leverages Intel® Optane™ Persistent Memory for higher-performance caching at reduced cost. This layer enables distributed compute engines like Presto, TensorFlow, and PyTorch to transparently access data from various storage systems (including S3, HDFS, and Azure) while actively leveraging a multi-tier cache to accelerate data access.

Apache Hudi PMC, AWS, Facebook, Alluxio
Aug 7 | Presto Virtual TechTalks

This is a two-part talk. In the first part, Apache Hudi PMC Bhavani Sudha Saktheeswaran will introduce Hudi and discuss its different table and query types and how Hudi integrates with Presto to support these queries. In the second part, Rohit Jain from Facebook and Bin Fan from Alluxio will introduce their teams’ collaboration on adding a local on-SSD Alluxio cache inside Presto workers to address unsatisfactory Presto latency.

Accelerating Queries on Cloud Data Lakes
Aug 20 | ITPro Webinar

Alex Ma from Alluxio and Brien Porter from Intel will discuss how a data orchestration approach offers a solution for connecting traditional on-prem data centers with the cloud, data centers with other data centers, and clouds with other clouds. With Alluxio’s “zero-copy” burst solution, and Intel® Optane™ Persistent Memory for higher performance at reduced cost, companies can bridge remote data centers with computing frameworks in other locations, enabling them to offload compute and leverage the flexibility, scalability, and power of the cloud for their remote data.

StorageQuery: Federated Querying on Object Stores, Powered by Alluxio + Presto
Aug 25 | Alluxio Global Online Meetup

Organizations have worked towards the separation of storage and compute for a number of benefits in the areas of cost, data duplication, and data latency. The cloud resolves most of these issues, but this comes at the expense of needing a way to query data on remote storage. Our guests Caio Pavanelli and Abner Ferreira from Simbiose Ventures share their experience in customizing PrestoSQL and Alluxio to build StorageQuery’s platform.

RECAP OF JULY


On Demand | Building Under File System in Alluxio with Tencent
Baolong Mao from Tencent shares his experience in developing the Apache Ozone Under File System (UFS), showing how to create a new Alluxio Under File System in a few steps with minimal lines of code. The UFS can connect to any file system or object store, so users can mount different storage systems like AWS S3 or HDFS into the Alluxio namespace.

On Demand | What’s New in Alluxio 2.3
Alluxio 2.3 was released at the end of June 2020. Alluxio core maintainers Calvin and Bin go over the new features and integrations available and share learnings from the community.

GOOD READS


Blog | Running Presto in a Hybrid Cloud Architecture 

Blog | Building a Cross-Region Hybrid Cloud Storage Gateway for Machine Learning and AI at WeRide

Blog | Adopting Satellite Clusters to Improve Spark Jobs for Targeted Advertising by 30x

Blog | Efficient Model Training in the Cloud with Kubernetes, TensorFlow, and Alluxio



Join our Slack channel! 

Get your questions answered by the experts in our Slack community channel