Alluxio Community Newsletter - November 2023

HIGHLIGHTS

New Blog | Why Adding NAS/NFS on Object Storage May not Solve Your Data Access Problem of AI

Learn about the data access challenges in AI and why the commonly used NAS/NFS may not be a good option for you.

Read Now

Event Recap | AI Infra Day

AI Infra Day is now on-demand! Here is a recap blog to help you catch up:

Read Now

New Demo | Speed up Your Data Access by 8x with Alluxio vs S3FS

See how Alluxio FUSE can deliver significant performance improvements when accessing remote data in S3, compared to S3FS FUSE.

Watch Now

We release new videos every two weeks. Subscribe to our channel and stay tuned!

GOOD READS

Blog | Consistent Hashing in Alluxio DORA

Consistent hashing is a special technique that allows hash rings to be expanded or shrunk dynamically with minimal disruption. In this engineering blog, Jiaming introduces different consistent hash algorithms and compares them with experiment results.

Read Now
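To make the idea concrete, here is a minimal Python sketch of one common variant: a hash ring with virtual nodes. It is an illustration only, not the algorithm used in Alluxio DORA and not necessarily one of the algorithms compared in the blog. The class name `HashRing`, the `vnodes` parameter, the MD5-based hash, and the worker and file names are all assumptions made for this example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position with a stable hash (MD5, an assumption)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Place `vnodes` points for this node around the ring.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue  # skip the (very unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        # Remove only the points that this node actually owns.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.pop(bisect.bisect_left(self._points, point))

    def get_node(self, key):
        # A key belongs to the first point clockwise from its hash.
        if not self._points:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Usage: grow the ring by one worker and count how many keys change owner.
ring = HashRing(["worker-1", "worker-2", "worker-3"])
keys = [f"file-{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("worker-4")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a worker")

ring.remove_node("worker-4")
restored = {k: ring.get_node(k) for k in keys}
```

In this sketch, the only keys that change owner when a worker joins are the ones the new worker takes over, and removing that worker restores the original mapping. That is the "minimal disruption" property described above.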

Blog | GPUs Are Fast, I/O is Your Bottleneck

Are you aware of the I/O challenges of feeding the GPU beast? If data can’t make its way to the GPU fast enough to keep up with its computations, cycles are wasted. Here’s how you can deal with it.

Read Now

Dzone | A Deep Dive Into Different Types of Caching in Presto

Learn about the different types of caching in Presto and its multi-level caching architecture:

Read Now

Dzone | Speed Trino Queries With These Performance-Tuning Tips

Learn how to identify performance bottlenecks in Trino with these tuning tips:

Read Now

Datanami | Beyond the Hype: 10 Core Principles for AI Success

Alluxio’s SVP of Customer Success, Omid Razavi, shares 10 core principles for AI success and how organizations can implement them.

Read Now

UPCOMING EVENTS

PrestoCon | Dec 6, 3:50PM (PT) | Presto Optimization with Distributed Caching on Data Lake

Join us at the upcoming PrestoCon 2023, where Beinan Wang, Senior Staff Engineer at Alluxio, and Hope Wang, Developer Advocate at Alluxio, will share insights into the challenges of data locality and query latency in Presto-powered data lakes. Don’t miss this opportunity; secure your spot!

Register Now

PyData Global | Dec 7, 6:30PM (UTC) | Maximize GPU Utilization for Model Training

When training models on large datasets, one of the biggest challenges is low GPU utilization. Join this session, where Lu Qiu, Machine Learning Engineer at Alluxio, will discuss strategies for maximizing GPU utilization by using the open-source stack of PyTorch+Alluxio+S3.

Register Now

OSA Con | Dec 13, 12:30PM (PT) | Maximizing Query Speed and Minimizing Costs in Data Lakes with Open-Source Caching

As data lakes scale in complexity and size, companies face challenges with slow and inconsistent data access, rapidly growing storage costs, and high operation costs when migrating to the cloud. In this talk, Beinan Wang, Senior Staff Engineer at Alluxio, will discuss an open-source caching framework designed to improve performance by 1.5x and reduce storage costs by millions per year.

Register Now

PAST EVENTS ON-DEMAND

Lightning Talk | Using Decentralized File System to Optimize ML/AI Workloads on Kubernetes

Watch the latest lightning talk from CloudNativeCon 2023 by Shawn Sun, Software Engineer at Alluxio. The presentation discusses how a decentralized virtual file system can overcome the data management and security challenges that arise when scaling AI/ML training jobs on traditional centralized file systems.

Watch Now

Session On-demand | The Journey Along the Way to Data-locality on Cloud for AI/ML

At the recent KubeCon in Chicago, Alluxio’s Lu Qiu (Machine Learning Engineer) and Shawn Sun (Software Engineer) explored the crucial role of data locality in modern Kubernetes clusters and how it impacts the overall efficiency of large-scale analytics and AI applications.

Watch Now

Got a tech question for the Alluxio Community? Chat with us on Slack!

Be our stargazers on GitHub ⭐

If you like our product, please give it a star on GitHub, and share the goodness!

WHITEPAPERS

“Zero-Copy” Hybrid Bursting with no App Changes

Alluxio Architecture and Data Flow

Evaluating Apache Spark and Alluxio for Data Analytics: Benchmarking Recommendations and Results

Spark with Alluxio Overview – Pair Spark with Alluxio to Modernize Your Data Platform

Presto with Alluxio Overview – Architecture Evolution for Interactive Queries

Accelerating Machine Learning / Deep Learning in the Cloud: Architecture and Benchmark

HOT JOBS

We currently have 30+ opportunities across the globe! Learn more about our job openings on the Customer Success, Sales, Product, and Engineering teams. Are you awesome, or do you know someone we should meet? Check out the full list of opportunities and apply here.

Senior Account Support Engineer (San Mateo, California)

Senior Solutions Engineer (San Mateo, California)

Senior Account Executive (San Mateo, California)

Software Engineering Manager (San Mateo, California)