distributed systems Archives

Zookeeper vs Raft: Stateful Distributed Coordination with HA and Fault Tolerance

October 21, 2022

Big Data Bellevue & Cloudy With a Chance of Data Meetup, October 20, 2022. Distributed systems are made up of many components, such as authentication, a persistence layer, stateless services, load balancers, and stateful coordination services. These coordination services are central to the operation of the system, performing tasks such as maintaining system configuration state, …

Tags: big data, distributed systems, fault tolerance, high availability, meetup, raft, zookeeper

Best Practice in Accelerating Data Applications with Spark+Alluxio

October 12, 2021

This talk shares the designs and use cases of integrated Alluxio and Spark solutions, as well as best practices and “what not to do” when designing and implementing Alluxio distributed systems.

Tags: alluxio day, big data, data orchestration, distributed systems, spark

Building a Cross-Region Hybrid Cloud Storage Gateway for Machine Learning & AI at WeRide

July 8, 2020 By Derek Tan (WeRide) and Jasmine Wang

In this blog post, Derek Tan, Executive Director of Infra & Simulation at WeRide, describes how engineers use Alluxio as a hybrid cloud data gateway that lets on-premises applications access public cloud storage such as AWS S3.

Scalable and Highly-available Distributed File System Metadata Service Using gRPC, RocksDB and RAFT

April 7, 2020

Alluxio (alluxio.io) is an open-source data orchestration system that provides a single namespace federating multiple external distributed storage systems. It is critical for Alluxio to be able to store and serve the metadata of all files and directories from all mounted external storage both at scale and at speed.
This talk shares our design, implementation, and optimization of the Alluxio metadata service (master node) to address the scalability challenges.

Tags: alluxio engineering, distributed systems, grpc, metadata service, office hour, raft, rocksdb

Scalable and Highly-available Distributed File System Metadata Service Using gRPC, RocksDB and RAFT

Community Online Office Hour * April 7, 2020

It is critical for Alluxio to be able to store and serve the metadata of all files and directories from all mounted external storage both at scale and at speed. This talk shares our design, implementation, and optimization of the Alluxio metadata service (master node) to address the scalability challenges.

Testing Distributed System at Scale for the Cost of a Large Pizza on AWS

February 25, 2020

Building distributed systems is no small feat. Software testing is just one of many critical practices that engineers who build these systems need to use to ensure the quality and usability of their software. For distributed systems, scaling out testing frameworks to match the highly distributed environments in which enterprises run our software is a complicated (and expensive) task!

Tags: aws, distributed systems, office hour, scale, testing

Testing Distributed System at Scale for the Cost of a Large Pizza on AWS

Community Online Office Hour * February 25, 2020

Implementing a Secure Plug-and-play Distributed File System Service Using Alluxio in Baidu

September 17, 2019 By Zhihong Zhang

In this article, you will learn how to use Alluxio to implement a unified distributed file system service, as well as how to add extensions on top of Alluxio, including customized authentication schemes and UDFs (user-defined functions) on Alluxio files.

Tag: distributed systems