Zookeeper vs Raft: Stateful Distributed Coordination with HA and Fault Tolerance

Big Data Bellevue & Cloudy With a Chance of Data Meetup

October 20, 2022

Distributed systems are made up of many components, such as authentication, a persistence layer, stateless services, load balancers, and stateful coordination services. These coordination services are central to the operation of the system: they maintain system configuration state, ensure service availability, handle name resolution, and store other system metadata. Given their central role, it is essential that these services remain available, fault tolerant, and consistent. Because it provides a highly available, file system-like abstraction as well as powerful recipes such as leader election, Apache Zookeeper is often used to implement them. This talk will walk through a generic example of a stateful coordination service moving from Zookeeper to Raft.

Meetup Groups

Big Data Bellevue: https://www.meetup.com/big-data-bellevue-bdb/

Cloudy With a Chance of Data: https://www.meetup.com/meetup-datascience/


Speakers:

David Zhu is a software engineering manager at Alluxio, where he mainly focuses on metadata syncing, the job service, and end-to-end performance benchmarking and optimization. Prior to that, David completed his Ph.D. at UC Berkeley’s AMPLab, with a focus on distributed data management systems and operating systems for the data center.

Jasmine Wang is the Community Manager and DevRel at Alluxio. A former national debate champion turned traveling yoga teacher, she has a strong passion for building teams and being the bridge at early-stage startups in Silicon Valley. She is currently building the Alluxio open source community and is responsible for community marketing, developer relations, developer experience, and cross-community collaborations.