ALLUXIO COMMUNITY NEWSLETTER
Join us for a day of all things data engineering at the first Data Orchestration Summit! Join the open source community and hear tech talks by industry experts from Apple, Walmart, Netflix, AWS, Rakuten, Tencent, Google, Baidu, Alibaba, and more. You can also meet other data engineers over lunch and happy hour, and go home with learnings, swag, and a chance to win the latest iPad Pro! Get your early bird tickets now!
Moving From Apache Thrift to gRPC: A Perspective From Alluxio
As part of the Alluxio 2.0 release, we have moved our RPC framework from Apache Thrift to gRPC. In this article, we talk about the reasons behind this change, some lessons we learned along the way, and performance tuning tips.
Designing for the Cloud: What Today’s Data Engineer Should Be Considering When Building Their Stack
Compute, containers, storage, data movement, performance, network: skills are increasingly needed across the broader stack. This whitepaper offers design principles and high-priority elements of the stack that a data engineer should think about.
Implementing a Secure Plug-and-play Distributed File System Service Using Alluxio in Baidu
This article describes how Baidu creates a secure, modular, and extensible distributed file system service in project Pingo based on Alluxio. You will learn how to incorporate Alluxio to implement a unified distributed file system service, as well as how to add extensions on top of Alluxio, including customized authentication schemes and UDFs (user-defined functions) on Alluxio files.
Accelerate Spark and Hive Jobs on AWS S3 by 10x with Alluxio Tiered Storage
In this article, Thai Bui from Bazaarvoice describes how Bazaarvoice leverages Alluxio to build a tiered storage architecture with AWS S3 to maximize performance and minimize operating costs when running big data analytics on AWS EC2.
Building a Large-scale Interactive SQL Query Engine using Presto and Alluxio in JD.com
The largest online and offline retailer in China, JD.com runs a data platform with more than 40,000 nodes that executes more than 1 million jobs per day and manages over 650PB of data. Read more on how JD built this interactive OLAP platform with the open source technologies Presto and Alluxio, achieving a 10X performance improvement in ad-hoc query latency.
Get Started with Alluxio
September Online Office Hour: Accelerating Hive with Alluxio on S3
Join us on October 1st to learn about Bazaarvoice’s use case and how to set up Hive with Alluxio to seamlessly read/write to S3. Sign up here.
AWS S3 + Alluxio + Presto = ❤️, The Ryte Use Case
Ryte-Platform is built with a scalable architecture to support a heavy load and make it possible for customers to drill down from a high-level overview into the last byte of their websites. Hear why and how Ryte solves some challenging technical issues, improves speed, and reduces the costs of AWS EMR Hadoop and Presto here.
CNCF Webinar Series – Feeding the Kubernetes Beast: Bringing Locality Back to Data Workloads
Learn about a new approach to bringing data locality to data-intensive compute workloads in Kubernetes environments. See a demo of how to set up and run Apache Spark with Alluxio in Kubernetes. Register now!
Check out our partner event Data Council New York 2019
Don’t miss our partner event on November 12-13! Join 400+ data engineers, data scientists, data analysts, and CTOs to listen to 50+ speakers from companies like Google, Facebook, Netflix, Twitter, Datadog, Stitch Fix, and many more. Register using code Alluxio150 for a $150 discount on regular tickets.
[On-Demand] August Online Office Hour: Building a Cloud Native Stack with EMR Spark, Alluxio, and S3
Missed last month’s office hour? Watch the on-demand video here.
Using or trying Alluxio? Share your thoughts and we’ll send you some swag!