Unified Big Data Analytics – Any stack, Any Cloud

Boston Meetup *

This presentation focuses on how Alluxio helps the big data analytics stack to be cloud-native. The trending Cloud object storage systems provide more cost-effective and scalable storage solutions but also different semantics and performance implications compared to HDFS. Applications like Spark or Presto will not benefit from the node-level locality or cross-job caching when retrieving data from the cloud object storage. Deploying Alluxio to access cloud solves these problems because data will be retrieved and cached in Alluxio instead of the underlying cloud or object storage repeatedly.

Interactive Big Data Analytics with the Presto + Alluxio stack for the Cloud

Alluxio Tech Talk *

In this tech talk, we will introduce the Starburst Presto, Alluxio, and Cloud object store stack for building a highly-concurrent and low-latency analytics platform. This stack provides a strong solution to run fast SQL across multiple storage systems including HDFS, S3 and others in public cloud, hybrid cloud and multi cloud environments.

Efficient & Secure Big Data Analytics: Perspectives from Uber, Alibaba, & Alluxio

Seattle Meetup *

Over the past two decades, the Big Data stack has reshaped and evolved quickly with numerous innovations driven by the rise of many different open source projects and communities. In this meetup, speakers from Uber, Alibaba, and Alluxio will share best practices for addressing the challenges and opportunities in the developing data architectures using new and emerging open source building blocks. Topics include data format (ORC) optimization, storage security (HDFS), data format (Parquet) layers, and unified data access (Alluxio) layers.

Two Sigma Open Source Meetup

New York Meetup *

TSOS meetups focus on the open source projects that Two Sigma cares most about, from projects we generated in-house then open sourced to large external open source projects that we depend on to do our work. This time, Wenbo Zhao (Two Sigma) and Bin Fan (Alluxio) will be presenting on how Two Sigma uses Alluxio to make data-intensive compute independent of the storage beneath.

Presto on Alluxio: How Netease Games leveraged Alluxio to boost ad hoc SQL on HDFS

Netease Games is the operator for many popular online games in China like “World of Warcraft” and “Hearthstone”. Netease Games also has developed quite a few popular games on its own such as “Fantasy Westward Journey 2”, “Westward Journey 2”, “World 3”, “League of Immortals”. The strong growth of the business drives the demand to build and maintain a data platform handling a massive amount of data and delivering insights promptly from the data. Given our data scale, it is very challenging to support high-performance ad-hoc queries to the data with results generated in a timely manner.

Deploying Big Data Workloads on Object Storage Without Performance Penalty

As the amount of data being collected and analyzed by Enterprises continues to grow unabated, more attention is being placed on managing the cost of storing the data relative to performance. Hadoop provides a scalable and fast way of storing and analyzing data, however, the cost of storing data in Hadoop is typically higher compared to alternative technologies like Object Stores.