TSOS meetups focus on the open source projects that Two Sigma cares most about, from projects we generated in-house then open sourced to large external open source projects that we depend on to do our work. This time, Wenbo Zhao (Two Sigma) and Bin Fan (Alluxio) will be presenting on how Two Sigma uses Alluxio to make data-intensive compute independent of the storage beneath.
This webinar reviews: The observation and analysis of trends of separation of Storage and Compute in Big Data ecosystem; Why and how to build a new data access layer between compute and storage in this data stack; Alluxio open source: history, overview, design, and architecture; Production Use case with Spark, Presto, Tensorflow and etc; A demo of running Presto on Alluxio on S3
Over the past two decades, the Big Data stack has reshaped and evolved quickly with numerous innovations driven by the rise of many different open source projects and communities. In this meetup, speakers from Uber, Alibaba, and Alluxio will share best practices for addressing the challenges and opportunities in the developing data architectures using new and emerging open source building blocks. Topics include data format (ORC) optimization, storage security (HDFS), data format (Parquet) layers, and unified data access (Alluxio) layers.