This is a recap of the joint Two Sigma and Alluxio meetup hosted in New York. Two Sigma is a leading hedge fund that uses cutting-edge technology to train its models on petabytes of data held in on-premise storage. Special thanks to Two Sigma for hosting. Here are the slides from the presentation.
In this meetup, Bin Fan from Alluxio and Wenbo Zhao from Two Sigma co-presented a reference stack, running Alluxio as the data access layer for Apache Spark, that separates compute from storage so the two can scale independently for big data and machine learning workloads. Two Sigma’s use case shows the benefit of this stack well: machine learning computation bursts to the public cloud while still reading data stored on-premise efficiently. Their data scientists want to use the public cloud as a scalable, elastic compute resource to speed up the end-to-end model training process. By using Alluxio as the data access layer co-located with compute in the cloud, their researchers achieved 10x faster end-to-end processing, which lets them perform more iterations on their models.
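To make the idea of a data access layer co-located with compute more concrete, here is a minimal Python sketch of a read-through cache sitting in front of a slow remote store. It is a toy model only: `RemoteStore`, `ReadThroughCache`, and `run_iterations` are hypothetical names invented for illustration and are not Alluxio's or Spark's API, and the sketch says nothing about how the stack presented at the meetup is implemented. It only shows why repeated training iterations over the same data benefit when the first read is kept close to the compute.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteStore:
    """Hypothetical stand-in for on-premise storage reached over a slow link."""
    blocks: dict
    reads: int = 0  # counts how many reads actually reach the remote store

    def read(self, key):
        self.reads += 1
        return self.blocks[key]


@dataclass
class ReadThroughCache:
    """Hypothetical stand-in for a data access layer co-located with compute."""
    backing: RemoteStore
    cached: dict = field(default_factory=dict)

    def read(self, key):
        # Fetch from the remote store only on a miss, then serve locally.
        if key not in self.cached:
            self.cached[key] = self.backing.read(key)
        return self.cached[key]


def run_iterations(layer, keys, iterations):
    """Simulate repeated passes over the same data; returns total bytes read."""
    total = 0
    for _ in range(iterations):
        for key in keys:
            total += len(layer.read(key))
    return total


store = RemoteStore(blocks={"part-0": b"abc", "part-1": b"defgh"})
layer = ReadThroughCache(backing=store)

total_bytes = run_iterations(layer, ["part-0", "part-1"], iterations=5)
remote_reads = store.reads  # only the first pass touches the remote store
```

In this toy run, five passes over two blocks trigger only two remote reads; every later pass is served from the local copy. A real deployment also has to deal with cache capacity, eviction, and data consistency, which this sketch leaves out.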
We had a great time interacting with the audience on the East Coast, and we look forward to the next NYC event!
If you are interested in hosting or presenting at a future event, please contact us at email@example.com.