This webinar shows how to set up EMR Spark and Hive with Alluxio to seamlessly read from and write to your S3 data lake, and covers the resulting performance benefits.
This webinar gives a quick overview of Alluxio and the use cases it powers for Spark/Presto in Kubernetes, and shows how to set it up to run in Kubernetes.
We will introduce the key new features and enhancements, such as support for hyper-scale data workloads, support for machine learning and deep learning workloads, and better storage abstraction.
This webinar highlights a simple solution: running Spark on Alluxio as a distributed cache for S3. Alluxio stores data in memory close to Spark, providing high performance as well as data accessibility and abstraction for deployments in both public and hybrid clouds.
This webinar reviews: observations and analysis of the trend toward separating storage and compute in the big data ecosystem; why and how to build a new data access layer between compute and storage in this data stack; Alluxio open source: history, overview, design, and architecture; production use cases with Spark, Presto, TensorFlow, and others; and a demo of running Presto on Alluxio on S3.
Joint webinar – Mesosphere DC/OS is a production-proven platform that powers both modern app components – containers and data services – so businesses can accelerate time to market with confidence, and save. We have seen tremendous interest from users in running Alluxio via DC/OS.