
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3

Alluxio Community Office Hour * August 27, 2019

Learn how to set up EMR Spark with Alluxio so Spark jobs can seamlessly read from and write to S3. See a performance comparison between Spark reading from S3 directly and Spark with Alluxio on S3.

Community Office Hour: Running Spark & Alluxio in Kubernetes

June 25, 2019 by Bin Fan & Adit Madan

A data orchestration layer bridges the gap between data locality, which improves performance, and data accessibility for analytics workloads in Kubernetes, and it enables portability across storage providers.
An overview of Alluxio and the cloud use case with Spark in Kubernetes. Learn how to set up Alluxio and Spark to run in Kubernetes.

Tags: analytics, apache spark, compute, compute storage separation, data, data orchestration, hybrid cloud, kubernetes, locality, multi cloud, office hour, spark, storage

Running Spark & Alluxio in Kubernetes

Alluxio Community Office Hour * June 25, 2019

The latest advances in container orchestration by Kubernetes bring cost savings and flexibility to compute workloads in public or hybrid cloud environments. On the other hand, they introduce new challenges, such as how to move data to compute efficiently, how to unify data across multiple or remote clouds, and how to co-locate data with compute. Alluxio approaches these problems in a new way. It helps elastic compute workloads realize the true benefits of the cloud, while bringing data locality and data accessibility to workloads orchestrated by Kubernetes.

Running Presto with Alluxio on Amazon EMR

Alluxio Community Office Hour * May 21, 2019

Many organizations use EMR to run big data analytics in the public cloud. However, reading and writing data to S3 directly can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud, and in this use case it caches S3 data, providing high and predictable performance as well as reduced network traffic.

Alluxio for Hybrid Cloud | HDFS and AWS S3 demo

Alluxio Community Office Hour * April 30, 2019

Alluxio can help data scientists and data engineers interact with different storage systems in a hybrid cloud environment. Using Alluxio as a data access layer for Big Data and Machine Learning applications, data processing pipelines can improve efficiency without explicit data ETL steps and the resulting data duplication across storage systems.
