Learn how to set up EMR Spark with Alluxio so Spark jobs can seamlessly read from and write to S3. See the performance comparison between Spark on S3 with Spark, and Alluxio on S3.
Alluxio’s first cloud, data & orchestration Austin meetup featuring talks and demos on efficient data engineering with Apache Spark, Hive and Alluxio on S3.
Background Today’s advanced analytics applications run on more datasets that ever before. The locations of where data “lands” is becoming more dispersed. And the separation of compute and storage in modern environments lends well to running on these distributed datasets. Data can be stored in a remote location from the compute, such as in a … Continued
This webinar gives a quick overview of Alluxio and the use cases it powers for Spark/Presto in Kubernetes and how to set up to run in Kubernetes.
Alluxio is an open-source data orchestration system widely used to speed up data-intensive workloads in the cloud. Alluxio v2.0 introduced Replicated Async Write to allow users to complete writes to Alluxio file system and return quickly with high application performance, while still providing users with peace of mind that data will be persisted to the chosen under storage like S3 in the background.
Joint meetup in Hangzhou discusses: An introduction to new features of big data storage system Alluxio and optimization of cache performance, Practice & exploration of Spark & Alluxio, and the Interactive query system Impala.
Welcome to the first event of the Cloud, Data, & Orchestration Austin Meetup! This meetup will feature two talks and an opportunity to engage with other data engineers, developers, and Alluxio users. Thanks to Bazaarvoice for hosting!
We will introduce the key new features and enhancements such as: Support for hyper-scale data workloads, Machine learning and deep learning workloads, and Better storage abstraction.
Alluxio is a proud sponsor and exhibitor at the AWS Summit in New York. If you weren’t able to attend, here are the highlights