Accelerate Spark workloads on S3

Alluxio Tech Talk

Register for this tech talk to learn how to run Spark on EMR with Alluxio as a distributed file system cache for S3.

How do you run TensorFlow on a remote storage system?

Problem: It is becoming increasingly popular among data scientists to train models with frameworks like TensorFlow on a local server or cluster while using remote shared storage such as S3 or Google Cloud Storage to hold massive amounts of input data. This stack provides high flexibility and cost efficiency, especially requires no dev-ops …

Recap: Spark+AI Summit 2019

Alluxio is a proud sponsor and exhibitor of Spark+AI Summit in San Francisco.
What’s Spark+AI Summit? It’s the world’s largest conference focused on Apache Spark, Alluxio’s older cousin and an open source project from the same lab (UC Berkeley’s AMPLab, now RISELab).

Alluxio for Hybrid Cloud | HDFS and AWS S3 demo

Alluxio Community Office Hour

Alluxio can help data scientists and data engineers interact with different storage systems in a hybrid cloud environment. Using Alluxio as a data access layer for Big Data and Machine Learning applications, data processing pipelines can improve efficiency without explicit data ETL steps and the resulting data duplication across storage systems.

Spark+AI Summit SF 2019

SAIS 2019

What’s Spark+AI Summit? It’s the world’s largest conference focused on Apache Spark, Alluxio’s older cousin and an open source project from the same lab (UC Berkeley’s AMPLab, now RISELab).

Introduction to Alluxio 2.0 Preview

Alluxio Tech Talk

Alluxio 2.0 is the most ambitious platform upgrade since the inception of Alluxio, with greatly expanded capabilities that let users run analytics and AI workloads on private, public, or hybrid cloud infrastructures and use valuable data wherever it is stored. This preview release, now available for download, includes many advancements that will allow users to push the limits of their data workloads in the cloud.

Interactive Big Data Analytics with the Presto + Alluxio stack for the Cloud

Alluxio Tech Talk

In this tech talk, we will introduce the Starburst Presto, Alluxio, and cloud object store stack for building a highly concurrent, low-latency analytics platform. This stack provides a strong solution for running fast SQL across multiple storage systems, including HDFS, S3, and others, in public cloud, hybrid cloud, and multi-cloud environments.