storage Archives | Page 4 of 16

Speeding up I/O for Machine Learning ft Apple Case Study using TensorFlow, NFS, DC OS, & Alluxio

January 15, 2020

This talk will guide the audience on how Alluxio can greatly simplify the data preparation phase in with remote and possibly multiple data sources. We will share the lessons and benchmark from Bill Zhao an engineer led in Apple when building a Machine Learning platform using Tensorflow, NFS, DC/OS and Alluxio.

Tags: dc/os, machine learning, NFS, POSIX, storage, tensorflow

Alluxio Structured Data Management: Optimizing Structured Data with Alluxio

Alluxio Community Office Hour * January 14, 2020

In this office hour, we will go over an introduction and motivation of Alluxio Structured Data Management, an overview of the different services in Alluxio 2.1, and a demo using Alluxio Structured Data Management with Presto.

Improving Memory Utilization of Spark Jobs Using Alluxio

Alluxio Community Office Hour * November 26, 2019

This office hour shares a demo and compares two approaches, caching data directly in-memory into the Spark JVM versus storing data off-heap via an in-memory storage service like Alluxio

Improving Spark Memory Resource with Off-Heap In-Memory Storage

November 1, 2019 By Bin Fan and Adit Madan

In the previous tutorial ”Getting Started with Spark Caching using Alluxio in 5 Minutes”, we demonstrated how to get started with Spark and Alluxio. To share more thoughts and experiments on how Alluxio enhances Spark workloads, this article focuses on how Alluxio helps to optimize the memory utilization of Spark applications. For users who are … Continued

Alluxio – Data Orchestration for Analytics and AI in the Cloud

October 9, 2019

In this talk, we present: trends and challenges in the data ecosystem in cloud era; Data engineering in the cloud with data orchestration; Use cases of using tech stacks (Presto or Tensorflow) with Alluxio on S3.

Tags: aws s3, big data, cloud, data orchestration, hdfs, meetup, presto, spark, storage, tensorflow

Running Alluxio On HashiCorp Nomad

August 20, 2019 By Naveen K T

I recently worked on a PoC evaluating Nomad for a client. Since there were certain constraints limiting what was possible on the client environment, I put together something “quick” on my personal workstation to see what was required for Alluxio to play nice with Nomad.

Tag: storage