tensorflow Archives

Speeding up TensorFlow and PyTorch with Alluxio

September 9, 2021

The Alluxio core engineering team redesigned how users leverage data orchestration through the POSIX interface, making it more efficient and transparent. This enables much better performance for ML workloads that access data via the POSIX interface.

Tags: data orchestration, fuse, ml, performance, POSIX, pytorch, tensorflow

Efficient Model Training in the Cloud with Kubernetes, TensorFlow, and Alluxio

May 22, 2020 By Rong Gu (Nanjing University) and Yang Che (Alibaba)

A collaboration among Alibaba, Alluxio, and Nanjing University to tackle the problems of deep learning model training in the cloud. Our goal was to reduce the cost and complexity of data access for deep learning training in a hybrid environment; the work resulted in a reduction of over 40% in training time and cost.

Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel’s Analytics Zoo open source platform powered by oneAPI

April 27, 2020 By Bin Fan

This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. It covers the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results.

Speeding up I/O for Machine Learning ft Apple Case Study using TensorFlow, NFS, DC/OS, & Alluxio

January 15, 2020

This talk shows how Alluxio can greatly simplify the data preparation phase when working with remote and possibly multiple data sources. We will share the lessons and benchmarks from Bill Zhao, an engineering lead at Apple, in building a machine learning platform using TensorFlow, NFS, DC/OS, and Alluxio.

Tags: dc/os, machine learning, NFS, POSIX, storage, tensorflow

Speeding Up I/O for Machine Learning

Alluxio Global Online Meetup * January 15, 2020

tf.data: TensorFlow Input Pipeline

November 12, 2019

This talk gives an overview of the tf.data project and highlights best practices for creating performant input pipelines.

Tags: conference, data orchestration, data orchestration summit, tensorflow

Alluxio – Data Orchestration for Analytics and AI in the Cloud

October 9, 2019

In this talk, we present trends and challenges in the data ecosystem in the cloud era; data engineering in the cloud with data orchestration; and use cases of running tech stacks (Presto or TensorFlow) with Alluxio on S3.

Tags: aws s3, big data, cloud, data orchestration, hdfs, meetup, presto, spark, storage, tensorflow

Tech Talk: Accelerate and Scale Big Data Analytics with Disaggregated Compute and Storage

July 17, 2019

The ever-increasing challenge of processing and extracting value from exploding data volumes with AI and analytics workloads makes a memory-centric architecture with disaggregated storage and compute more attractive. This decoupled architecture enables users to innovate faster and scale on demand. Enterprises are also increasingly looking to object stores to power their big data and machine learning workloads cost-effectively. However, object stores provide neither big-data-compatible APIs nor the required performance.

In this webinar, the Intel and Alluxio teams will present a proposed reference architecture that uses Alluxio as the in-memory accelerator for object stores to enable modern analytical workloads such as Spark, Presto, TensorFlow, and Hive. We will also present a technical overview of Alluxio.

Tags: big data, compute storage separation, hive, intel, object stores, spark, tech talk, tensorflow

How do you run TensorFlow on a remote storage system?

Problem: It is becoming increasingly popular among data scientists to train models with frameworks like TensorFlow on a local server or cluster while using remote shared storage such as S3 or Google Cloud Storage to hold massive amounts of input data. This stack provides high flexibility and cost efficiency, especially requires no dev-ops … Continued

Tag: tensorflow