What can I do to speed up analytics performance on remote data?

Background Today’s advanced analytics applications run on more datasets that ever before. The locations of where data “lands” is becoming more dispersed. And the separation of compute and storage in modern environments lends well to running on these distributed datasets. Data can be stored in a remote location from the compute, such as in a … Continued

O’Reilly AI Conference Keynote: Data Orchestration for AI, Big Data, and Cloud

Haoyuan Li’s keynote at O’Reilly Beijing discusses open source data orchestration and the value of leveraging Alluxio with rising trends driving the need for a new architecture. Four big trends driving this need: Separation of compute & storage, hybrid-multi cloud environments, rise of object store and self-service data across the enterprise.

Tags: , , , , , , , , , , ,

Running Spark & Alluxio in Kubernetes

The data orchestration layer bridging the gap between data locality with improved performance and data accessibility for analytics workloads in Kubernetes, and enables portability across storage providers.
An overview of Alluxio and the cloud use case with Spark in Kubernetes. Learn how to set up Alluxio and Spark to run in Kubernetes.

Tags: , , , , , , , , , , , ,

Alluxio at Beijing Meetup

Haoyuan Li presents at Beijing Meetup on open source data orchestration and the value of leveraging Alluxio with rising trends driving the need for a new architecture. Four big trends driving this need: Separation of compute & storage, hybrid-multi cloud environments, rise of object store and self-service data across the enterprise.

Tags: , , , , , , , , ,

RocksDB Meetup at Twitter

Bay Area Meetup *

Twitter SF is hosting 2019’s half yearly RocksDB Meetup with speakers from Twitter, Facebook and the community on July 11th.

Enabling Big Data and AI workloads on the Object Store at DBS Bank

Strata Data Conference New York *

In this presentation, Vitaliy Baklikov from DBS Bank and Dipti Borkar from Alluxio will share how DBS Bank has built a modern big data analytics stack leveraging an object store as persistent storage even for data-intensive workloads and how it uses Alluxio to orchestrate data locality and data access for Spark workloads. In addition, deploying Alluxio to access data, solves many challenges that cloud deployments bring with separated compute and storage.

Unified Big Data Analytics – Any stack, Any Cloud

Boston Meetup *

This presentation focuses on how Alluxio helps the big data analytics stack to be cloud-native. The trending Cloud object storage systems provide more cost-effective and scalable storage solutions but also different semantics and performance implications compared to HDFS. Applications like Spark or Presto will not benefit from the node-level locality or cross-job caching when retrieving data from the cloud object storage. Deploying Alluxio to access cloud solves these problems because data will be retrieved and cached in Alluxio instead of the underlying cloud or object storage repeatedly.