Guardant Health: Fast, scalable, data processing with Alluxio, Mesos, and Minio

Alluxio and Mesos Joint Meetup *

Speed is usually a key factor when analyzing large amounts of data. Alluxio enables analytics applications, such as Apache Spark, to retrieve stored data at memory speeds. DC/OS makes it easy to deploy distributed programs (such as Alluxio and Spark) and containers across large clusters.
In this talk, we will first discuss the development of the DC/OS Alluxio package, which deploys Alluxio on top of DC/OS, and then demo the deployment of a complete analytics stack, both with and without Alluxio, in order to see the benefits Alluxio provides.

Alluxio Exploration And Application Practice Meetup

Beijing Meetup *

For this session, the Drip Technology Salon and the Alluxio community invited core engineers from Didi Travel, Alluxio, Kyligence, JD.com, and Tencent to speak about Alluxio’s position and design philosophy in the big data ecosystem, its architectural features and latest developments, how well-known companies have explored and applied it in production environments, and the experience they gained along the way, and to share these topics in depth with participants.

Accelerating Spark Workloads in a Mesos Environment with Alluxio

MesosCon Europe 2017 *

Alluxio, a memory-speed virtual distributed storage system, can be deployed on Mesos to connect any compute framework, such as Apache Spark, to storage systems via a unified namespace. Alluxio enables applications to interact with any data at memory speed. It can eliminate the pains of ETL and data duplication, and enable new workloads across all data. Gene will discuss how Mesos, Spark, and Alluxio fit together to achieve an optimal architecture for enterprises.

Best Practices For Using Apache Spark With Alluxio

Spark Summit Europe 2017 *

Many organizations and deployments use Alluxio with Apache Spark, and some of them scale out to petabytes of data. Alluxio can enable Spark to be even more effective, in both on-premise deployments and public cloud deployments. Alluxio bridges Spark applications with various storage systems and further accelerates data-intensive applications. In this talk, we briefly introduce Alluxio and present different ways in which Alluxio can help Spark jobs. We discuss best practices for using Alluxio with Spark, including RDDs and DataFrames, as well as on-premise deployments and public cloud deployments.

Apache Spark Pipelines in the Cloud with Alluxio

Spark Summit Europe 2017 *

In this talk, we discuss how Alluxio can be deployed and used with a Spark data processing pipeline in the cloud. We show how pipeline stages can share data through Alluxio memory for improved performance, and how Alluxio can improve completion times and reduce performance variability for Spark pipelines in the cloud.

Using Alluxio (formerly Tachyon) as a fault-tolerant pluggable optimization component to compute frameworks of JD system

Strata London *

Alluxio has run in JD.com’s production environment on 100 nodes for six months. Mao Baolong, Yiran Wu, and Yupeng Fu explain how JD.com uses Alluxio to provide support for ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and Alluxio as a pluggable optimization component. To give just one example, one framework, JDPresto, has seen a 10x performance improvement on average. This work has also extended Alluxio and enhanced the syncing between Alluxio and HDFS for consistency.

Beijing Meetup: Talks from Sogou, Qiniu, JD.com & Alluxio

Beijing Meetup *

The future is the era of data, and an abstraction for efficient management, storage, and access to data is undoubtedly the cornerstone of this era. Alluxio, an open source distributed virtual data system, is dedicated to providing a simple and efficient data abstraction, convenient data sharing, and high-speed I/O for big data, machine learning, and artificial intelligence, while keeping applications and data persistent and offering a rich selection of storage systems. Over several years of development, Alluxio has grown from a research prototype involving only a few Ph.D. students and researchers in the AMPLab at the University of California, Berkeley, into a project with more than 800 code contributors (as of the Alluxio 1.8 release). It is deployed in production at Tencent, Baidu, JD, Two-Sigma, Barclays Bank, and hundreds of other industry leaders in China and abroad, where it has become an important part of the data platform and data infrastructure.

Shanghai Meetup: Talks from Ctrip, Qiniu, Intel & Alluxio

Shanghai Meetup *

The future is the era of data, and an abstraction for efficient management, storage, and access to data is undoubtedly the cornerstone of this era. Alluxio, an open source distributed virtual data system, is dedicated to providing a simple and efficient data abstraction, convenient data sharing, and high-speed I/O for big data, machine learning, and artificial intelligence, while keeping applications and data persistent and offering a rich selection of storage systems.
Over several years of development, Alluxio has grown from a research prototype involving only a few Ph.D. students and researchers in the AMPLab at the University of California, Berkeley, into a project with more than 800 code contributors (as of the Alluxio 1.8 release). It is deployed in production at Tencent, Baidu, JD, Two-Sigma, Barclays Bank, and hundreds of other industry leaders in China and abroad, where it has become an important part of the data platform and data infrastructure.