Unified Namespace and Tiered Storage in Alluxio

Strata+Hadoop World San Jose *

Calvin Jia and Jiri Simsa explain how the current Alluxio tiered storage can be easily configured to use memory, SSDs, and hard drives in different tiers. Alluxio users and administrators do not have to manually migrate the data because data in Alluxio is managed transparently between all the configured tiers, similar to the way the CPU manages L1, L2, and lower-level caches. Meanwhile, Alluxio also provides users fine-grained control of manipulating data to plug in their own data-management strategies; users can also pin files in Alluxio to a specific storage or specify a TTL to files. Calvin and Jiri also describe the interface for managing heterogeneous data sources into the Alluxio namespace, which takes advantage of Alluxio’s ability to interoperate with different underlying storage systems such as HDFS, S3, GlusterFS, or Swift.

Fast big data analytics and machine learning using Alluxio and Spark in Baidu

Strata+Hadoop World San Jose *

A few months ago, Baidu deployed Alluxio to accelerate its big data analytics workload. Bin Fan and Haojun Wang explain why Baidu chose Alluxio, as well as the details of how they achieved a 30x speedup with Alluxio in their production environment with hundreds of machines. Based on the success of the big data analytics engine, Baidu is currently expanding the Alluxio and Spark infrastructure to accelerate other applications, such as machine learning.

Data Driven #46 (a FirstMark Event)

Data Driven NYC *

Check out our new blog post: “Internet of Things: Are We There Yet? (The 2016 IoT Landscape)”: The Internet of Things is all about data!

Alluxio (formerly Tachyon): Open Source Memory Speed Virtual Distributed Storage System

Data by the Bay San Francisco *

The goal is to make Alluxio accessible to an even wider set of users through a focus on security, new language bindings, and further increased stability. In addition, the team is working on new APIs to allow applications to access data more efficiently and manage data across different under storage systems.

Alluxio: Solving the Framework-Storage Gap in Big Data

DSI Conference San Mateo *

In this talk, Haoyuan Li, co-creator of Tachyon (and a founding committer of Spark) and CEO of Tachyon Nexus will explain how the next wave of innovation in storage will be driven by separating the functional layer from the persistent storage layer, and how memory-centric architecture through Tachyon is making this possible. Li will describe the future of distributed file storage and highlight how Tachyon supports specific use cases.

Alluxio (formerly Tachyon): The journey thus far and the road ahead

Strata+Hadoop World New York *

The goal is to make Alluxio accessible to an even wider set of users through a focus on security, new language bindings, and further increased stability. In addition, the team is working on new APIs to allow applications to access data more efficiently and manage data across different under storage systems.

Modern Software Architectures and Data Pipelines

Scalæ By The Bay San Francisco *

Throughout our four-year history, Scala and Scale By the Bay is leading the way on evangelizing and understansing modern software architectures. We have the best set of them here, including Akka, Kafka, Spark, Finagle, Lagom, and so on. How do they come together in a SMACK / MIND Stack? What are the best practices to follow and pitfalls to avoid? This panels of experienced practitioners will discuss and illuminate it all.

How Alluxio (formerly Tachyon) brings a 300x performance improvement to Qunar’s streaming processing

Strata+Hadoop World Singapore *

Alluxio is the first memory-speed virtual distributed storage system in the world. It unifies the interface between the various computing frameworks and under storages. Data access can be several magnitude faster because of Alluxio’s memory-centric architecture. In addition, Alluxio’s tiered storage, unified namespace, flexible file API, web UI, and command-line tools increase the usability in different application scenarios.
Qunar has been running Alluxio in production for over a year. Lei Xu explores how stream processing on Alluxio has led to a 16x performance improvement on average and 300x improvement at service peak time on workloads at Qunar.