Join us for our 8th Alluxio Day virtual community event featuring speakers from Apache Iceberg and WeRide.
Alluxio meetups, conferences, events and more
The latest Alluxio meetups, webinars, conferences and more
During the past several years, Spark has significantly changed the landscape of big data computing. It improves performance of various applications dramatically. However, in certain Spark use cases, the bottleneck is in the I/O stack. In this talk, we will introduce Tachyon, a distributed memory-centric storage system. In addition, we will talk about several production use cases where Tachyon further improves Spark applications’ performance by orders of magnitude.
In the presentation, we will explore several potential industry use cases enabled by the new features. One-click cluster deployment lets users experiment and prototype with Tachyon on AWS, launching not only Tachyon but also the computation framework and storage system of their choice. Mounting multiple under storage systems and transparent naming enable more exciting use cases for Tachyon users.
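To make the idea of mounting several under storage systems into one namespace concrete, here is a minimal sketch. It is not Tachyon's implementation: the `MountTable` class, its methods, and the host and bucket names are all hypothetical, and the longest-prefix lookup is only an assumption about how such a table could resolve paths.

```python
class MountTable:
    """Toy mount table: maps namespace paths to under-storage URIs (hypothetical)."""

    def __init__(self):
        # namespace path prefix -> under-storage base URI
        self.mounts = {}

    def mount(self, path, ufs_uri):
        # Normalize trailing slashes so "/logs/" and "/logs" are the same mount point.
        self.mounts[path.rstrip("/") or "/"] = ufs_uri.rstrip("/")

    def resolve(self, path):
        # Pick the longest mount point that is a whole-component prefix of the path.
        best = None
        for prefix in self.mounts:
            is_match = path == prefix or path.startswith(prefix.rstrip("/") + "/")
            if is_match and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no mount point covers {path}")
        rest = path[len(best):].lstrip("/")
        base = self.mounts[best]
        return f"{base}/{rest}" if rest else base


# Example with made-up storage locations.
table = MountTable()
table.mount("/", "hdfs://namenode:9000/data")
table.mount("/logs", "s3://example-bucket/logs")
print(table.resolve("/logs/2016/a.txt"))
print(table.resolve("/tables/t1"))
```

An application addresses only the namespace path; which storage system actually holds the bytes is decided by the table, which is the sense in which naming is transparent.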
Check out our new blog post: “Internet of Things: Are We There Yet? (The 2016 IoT Landscape)”: The Internet of Things is all about data!
A few months ago, Baidu deployed Alluxio to accelerate its big data analytics workload. Bin Fan and Haojun Wang explain why Baidu chose Alluxio, as well as the details of how they achieved a 30x speedup with Alluxio in their production environment with hundreds of machines. Based on the success of the big data analytics engine, Baidu is currently expanding the Alluxio and Spark infrastructure to accelerate other applications, such as machine learning.
Calvin Jia and Jiri Simsa explain how the current Alluxio tiered storage can be easily configured to use memory, SSDs, and hard drives in different tiers. Alluxio users and administrators do not have to migrate data manually, because Alluxio manages data transparently across all the configured tiers, similar to the way a CPU manages L1, L2, and lower-level caches. Alluxio also gives users fine-grained control over data placement: they can plug in their own data-management strategies, pin files in Alluxio to a specific storage tier, or specify a TTL for files. Calvin and Jiri also describe the interface for bringing heterogeneous data sources into the Alluxio namespace, which takes advantage of Alluxio’s ability to interoperate with different underlying storage systems such as HDFS, S3, GlusterFS, or Swift.
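The cache analogy can be illustrated with a small sketch. This is a toy model, not Alluxio's code or API: the `TieredStore` class and its methods are hypothetical, capacities are counted in blocks rather than bytes, eviction is assumed to be least-recently-used, and TTLs are left out for brevity.

```python
from collections import OrderedDict


class TieredStore:
    """Toy tiered store: tier 0 is the fastest (e.g. memory), the last the slowest."""

    def __init__(self, capacities):
        # capacities: maximum number of blocks per tier, fastest tier first
        self.capacities = list(capacities)
        self.tiers = [OrderedDict() for _ in self.capacities]
        self.pinned = set()

    def put(self, key, value):
        self.delete(key)
        self._insert(0, key, value)

    def get(self, key):
        for tier in self.tiers:
            if key in tier:
                value = tier.pop(key)
                self._insert(0, key, value)  # promote to the top tier on access
                return value
        return None

    def pin(self, key):
        # Pinned keys are never chosen as eviction victims.
        self.pinned.add(key)

    def delete(self, key):
        for tier in self.tiers:
            tier.pop(key, None)

    def tier_of(self, key):
        for level, tier in enumerate(self.tiers):
            if key in tier:
                return level
        return None

    def _insert(self, level, key, value):
        if level >= len(self.tiers):
            return  # dropped from the last tier in this toy model
        tier = self.tiers[level]
        tier[key] = value
        # Demote least-recently-used, unpinned blocks until the tier fits.
        while len(tier) > self.capacities[level]:
            victim = next((k for k in tier if k not in self.pinned), None)
            if victim is None:
                break  # everything is pinned; the toy simply allows overflow
            self._insert(level + 1, victim, tier.pop(victim))


store = TieredStore([2, 2])  # two blocks in the fast tier, two in the slow tier
for name in ("a", "b", "c"):
    store.put(name, name.upper())
print(store.tier_of("a"))  # "a" was demoted when "c" arrived
```

The caller never moves blocks between tiers explicitly; demotion and promotion happen as side effects of `put` and `get`, which is the behavior the cache comparison describes.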
The big data ecosystem is moving with massive energy: customers from healthcare, retail, transportation, and other fields are benefiting significantly from the business insights derived. As data growth continues, storage technologies and distributed memory systems are becoming even more important for real-time decision making and insight discovery. Intel is excited to work with developer communities on Alluxio and to optimize Alluxio solutions on Intel platforms. In this talk, Ziya will discuss Intel’s optimization work in this area, its open source contributions, and industry use cases.
In this talk, Haoyuan Li, co-creator of Tachyon (and a founding committer of Spark) and CEO of Tachyon Nexus, will explain how the next wave of innovation in storage will be driven by separating the functional layer from the persistent storage layer, and how memory-centric architecture through Tachyon is making this possible. Li will describe the future of distributed file storage and highlight how Tachyon supports specific use cases.
The goal is to make Alluxio accessible to an even wider set of users through a focus on security, new language bindings, and further increased stability. In addition, the team is working on new APIs to allow applications to access data more efficiently and manage data across different under storage systems.
Through active community development over the past year, Alluxio has greatly improved its read and write performance, scalability, and user experience. In terms of functionality, Alluxio has also added a number of new features, such as scalable tiered storage, transparent UFS data reading and writing, unified namespaces, and more. These features bring more value to Alluxio users and make cluster storage management more efficient and convenient.