This Alluxio Meetup features a chance to interact with other Alluxio users and developers, as well as three talks. Thanks to our joint host Data Council!
Today, when we create a Hive table, it is common to partition the table by different values and ranges to improve query performance and reduce maintenance cost. However, Hive cannot directly access, with a single query, a table whose data is spread across different storage media and …
Alluxio is a proud sponsor and exhibitor of Spark+AI Summit in San Francisco.
What’s Spark+AI Summit? It’s the world’s largest conference focused on Apache Spark, Alluxio’s older cousin: an open source project from the same lab (UC Berkeley’s AMPLab, now RISELab).
Enterprises are increasingly looking towards object stores to power their big data and machine learning workloads in a cost-effective way. The combination of SwiftStack and Alluxio enables users to move seamlessly towards a disaggregated architecture.
During the past several years, Spark has significantly changed the landscape of big data computing. It improves performance of various applications dramatically. However, in certain Spark use cases, the bottleneck is in the I/O stack. In this talk, we will introduce Tachyon, a distributed memory-centric storage system. In addition, we will talk about several production use cases where Tachyon further improves Spark applications’ performance by orders of magnitude.
A few months ago, Baidu deployed Alluxio to accelerate its big data analytics workload. Bin Fan and Haojun Wang explain why Baidu chose Alluxio, as well as the details of how they achieved a 30x speedup with Alluxio in their production environment with hundreds of machines. Based on the success of the big data analytics engine, Baidu is currently expanding the Alluxio and Spark infrastructure to accelerate other applications, such as machine learning.
The big data ecosystem is moving with massive energy, and customers from healthcare, retail, transportation, and other fields are benefiting significantly from the business insights derived. As data growth continues, storage technologies and distributed memory systems are becoming even more important for real-time decision making and insight discovery. Intel is excited to work with developer communities on Alluxio and to optimize Alluxio solutions on Intel platforms. In this talk, Ziya will discuss Intel’s optimization work in the area, open source contributions, and industry use cases.
The goal is to make Alluxio accessible to an even wider set of users through a focus on security, new language bindings, and further increased stability. In addition, the team is working on new APIs to allow applications to access data more efficiently and manage data across different under storage systems.
Through active community development over the past year, Alluxio has greatly improved its read and write performance, scalability, and user experience. In terms of functionality, Alluxio has also added a number of new features, such as scalable tiered storage, transparent UFS data reading and writing, unified namespaces, and more. These features bring more value to Alluxio users and make cluster storage management more efficient and convenient.