Alluxio is a proud sponsor and exhibitor at the AWS Summit in New York. If you weren’t able to attend, here are the highlights.
Haoyuan Li’s keynote at O’Reilly Beijing discusses open source data orchestration and the value of Alluxio as rising trends drive the need for a new architecture. Four big trends drive this need: the separation of compute and storage, hybrid and multi-cloud environments, the rise of object stores, and self-service data across the enterprise.
In this presentation, Vitaliy Baklikov from DBS Bank and Dipti Borkar from Alluxio will share how DBS Bank has built a modern big data analytics stack that uses an object store as persistent storage, even for data-intensive workloads, and how it uses Alluxio to orchestrate data locality and data access for Spark workloads. In addition, deploying Alluxio to access data solves many of the challenges that cloud deployments bring with separated compute and storage.
A new generation of open source big data technology, represented by Alluxio, which was born at the University of California, Berkeley, addresses this issue. Unlike systems such as HDFS, which rely on a tightly coupled storage design to achieve low-cost, reliable storage, Alluxio provides data applications with a virtual data storage layer that is defined and implemented in software. It abstracts and integrates the underlying files and objects across multi-cloud, hybrid cloud, multi-data-center, and other environments. Through intelligent workload analysis and data management, it moves data close to compute and provides data locality, so that big data and machine learning applications can achieve the same performance at lower cost.
In this talk, we will focus on Alluxio’s design: its architecture, data flow, and metadata flow. We will dive into the choices in its design space and share our experience implementing features such as data tiering, storage options, and cache eviction policies. We will also share the lessons in design, implementation, and operation that we learned while building an open source distributed storage system with 900 contributors over 5+ years.
What’s Spark+AI Summit? It’s the world’s largest conference focused on Apache Spark, Alluxio’s older cousin and a fellow open source project from the same lab (UC Berkeley’s AMPLab, now RISELab).
Open source software has always played a critical role in software development. From the Linux kernel to TensorFlow, it drives many outstanding projects that have set trends and steered the direction of technology.
We are pleased to have several experts, including Reynold Xin, Dongxu Huang, Qing Han, Bin Fan, and Amelia Wong, who will share the technology and stories behind their successful open source projects.
During the past several years, Spark has significantly changed the landscape of big data computing. It improves performance of various applications dramatically. However, in certain Spark use cases, the bottleneck is in the I/O stack. In this talk, we will introduce Tachyon, a distributed memory-centric storage system. In addition, we will talk about several production use cases where Tachyon further improves Spark applications’ performance by orders of magnitude.
Check out our new blog post, “Internet of Things: Are We There Yet? (The 2016 IoT Landscape)”. The Internet of Things is all about data!