Past, Present and Future of Alluxio [Chinese]

Shanghai Meetup *

The Alluxio project has greatly improved system performance, scalability, and user experience, and has added a series of new features, including scalable tiered storage, transparent UFS data reading and writing, unified namespaces, and more, all of which make Alluxio easier to use. At the same time, the Alluxio ecosystem has expanded to support different storage systems and computing frameworks. Alluxio now supports a variety of storage systems, including Amazon S3, Google Cloud Storage, Gluster, Ceph, HDFS, NFS, and OpenStack Swift, as well as big data processing frameworks such as Spark, MapReduce, Flink, and more. These integrations allow Alluxio to manage a growing volume of increasingly complex data.

Alluxio (formerly Tachyon): The journey thus far and the road ahead

Strata+Hadoop World New York *

The goal is to make Alluxio accessible to an even wider set of users through a focus on security, new language bindings, and further increased stability. In addition, the team is working on new APIs to allow applications to access data more efficiently and manage data across different under storage systems.

Alluxio: Unifying APIs, Accelerating ML, & Enabling Cloud Architectures

Bay Area Meetup *

Using intermediate APIs means developers can learn just one framework and still access features offered by different technologies. It means writing job logic only once and being able to test it easily on a new underlying service with no effort. Not only is modularity a win for users but it means creators of execution frameworks and storage systems can focus on performance and capability without having to worry about API maintenance.
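The "write job logic once" idea can be sketched in a few lines. This is a minimal illustration, not Alluxio's API: the `Storage` interface, the two backends, and the `word_count` job are all hypothetical names chosen for the example.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class Storage(ABC):
    """Hypothetical intermediate API that job logic is written against."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...


class InMemoryStorage(Storage):
    """Backend that keeps blobs in a dict (stand-in for a fast tier)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def write(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def read(self, key: str) -> bytes:
        return self._blobs[key]


class LocalDirStorage(Storage):
    """Backend that stores each key as a file under a root directory."""

    def __init__(self, root: str) -> None:
        self._root = root

    def write(self, key: str, data: bytes) -> None:
        with open(os.path.join(self._root, key), "wb") as f:
            f.write(data)

    def read(self, key: str) -> bytes:
        with open(os.path.join(self._root, key), "rb") as f:
            return f.read()


def word_count(storage: Storage, key: str) -> int:
    """Job logic written once; it only knows the Storage interface."""
    return len(storage.read(key).decode("utf-8").split())


if __name__ == "__main__":
    # The same job runs unchanged against either backend.
    mem = InMemoryStorage()
    mem.write("doc.txt", b"one two three")
    print(word_count(mem, "doc.txt"))

    with tempfile.TemporaryDirectory() as root:
        disk = LocalDirStorage(root)
        disk.write("doc.txt", b"one two three")
        print(word_count(disk, "doc.txt"))
```

Swapping the backend requires no change to `word_count`, which is the modularity benefit the abstract describes.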

How to Use Alluxio to improve Spark and Hadoop HDFS Performance of Data Access and System Reliability [Chinese]

Database Technology Conference China 2017 *

China Unicom is one of the five largest telecom operators in the world. China Unicom’s booming business in 4G and 5G networks has to serve an exploding base of hundreds of millions of smartphone users. This unprecedented growth brought enormous challenges and new requirements to the data processing infrastructure. The previous generation of its data processing system was based on IBM midrange computers, Oracle databases, and EMC storage devices. This architecture could not scale to process the amounts of data generated by the rapidly expanding number of mobile users. Even after deploying Hadoop and the Greenplum database, it was still difficult to cover critical business scenarios with their varying massive data processing requirements. The complicated architecture of its incumbent computing platform also created many new challenges for using resources effectively.

Accelerating Spark Workloads in an Apache Mesos Environment with Alluxio

MesosCon North America 2017 *

Deploying Alluxio, an open-source, memory-speed virtual distributed storage system, on Mesos makes it possible to connect any compute framework, such as Apache Spark, to storage systems via a unified namespace. Alluxio enables applications to interact with any data at memory speed. Alluxio can eliminate the pains of ETL and data duplication, and enable new workloads across all data. Adit will discuss how Mesos, Spark, and Alluxio fit together to achieve an optimal architecture for enterprises.

Apache Kylin And Alluxio Meetup

Shanghai Meetup *

As online services and clusters grow, the HDFS NameNode becomes a performance bottleneck for the HDFS cluster, which hinders horizontal scaling of the cluster.
The community’s Federation + ViewFs solution addresses horizontal scaling of HDFS, but its configuration lives on the client side, which complicates the operation and management of large-scale clusters. Using Alluxio as a unified entry point for multiple HDFS clusters simplifies operation and maintenance and also provides distributed caching.
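The idea of one entry point in front of several HDFS clusters can be illustrated with a small mount table kept in one place rather than in every client's configuration. The sketch below is a simplified illustration under stated assumptions, not Alluxio's implementation: the `MountTable` class, the mount points, and the cluster URIs are all hypothetical.

```python
class MountTable:
    """Hypothetical mount table mapping namespace prefixes to storage URIs."""

    def __init__(self) -> None:
        self._mounts: dict[str, str] = {}

    def mount(self, mount_point: str, ufs_uri: str) -> None:
        # Normalize trailing slashes so "/logs/" and "/logs" are the same mount.
        key = mount_point.rstrip("/") or "/"
        self._mounts[key] = ufs_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        """Translate a namespace path to a storage URI by longest-prefix match."""
        best = None
        for mp in self._mounts:
            # Match only on whole path components, so "/logs" does not
            # accidentally cover "/logsX".
            covers = mp == "/" or path == mp or path.startswith(mp + "/")
            if covers and (best is None or len(mp) > len(best)):
                best = mp
        if best is None:
            raise KeyError(f"no mount point covers {path}")
        suffix = path if best == "/" else path[len(best):]
        return self._mounts[best] + suffix


if __name__ == "__main__":
    table = MountTable()
    # Hypothetical clusters; clients only ever see the /logs namespace.
    table.mount("/logs", "hdfs://cluster-a:8020/data/logs")
    table.mount("/logs/archive", "hdfs://cluster-b:8020/cold")
    print(table.resolve("/logs/2017/01.gz"))
    print(table.resolve("/logs/archive/2016.gz"))
```

Because the table lives in one place, adding or moving a cluster means changing one mount entry instead of redistributing client-side configuration.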

Accelerating Spark Workloads in a Mesos Environment with Alluxio

MesosCon Europe 2017 *

Deploying Alluxio, a memory-speed virtual distributed storage system, on Mesos makes it possible to connect any compute framework, such as Apache Spark, to storage systems via a unified namespace. Alluxio enables applications to interact with any data at memory speed. Alluxio can eliminate the pains of ETL and data duplication, and enable new workloads across all data. Gene will discuss how Mesos, Spark, and Alluxio fit together to achieve an optimal architecture for enterprises.