This webinar reviews: observations and analysis of the trend toward separating storage and compute in the Big Data ecosystem; why and how to build a new data access layer between compute and storage in this data stack; the Alluxio open source project (history, overview, design, and architecture); production use cases with Spark, Presto, TensorFlow, and others; and a demo of running Presto on Alluxio on S3.
Over the past two decades, the Big Data stack has been reshaped and has evolved quickly, with numerous innovations driven by the rise of many different open source projects and communities. In this meetup, speakers from Uber, Alibaba, and Alluxio will share best practices for addressing the challenges and opportunities in developing data architectures using new and emerging open source building blocks. Topics include data format (ORC) optimization, storage security (HDFS), data format (Parquet) layers, and unified data access (Alluxio) layers.
Learn more about how Alluxio is used in the AVA deep learning platform, in the Ctrip big data platform, and at Sogou.
MesosCon Europe 2017 – Gene Pang discusses how Mesos, Spark, and Alluxio can be combined to achieve an optimal architecture for enterprises.
Lucene/SOLR Revolution 2017 – Timothy Potter of Lucidworks introduces Alluxio, the fastest growing open source project in the big data ecosystem, and shows how to leverage it to optimize Solr performance. Basic integration scenarios and performance comparisons are also presented.
Spark Summit SF 2017 – We briefly introduce Alluxio and present different ways Alluxio can help Spark jobs, along with best practices. We also discuss how Alluxio can be deployed and used with a Spark data processing pipeline in the cloud.
Global Big Data Conference 2017 – In the past year, the Alluxio project saw significant improvements in performance and scalability and was extended with key new features, including tiered storage, transparent naming, and a unified namespace.
Calvin Jia introduces Alluxio, explains how Alluxio can help Spark be more effective, shows benchmark results with Spark RDDs and DataFrames, and describes production deployments with both Alluxio and Spark working together.
In this talk, we briefly introduce Alluxio, present several ways Alluxio can help Spark be more effective, show benchmark results with Spark RDDs and DataFrames, and describe production deployments with both Alluxio and Spark working together.