Over the years of working in the big data and machine learning space, we frequently hear from data engineers that the biggest obstacle to extracting value from data is being able to access the data efficiently. Data silos, isolated islands of data, are often viewed by data engineers as the key culprit or public enemy №1. There have been many attempts to do away with data silos, but those attempts themselves have resulted in yet another data silo, with data lakes being one such example. Rather than attempting to eliminate data silos, we believe the right approach is to embrace them.
Tag: data orchestration
Alluxio is a proud sponsor and exhibitor of Spark+AI Summit in San Francisco.
What’s Spark+AI Summit? It’s the world’s largest conference that is focused on Apache Spark – Alluxio’s older cousin open source project from the same lab (UC Berkeley’s AMPLab – now RISElab).
Alluxio is an open source software solution that connects analytics applications to heterogeneous data sources through a data orchestration layer that sits between compute and storage.
Joint webinar – Mesosphere DC/OS is a production-proven platform that powers both modern app components – containers and data services – so businesses can accelerate time to market with confidence, and save. We have seen tremendous interest from users to be able to run Alluxio via DC/OS.
Calvin Jia introduces Alluxio, explain how Alluxio can help Spark be more effective, show benchmark results with Spark RDDs and DataFrames, and describe production deployments with both Alluxio and Spark working together.
Strata+Hadoop World 2016 – Baidu deployed Alluxio to accelerate its big data analytics workload. Bin Fan and Haojun Wang explain why Baidu chose Alluxio, as well as the details of how they achieved a 30x speedup with Alluxio in their production environment with hundreds of machines. Based on the success of the big data analytics engine, Baidu is currently expanding the Alluxio and Spark infrastructure to accelerate other applications, such as machine learning.