In this meetup, Dipti and HY will present a new approach to hybrid analytical workloads using Alluxio, an open source data orchestration layer, which sits between compute and storage layer. Applications like Apache Spark or TensorFlow can then seamlessly access multiple disparate data sources with consistent performance using data locality and abstraction that the data orchestration tier brings.
This article aims to provide a different approach to help connect and make distributed files systems like HDFS or cloud storage systems look like a local file system to data processing frameworks: the Alluxio POSIX API. To explain the approach better, we used the TensorFlow + Alluxio + AWS S3 stack as an example.
Alluxio is a proud sponsor and exhibitor at the Presto Summit in San Francisco.
What’s Presto Summit? It’s the leading Presto conference co-organized by our partner Starburst Data and the Presto Software Foundation.
The data orchestration layer bridging the gap between data locality with improved performance and data accessibility for analytics workloads in Kubernetes, and enables portability across storage providers.
An overview of Alluxio and the cloud use case with Spark in Kubernetes. Learn how to set up Alluxio and Spark to run in Kubernetes.
As the data ecosystem becomes massively complex and more and more disaggregated, data analysts and end users have trouble adapting and working with hybrid environments. The proliferation of compute applications along with storage mediums leads to a hybrid model that we are just not accustomed to.
With this disaggregated system data engineers now come across a multitude of problems that they must overcome in order to get meaningful insights.
Twitter SF is hosting 2019’s half yearly RocksDB Meetup with speakers from Twitter, Facebook and the community on July 11th.
Join us June 24 in Menlo Park for our next meetup! We’ll have 3 valuable talks, a delicious BBQ dinner and amazing summertime-themed raffle prizes! This free event is sponsored by GridGain Systems and Oracle.
This Alluxio Meetup features a chance to interact with other Alluxio users and developers, as well as three talks. Thanks to our joint host Data Council!
A new generation of open source big data, represented by Alluxio, born at the University of California at Berkeley, looks at this issue. Different from systems such as designing storage tight coupling to achieve low-cost reliable storage HDFS, by providing a virtual data storage layer defined and implemented by software for data applications, abstracting and integrating cloudy, hybrid cloud, multi-data center and other environments The underlying files and objects, and through intelligent workload analysis and data management, make data close to computing and provide data locality, big data and machine learning applications can be achieved with the same performance and lower cost.