Background Today’s advanced analytics applications run on more datasets than ever before. The locations where data “lands” are becoming more dispersed. And the separation of compute and storage in modern environments lends itself well to running on these distributed datasets. Data can be stored in a remote location from the compute, such as in a … Continued
Tag: hybrid cloud
This webinar gives a quick overview of Alluxio, the use cases it powers for Spark/Presto in Kubernetes, and how to set it up to run in Kubernetes.
We will introduce the key new features and enhancements, such as support for hyper-scale data workloads, machine learning and deep learning workloads, and better storage abstraction.
Alluxio is a proud sponsor and exhibitor at the AWS Summit in New York. If you weren’t able to attend, here are the highlights.
Jointly hosted Alluxio New York meetup, with talks including: Embracing hybrid cloud for data-intensive analytic workloads and Alluxio on AWS EMR (fast storage access and sharing for Spark).
In this meetup, Dipti and HY will present a new approach to hybrid analytical workloads using Alluxio, an open source data orchestration layer that sits between the compute and storage layers. Applications like Apache Spark or TensorFlow can then seamlessly access multiple disparate data sources with consistent performance, using the data locality and abstraction that the data orchestration tier brings.
Alluxio is a proud sponsor and exhibitor at the Presto Summit in San Francisco.
What’s Presto Summit? It’s the leading Presto conference co-organized by our partner Starburst Data and the Presto Software Foundation.
This whitepaper details how to leverage any public cloud (AWS, Google Cloud Platform, or Microsoft Azure) to scale analytics workloads directly on on-prem data without copying and synchronizing the data into the cloud. We will show an example of what it might look like to run on-demand Starburst Presto, Spark, and Hive with Alluxio in the public cloud using on-prem HDFS.
The paper also includes a real-world case study of a leading hedge fund based in New York City that deployed large clusters of Google Compute Engine VMs with Spark and Alluxio, using on-prem HDFS as the underlying storage tier.
Haoyuan Li’s keynote at O’Reilly Beijing discusses open source data orchestration and the value of leveraging Alluxio as rising trends drive the need for a new architecture. Four big trends drive this need: separation of compute and storage, hybrid and multi-cloud environments, the rise of object stores, and self-service data across the enterprise.