Apache Hudi: The Path Forward
A deep dive into two important areas of active development going forward: table metadata management and caching.
Tags: alluxio day, apache hudi, caching, data lake, metadata management
In this talk, we will walk through what Alluxio’s Data Orchestration for the hybrid cloud era is and how it solves the performance and data management challenges we see.
In this presentation, we will discuss how the intelligent precomputation capabilities of Kyligence Cloud can deliver on the promise of pervasive analytics at scale, with massive concurrency and sub-second query latencies on large datasets in the cloud.
Tags: cloud storage, data lake, data orchestration, data orchestration summit, kyligence
This talk introduces T3Go’s solution for building an enterprise-level data lake based on Apache Hudi and Alluxio, and shows how to use Alluxio to accelerate reading and writing data on the data lake when compute and storage are separated.
Tags: apache hudi, compute storage separation, data lake, data orchestration, data orchestration summit
How T3Go’s high-performance data lake, built on Apache Hudi and Alluxio, shortened the time for data ingestion into the lake by up to a factor of 2. Data analysts using Presto, Hudi, and Alluxio together to query data on the lake saw queries speed up by a factor of 10.
In this talk, we describe an architecture for migrating analytics workloads incrementally to any public cloud (AWS, Google Cloud Platform, or Microsoft Azure), running them directly on on-prem data without copying the data to cloud storage.
Tags: cloud, data analytics, data lake, hdfs, hybrid, on-prem, presto, spark, storage
In this talk, we will walk through what Alluxio’s Data Orchestration for the hybrid cloud era is and how it solves the performance and data management challenges we see.
Many companies we talk to have on-premises data lakes and use the cloud(s) to burst compute. Many are now establishing new object data lakes as well. As a result, analytics and machine learning workloads running on engines such as Hive, Spark, and Presto experience sluggish response times, because data and compute sit in multiple locations. We also know there is an immense and growing data management burden to support these workflows.
Tags: analytics, data lake, data management, data orchestration, hybrid cloud, machine learning, performance, webinar