New Whitepaper: Structured Big Data Federation

February 28, 2018

Enterprises are adopting big data technologies to analyze and derive insight from their growing volumes of structured and unstructured data. A familiar problem is the need to analyze data from multiple independent storage silos concurrently. To consolidate the data, large enterprises typically use custom solutions or build a data lake. These approaches present additional challenges and can be costly and time-consuming. Alluxio helps organizations handle their big data by providing a unified view of all of an enterprise's data, whether on premises, in the cloud, or hybrid. Applications access data using a standard interface to a global virtual namespace. Alluxio also employs a memory-centric architecture to enable data access at memory speed. With the combined unification and performance benefits, Alluxio can effectively provide big data federation for organizations by acting as a virtual data lake. We just published a whitepaper that goes into more detail on this common use case. You can access it here: Structured Big Data Federation Using Alluxio.


