Aunalytics Leverages Alluxio as a “one-stop-shop” for Data I/O

Alluxio is a leading data orchestration platform that offers a compute-agnostic, storage-agnostic, and cloud-agnostic solution for big data and machine learning applications. Aunalytics is a data platform company delivering Insights-as-a-Service to answer the most important IT and business questions of enterprise and mid-sized companies.

The practice of Presto & Alluxio in E-commerce big data platform

JD.com is one of the largest e-commerce corporations. Its big data platform has tens of thousands of nodes and tens of petabytes of offline data, which require millions of Spark and MapReduce jobs to process every day. Thousands of machines run as Presto nodes, and Presto, as the main query engine, plays an important role in in-place analysis and BI tools. Alluxio is deployed alongside it to improve Presto's performance. The practice of Presto & Alluxio at JD.com benefits many engineers and analysts.

Accelerate and Scale Big Data Analytics with Alluxio and Intel® Optane™ Persistent Memory

International Data Corporation (IDC) reported that the global datasphere will grow from 33 zettabytes in 2018 to 175 zettabytes by 2025. This growth is made more complicated by the increasing variety and velocity of data, and it continuously changes the ways data is collected, stored, processed, and analyzed. New analytics solutions, including machine learning, deep learning, and artificial intelligence (AI), as well as new architectures and tools, are being developed to extract and deliver value from this huge datasphere.

Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel’s Analytics Zoo open source platform powered by oneAPI

This article describes how Alluxio accelerates the training of deep learning models in a hybrid cloud environment with Intel’s Analytics Zoo open source platform, powered by oneAPI. It covers the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results.

Running Presto with Alluxio on Amazon EMR

Many organizations are leveraging EMR to run big data analytics on the public cloud. However, reading data from and writing data to S3 directly can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud, and in this use case it caches S3 data, ensuring high and predictable performance as well as reduced network traffic.

Building data lineage; Running Spark with Alluxio; Data Mesh

Big Data Application Meetup

Running Spark with Alluxio is a popular stack, particularly for hybrid environments. In this session, Dipti will briefly introduce Alluxio, share the top 10 performance tuning tips for real-world workloads, and demo Alluxio with Spark.