Machine Learning Model Training with Alluxio: Part 2 – Comparable Analysis

This post is the second in our machine learning series. The first post discussed Alluxio’s solution for improving training performance and simplifying data management. With Alluxio, loading data from cloud storage, caching it, and training on it can be done transparently and in a distributed way as part of the training process. In this second post, we compare traditional solutions with Alluxio’s.

Data Consistency Model in Alluxio

When applications are only reading and writing through Alluxio, the Alluxio file system provides strong consistency. However, when clients are writing data across both Alluxio and under storage, the consistency depends on the Alluxio write type and under storage type. This article discusses what to expect in each scenario.

Accelerating and Scaling Big Data Analytics with Alluxio and Intel® Optane™ Persistent Memory

Testing Methodology: The decision support workload is a typical workload that models multiple aspects of a decision support system, including queries and data maintenance. We selected 54 queries that represent typical SQL query behavior in Hadoop for the test. The tests cover three configurations: without Alluxio, Alluxio on PMem, and Alluxio on DRAM. The … Continued

What’s new in Alluxio 2.2

With this release comes the General Availability (GA) of Alluxio Structured Data Services (SDS), the subsystem of Alluxio responsible for managing and transforming structured data, such as databases, tables, and partitions.