Developer and Engineering Archives

Everything you want to know about how to decouple SQL engines from Hive Data Warehouse

March 30, 2020 By Gene Pang

Are you using SQL engines, such as Presto, to query an existing Hive data warehouse? Are you running into challenges such as an overloaded Hive Metastore with slow and unpredictable access, unoptimized data formats and layouts (for example, too many small files), or a lack of influence over the existing Hive system and other Hive applications?

Serving Structured Data in Alluxio: Example

March 11, 2020 By Gene Pang

This article goes through a simple example to illustrate how Structured Data Management, available in the latest Alluxio 2.2.0 release, helps SQL and structured data workloads.

Serving Structured Data in Alluxio: Concept

March 11, 2020 By Gene Pang

This article introduces Structured Data Management available in the latest Alluxio 2.2.0 release, a new effort to provide further benefits to SQL and structured data workloads using Alluxio.

What’s new in Alluxio 2.2

March 11, 2020 By Bin Fan, Gene Pang, Zac Blanco and Haoyuan Li

With this release comes the General Availability (GA) of Alluxio Structured Data Services (SDS), the subsystem of Alluxio responsible for managing and transforming structured data, such as databases, tables, and partitions.

Kubernetes, Alluxio and the Disaggregated Analytics Stack

November 20, 2019 By Dipti Borkar

TL;DR: First, the news: Alluxio support for K8s Helm charts is now available! K8s is a certified environment for Alluxio. Now the takeaway: Alluxio brings back data locality for the disaggregated analytics stack in K8s. How? Read on. There’s no arguing the rise of containers in real-world … Continued

Improving Spark Memory Resource with Off-Heap In-Memory Storage

November 1, 2019 By Bin Fan and Adit Madan

In the previous tutorial, “Getting Started with Spark Caching using Alluxio in 5 Minutes”, we demonstrated how to get started with Spark and Alluxio. To share more thoughts and experiments on how Alluxio enhances Spark workloads, this article focuses on how Alluxio helps to optimize the memory utilization of Spark applications. For users who are … Continued

Introducing Wormhole: Dockerized Presto & Alluxio setups for blazing fast analytics

October 29, 2019 By Ashwin Sinha

This is a guest blog by Ashwin Sinha with an original blog source. This blog introduces Wormhole, an open source Dockerized solution for deploying Presto & Alluxio clusters for blazing fast analytics on file systems (we use S3, GCS, OSS). When it comes to analytics, people are generally hands-on in writing SQL queries and love to analyse data which resides in a warehouse (e.g. MySQL database). But as data grows, these … Continued

Tutorial: Presto + Alluxio + Hive Metastore on Your Laptop in 10 min

October 23, 2019 By Bin Fan

This tutorial guides users through setting up a stack of Presto, Alluxio, and Hive Metastore on a local server, and demonstrates how to use Alluxio as the caching layer for Presto queries.
