hive metastore Archives

Integrate Alluxio With Your Existing Data Stack Without Redefining Hive Tables

December 20, 2022 By Greg Palmer and Hope Wang

One of the key features of Alluxio Enterprise Edition is Transparent URI, which makes it easy to integrate Alluxio with your existing data stack without changing the location metadata in the Hive Metastore. This article provides a tutorial on using the Alluxio Transparent URI capability with Trino, Hive Metastore and Spark, and with … Continued

Everything you want to know about how to decouple SQL engines from Hive Data Warehouse

March 30, 2020 By Gene Pang

Are you using SQL engines such as Presto to query an existing Hive data warehouse and running into challenges such as an overloaded Hive Metastore with slow and unpredictable access, unoptimized data formats and layouts (for example, too many small files), or a lack of influence over the existing Hive system and other Hive applications?

Introducing Wormhole: Dockerized Presto & Alluxio setups for blazing fast analytics

October 29, 2019 By Ashwin Sinha

This is a guest blog by Ashwin Sinha, republished from the original blog source. This blog introduces Wormhole, an open source Dockerized solution for deploying Presto & Alluxio clusters for blazing fast analytics on file systems (we use S3, GCS, OSS). When it comes to analytics, people are generally hands-on in writing SQL queries and love to analyse data that resides in a warehouse (e.g. MySQL database). But as data grows, these … Continued

Tutorial: Presto + Alluxio + Hive Metastore on Your Laptop in 10 min

October 23, 2019 By Bin Fan

This tutorial walks you through setting up a stack of Presto, Alluxio and Hive Metastore on your local server, and demonstrates how to use Alluxio as the caching layer for Presto queries.

MOMO: Accelerating Ad Hoc Analysis with Spark SQL and Alluxio

March 20, 2018 By MOMO Team

Alluxio clusters act as a data access accelerator for remote data in connected storage systems. Temporarily storing data in memory, or other media near compute, accelerates access and provides local performance from remote storage. This capability is even more critical with the movement of compute applications to the cloud and data being located in object stores separate from compute. Caching is transparent to users, using read/write buffering to maintain continuity with persistent storage. Intelligent cache management utilizes configurable policies for efficient data placement and supports tiered storage for both memory and disk (SSD/HDD).
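To make the caching idea above concrete, here is a minimal sketch of a read-through cache with two tiers and LRU eviction. It is not Alluxio's API or implementation: the `Tier` and `TieredReadCache` classes, the `fetch_remote` callback, and the capacities counted in blocks are all hypothetical names and simplifications chosen for illustration.

```python
from collections import OrderedDict
from typing import Callable, List, Optional, Tuple


class Tier:
    """One storage tier (e.g. memory or SSD) holding a fixed number of blocks.

    Hypothetical illustration only; eviction policy here is plain LRU.
    """

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.capacity = capacity
        self.blocks: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self.blocks:
            return None
        self.blocks.move_to_end(key)  # mark as most recently used
        return self.blocks[key]

    def put(self, key: str, data: bytes) -> Optional[Tuple[str, bytes]]:
        """Store a block; return the evicted (key, data) pair, if any."""
        self.blocks[key] = data
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            return self.blocks.popitem(last=False)  # least recently used
        return None


class TieredReadCache:
    """Read-through cache: check tiers top-down, then fall back to remote."""

    def __init__(self, tiers: List[Tier],
                 fetch_remote: Callable[[str], bytes]) -> None:
        self.tiers = tiers
        self.fetch_remote = fetch_remote  # stands in for the remote store

    def read(self, key: str) -> bytes:
        for level, tier in enumerate(self.tiers):
            data = tier.get(key)
            if data is not None:
                if level > 0:
                    # Promote a hit from a lower tier to the top tier.
                    del tier.blocks[key]
                    self._insert(0, key, data)
                return data
        data = self.fetch_remote(key)  # cache miss: go to remote storage
        self._insert(0, key, data)
        return data

    def _insert(self, level: int, key: str, data: bytes) -> None:
        # Evicted blocks cascade to the next tier down; a block evicted
        # from the last tier is dropped (it still exists in remote storage).
        while level < len(self.tiers):
            evicted = self.tiers[level].put(key, data)
            if evicted is None:
                return
            key, data = evicted
            level += 1
```

With a one-block memory tier above a one-block SSD tier, reading `a`, then `b`, then `a` again triggers only two remote fetches: the second read of `a` is served from the lower tier and promoted back to memory.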
