Alluxio foresaw the need for agility when accessing data across silos separated from compute engines like Spark, Presto, TensorFlow, and PyTorch. Embracing the separation of storage from compute, the Alluxio data orchestration platform simplifies adoption of the data lake and data mesh paradigms for analytics and AI/ML.
COMMUNITY VIRTUAL EVENT
Learn how Alluxio uses Apache Ranger’s centralized access policies to control access to virtual paths in the Alluxio virtual file system and enforce existing access policies for the HDFS under stores.
Check out the talks from our virtual community event, Alluxio Day XII, featuring presenters from Websec, Shopee, and Alluxio.
Alluxio 2.8 expands data access & security for data-driven applications in heterogeneous environments – Enhanced S3 API, data encryption & policy-driven data management, and more.
Alluxio enables compute
Bring your data close to compute.
Make your data local to compute workloads for Spark caching, Presto caching, Hive caching and more.
Make your data accessible.
No matter if it sits on-prem or in the cloud, HDFS or S3, make your files and objects accessible in many different ways.
Make your data as elastic as compute.
Effortlessly orchestrate your data for compute in any cloud, even if data is spread across multiple clouds.
“Zero-Copy” Burst User Spotlight: Walmart
Why Walmart chose Alluxio’s “Zero-Copy” burst solution:
- No requirement to persist data into the cloud
- Improved query performance and no network hops on recurrent queries
- Lower costs without the need for creating data copies
Featured Use Cases and Deployments
Zero-copy hybrid bursting with no app changes to intelligently make remote data accessible in the public cloud.
Zero-copy bursting across data centers for Presto, Spark, and Hive with no app changes on data stored in HDFS.
Interact with Alluxio in any stack
Pick a compute. Pick a storage. Alluxio just works.
// Using Alluxio as input and output for an RDD
scala> val rdd = sc.textFile("alluxio://master:19998/Input")
scala> rdd.saveAsTextFile("alluxio://master:19998/Output")
// Using Alluxio as input and output for a DataFrame
scala> val df = sqlContext.read.parquet("alluxio://master:19998/Input.parquet")
scala> df.write.parquet("alluxio://master:19998/Output.parquet")
-- Pointing table location to Alluxio
hive> CREATE TABLE u_user (userid INT, age INT)
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    > LOCATION 'alluxio://master:port/table_data';
# Create and query a table stored in Alluxio
hbase(main):001:0> create 'test', 'cf'
hbase(main):002:0> list 'test'
# Accessing Alluxio after mounting the Alluxio service to the local file system
$ ls /mnt/alluxio_mount
$ cat /mnt/alluxio_mount/mydata.txt
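Once Alluxio is mounted as a local directory, applications can reach the same data through ordinary file APIs, with no Alluxio-specific client. The Python sketch below illustrates this under stated assumptions: the mount point /mnt/alluxio_mount and the file mydata.txt are taken from the shell example above, while the helper names list_mount and read_text are hypothetical and exist only for this illustration.

```python
import os


def list_mount(mount_point):
    """Return the sorted names of entries directly under the mount point."""
    return sorted(os.listdir(mount_point))


def read_text(mount_point, name):
    """Read a text file below the mount point using plain file I/O."""
    path = os.path.join(mount_point, name)
    with open(path, encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    # Assumed mount point and file name, matching the shell example above.
    MOUNT = "/mnt/alluxio_mount"
    print(list_mount(MOUNT))
    print(read_text(MOUNT, "mydata.txt"))
```

Both helpers take the mount point as a parameter, so the same code can be pointed at any directory, mounted or not, when trying it out locally.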
Powered by Alluxio
SAN MATEO, CA – June 15, 2022 – Alluxio, the developer of the open source data orchestration platform for data-driven workloads such as large-scale analytics and AI/ML, today announced it will present a session at the Linux Foundation’s Open Source Summit about strategies for building super-contributors in an open source community. The event is … Continued
Today, data-driven benefits abound. However, the ability to seize new business opportunities, create new products, and deal effectively with competitive issues requires strong data management and analytics capabilities.
Imagine as an IT leader having the flexibility to choose any services that are available in public cloud and on premises. And imagine being able to scale your storage for your data lakes with control over data locality and protection for your organization. With these goals in mind, NetApp and Alluxio are joining forces to help our customers adapt to new requirements for modernizing data architecture with low-touch operations for analytics, machine learning, and artificial intelligence workflows.
Today, many organizations are running a multitude of data-driven applications and data platforms that span multiple geographic regions and heterogeneous environments – public, … Continued
By bringing Alluxio together with Spark, you can modernize your data platform in a scalable, agile, and cost-effective way. In this post, we provide … Continued