This Alluxio Meetup features a chance to interact with other Alluxio users and developers, as well as three talks. Thanks to our joint host Data Council!
Bring your data to compute with open source
Data orchestration for analytics and machine learning in the cloud
Community Office Hour, June 25th
Accelerating Spark in Kubernetes using Alluxio
Scalable metadata service in Alluxio: storing billions of files
Webinar, June 27th
Accelerate Spark workloads on S3 with Alluxio
Featured Use Cases and Deployments
Data in the public cloud slowing your compute down?
Get in-memory data access for Spark and Presto on AWS S3, Google Cloud Platform, or Microsoft Azure.
Can’t burst HDFS in your hybrid cloud environment?
Simplify Hadoop for the hybrid cloud by making on-prem HDFS accessible to any compute in the cloud.
Data in on-premises object stores not fast enough?
Accelerate your Spark, Presto, and TensorFlow workloads for object stores on-premises or in the cloud.
Interact with Alluxio in any stack
Pick a compute. Pick a storage. Alluxio just works.
// Using Alluxio as input and output for an RDD
scala> val rdd = sc.textFile("alluxio://master:19998/Input")
scala> rdd.saveAsTextFile("alluxio://master:19998/Output")
// Using Alluxio as input and output for a DataFrame
scala> val df = sqlContext.read.parquet("alluxio://master:19998/Input.parquet")
scala> df.write.parquet("alluxio://master:19998/Output.parquet")
-- Pointing table location to Alluxio
CREATE SCHEMA hive.web WITH (location = 'alluxio://master:port/my-table/')
# Create and list a table stored in Alluxio
hbase(main):001:0> create 'test', 'cf'
hbase(main):002:0> list 'test'
-- Pointing table location to Alluxio
hive> CREATE TABLE u_user (
        userid INT,
        age INT)
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY '|'
      LOCATION 'alluxio://master:port/table_data';
# Accessing Alluxio after mounting the Alluxio service to the local file system
$ ls /mnt/alluxio_mount
$ cat /mnt/alluxio_mount/mydata.txt
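Because the mounted path behaves like an ordinary directory, an application can reach the same data with standard file I/O rather than an Alluxio-specific client. The following is a minimal Python sketch of that idea, assuming the mount point `/mnt/alluxio_mount` and the file `mydata.txt` from the shell example above. The helper names `list_mount` and `read_from_mount` are hypothetical and not part of any Alluxio API.

```python
from pathlib import Path


def list_mount(mount_root: str) -> list:
    """Return the sorted entry names directly under the mount point."""
    return sorted(p.name for p in Path(mount_root).iterdir())


def read_from_mount(mount_root: str, relative_path: str) -> str:
    """Read a text file through a local mount point using plain file I/O."""
    root = Path(mount_root).resolve()
    target = (root / relative_path).resolve()
    # Refuse paths that escape the mount root (for example "../other").
    if root != target and root not in target.parents:
        raise ValueError(f"{relative_path!r} is outside {mount_root!r}")
    return target.read_text(encoding="utf-8")


if __name__ == "__main__":
    # Assumed values taken from the shell example: adjust to your own mount.
    print(list_mount("/mnt/alluxio_mount"))
    print(read_from_mount("/mnt/alluxio_mount", "mydata.txt"))
```

Nothing in the sketch depends on Alluxio itself, so it works against any directory, which makes it straightforward to try locally before pointing it at a real mount.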
Alluxio enables compute
Powered by Alluxio
Register for this webinar to learn how to run EMR Spark on Alluxio as a distributed file system cache for S3.
We are in the early stages of the data revolution. Organizations are racing to build data-driven cultures and innovate on data-driven applications. These applications impact many facets of our lives, from the way we get to work to how we are medically diagnosed. However, the value of the data is far from being fully utilized …