The Practice of Alluxio in Ctrip Real-Time Computing Platform

Today, real-time computation platforms are becoming increasingly important in many organizations. In this article, we describe how ctrip.com applies Alluxio to accelerate Spark SQL real-time jobs and to keep those jobs consistent during downtime of our internal data lake (HDFS). In addition, we use Alluxio as a caching layer to dramatically reduce the load on our HDFS NameNode.

Recap: Presto Summit SF 2019

Alluxio is a proud sponsor and exhibitor at the Presto Summit in San Francisco.
What’s Presto Summit? It’s the leading Presto conference co-organized by our partner Starburst Data and the Presto Software Foundation.

Building Fast SQL Analytics with Presto, Alluxio, and S3

Alluxio Community Office Hour

Learn how to set up Presto with Alluxio such that Presto jobs can seamlessly read from and write to S3.
Compare the performance of Presto on S3 with that of Presto with Alluxio on S3.

Summertime themed In-Memory Computing extravaganza! (cross-post)

New York Meetup

[Talk 1] A “how-to” presentation for building a real-time alerting, analytics and reporting system (at scale), with Denis Magda, vice president of the Apache Ignite PMC and director of product management at GridGain Systems, and Viktor Gamov, developer advocate at Confluent.
[Talk 2] Using in-memory technology for real-time analytics, with Andy Rivenes, Product Manager at Oracle for Database In-Memory.
[Talk 3] Feeding data to the Kubernetes beast: bringing data locality to your containerized big data workloads, with Bin Fan, founding engineer of Alluxio, Inc. and PMC member of the Alluxio open source project.

How do you partition a Hive table across storage systems using Alluxio?

Today, when we create a Hive table, it is common to partition the table across different values and ranges to improve query performance and reduce maintenance cost. However, Hive cannot directly access a single table with a single query when the data of that table is spread across different storage media and …

Recap: Spark+AI Summit 2019

Alluxio is a proud sponsor and exhibitor of Spark+AI Summit in San Francisco.
What’s Spark+AI Summit? It’s the world’s largest conference focused on Apache Spark – Alluxio’s older cousin, an open source project from the same lab (UC Berkeley’s AMPLab – now RISELab).

Spark+AI Summit SF 2019

SAIS 2019

What’s Spark+AI Summit? It’s the world’s largest conference focused on Apache Spark – Alluxio’s older cousin, an open source project from the same lab (UC Berkeley’s AMPLab – now RISELab).

Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics

Spark Summit East

In this presentation, William Callaghan will focus on the challenges faced and lessons learned in building a human-in-the-loop cyber threat analytics pipeline. They will discuss the topic of analytics in cybersecurity and highlight the use of technologies such as Spark Streaming/SQL, Cassandra, Kafka and Alluxio in creating an analytics architecture with mission-critical response times.