Online Meetup: Cybersecurity and fraud detection at ING Bank using Presto & Alluxio on S3

In this online presentation, we present how ING is leveraging Presto (interactive query), Alluxio (data orchestration & acceleration), S3 (massive storage), and DC/OS (container orchestration) to build and operate our modern Security Analytics & Machine Learning platform. We will share the challenges we encountered and how we solved them.

Tags: , , , ,

Effective Analytical Pipelines on AWS Using EMR, Alluxio, and S3

This article describes my lessons from a previous project which moved a data pipeline originally running on a Hadoop cluster managed by my team, to AWS using EMR and S3. The goal was to leverage the elasticity of EMR to offload the operational work, as well as make S3 a data lake where different teams can easily share data across projects.

Accelerating Hive with Alluxio on S3

Alluxio Community Office Hour *

Hear about Bazaarvoice’s use case leveraging Apache Spark, Hive, and Alluxio on S3. And learn how to set up Hive with Alluxio so that Hive jobs can seamlessly read/write to S3.

Four Different Ways to Write to Alluxio

Alluxio is a new layer on top of under storage systems that can not only improve raw I/O performance but also enables applications flexible options to read, write and manage files. This article focuses on describing different ways to write files to Alluxio, realizing the tradeoffs in performance, consistency, and also the level of fault tolerance compared to HDFS.