Effective Spark RDDs with Alluxio

Organizations like Baidu and Barclays have deployed Alluxio with Spark in their architectures and have achieved impressive gains. Recently, Qunar deployed Alluxio with Spark in production and found that Alluxio enables Spark Streaming jobs to run 15x to 300x faster. In their case study, they described how Alluxio improved their system architecture, and mentioned that … Continued

Getting Started with Alluxio and Spark

Spark has brought significant innovation to big data computing, but its results are even more extraordinary when paired with other open-source projects in the ecosystem. Alluxio, formerly Tachyon, provides Spark with a reliable data sharing layer, enabling Spark to excel at performing application logic while Alluxio handles storage. For example, global financial powerhouse … Continued