Spark Pipelines in the Cloud with Alluxio

Organizations commonly use Apache Spark to gain actionable insight from large amounts of data. Often, these analytics take the form of data processing pipelines: a series of stages in which each stage performs a particular function and the output of one stage is the input of the next. There … Continued

Best Practices for Using Alluxio with Apache Spark

Alluxio, formerly Tachyon, is a memory-speed virtual distributed storage system that leverages memory for storing data and accelerating access to data in different storage systems. Many organizations and deployments use Alluxio with Apache Spark, and some of them scale out to petabytes of data. Alluxio can enable Spark to be even more effective, … Continued

Fast, On-demand Analytics with Alluxio on DC/OS

Joint Webinar: Mesosphere | Alluxio, with Keith Chambers, Product Manager at Mesosphere; Neena Pemmaraju, VP Products at Alluxio; and Adit Madan, Software Engineer at Alluxio.

Effective Spark with Alluxio

Alluxio, formerly Tachyon, is a memory-speed virtual distributed storage system that leverages memory for storing data and accelerating access to data in different storage systems. Alluxio has a quickly growing open source community of developers and users and is deployed at organizations such as Alibaba, Baidu, Barclays, Intel, Huawei, and Qunar. Many of these … Continued

Fighting Cybercrime: A Joint Task Force of Real Time Data and Human Analytics

Cybercrime is big business. Gartner reports worldwide security spending at $80B, with annual losses totalling more than $1.2T (in 2015). Small to medium-sized businesses now account for more than half of the attacks targeting enterprises today. The threat actors behind these attacks are continually shifting their techniques and toolkits to evade the security defenses … Continued