Many organizations deploy Alluxio together with Spark for performance gains and data manageability benefits. Qunar recently deployed Alluxio in production, and their Spark streaming jobs sped up by 15x on average and up to 300x during peak times. They noticed that some Spark jobs would slow down or would not finish, but with Alluxio, those jobs could finish quickly. In this blog post, we investigate how Alluxio helps Spark be more effective. Alluxio increases performance of Spark jobs, helps Spark jobs perform more predictably, and enables multiple Spark jobs to share the same data from memory.
Tag: apache spark
We briefly introduce Alluxio and present different ways Alluxio can help Spark jobs, along with best practices. We also discuss how Alluxio can be deployed and used with a Spark data processing pipeline in the cloud.
MesosCon Europe 2017 – Gene Pang discusses the architecture of Mesos, Spark and Alluxio to achieve an optimal architecture for enterprises.
Spark Summit SF 2017 – We briefly introduce Alluxio and present different ways Alluxio can help Spark jobs, along with best practices. We also discuss how Alluxio can be deployed and used with a Spark data processing pipeline in the cloud.
Global Big Data Conference 2017 – In the past year, the Alluxio project experienced significant improvement in performance and scalability and was extended with key new features including tiered storage, transparent naming, and unified namespace
VAULT Conference 2017 – Haoyuan Li, CEO of Alluxio (formerly Tachyon), presents a keynote on the San Mateo-based startup’s journey thus far and the road ahead.
Calvin Jia introduces Alluxio, explain how Alluxio can help Spark be more effective, show benchmark results with Spark RDDs and DataFrames, and describe production deployments with both Alluxio and Spark working together.
In this talk, we briefly introduce Alluxio, present several ways how Alluxio can help Spark be more effective, show benchmark results with Spark RDDs & DataFrames, and describe production deployments with both Alluxio and Spark working together.
Deep learning algorithms have traditionally been used in specific applications, most notably, computer vision, machine translation, text mining, and fraud detection. Deep learning truly shines when the model is big and trained on large-scale datasets. Meanwhile, distributed computing platforms like Spark are designed to handle big data and have been used extensively. Therefore, by having deep learning available on Spark, the application of deep learning is much broader, and now businesses can fully take advantage of deep learning capabilities using their existing Spark infrastructure.