Many organizations deploy Alluxio together with Spark for performance gains and data manageability benefits. Qunar recently deployed Alluxio in production, and their Spark streaming jobs sped up by 15x on average and by up to 300x during peak times. They noticed that some Spark jobs would slow down or fail to finish, but with Alluxio, those jobs could finish quickly. In this blog post, we investigate how Alluxio helps Spark be more effective. Alluxio increases the performance of Spark jobs, helps Spark jobs perform more predictably, and enables multiple Spark jobs to share the same data from memory.
In the past year, the Alluxio project experienced significant improvement in performance and scalability and was extended with key new features including tiered storage, transparent naming, and unified namespace.
Lucene/SOLR Revolution 2017 – Timothy Potter of Lucidworks introduces Alluxio, the fastest-growing open source project in the big data ecosystem, and shows how to leverage it for optimizing Solr performance. Basic integration scenarios and performance comparisons are also presented.
In a real development environment our customers leverage ArcGIS to read and write geospatial data to a plethora of distributed data stores, such as Amazon S3, HDFS, or OpenStack Swift, and some of these data stores are not natively supported by the ArcGIS platform…
Strata Data Conference London 2017 – Learn about stream processing on Alluxio from real-world workloads at Qunar, as well as how to position Alluxio in a streaming architecture.
Global Big Data Conference 2017 – In the past year, the Alluxio project experienced significant improvement in performance and scalability and was extended with key new features including tiered storage, transparent naming, and unified namespace.
Calvin Jia introduces Alluxio, explains how Alluxio can help Spark be more effective, shows benchmark results with Spark RDDs and DataFrames, and describes production deployments with both Alluxio and Spark working together.
Today, we’re excited to announce our partnership with Mesosphere to enable fast on-demand analytics with Alluxio via Mesosphere’s DC/OS in one click. This partnership is a natural extension of the synergy between Alluxio and DC/OS. Alluxio, the world’s first system that unifies data at memory speed, allows enterprises to manage and analyze data stored across disparate storage systems, on premises and in the cloud. Mesosphere brings enterprises the power of cloud native technologies, with the control to run on any infrastructure – datacenter or cloud.
In this talk, we briefly introduce Alluxio, present several ways Alluxio can help Spark be more effective, show benchmark results with Spark RDDs and DataFrames, and describe production deployments with both Alluxio and Spark working together.