Enterprises are adopting big data technologies to analyze and derive insight from their growing volumes of structured and unstructured data. A familiar problem is the need to analyze data from multiple independent storage silos concurrently. To consolidate the data, large enterprises typically use custom solutions or build a data lake. These approaches present additional challenges and can be costly and time-consuming.
Many organizations deploy Alluxio together with Spark for performance gains and data manageability benefits. Qunar recently deployed Alluxio in production, and their Spark Streaming jobs sped up by 15x on average and by up to 300x during peak times. They noticed that some Spark jobs would slow down or would not finish, but with Alluxio, those …
For a business not just to survive but to flourish, it has become imperative to make decisions with near immediacy, continuously pivot strategy and tactics, and merge streams of inquiries into meaningful action. Executing requires high-frequency insights, the competitive advantage in today’s frenetic business landscape. Together with Alluxio, Inc., we enable businesses to gain the …
Alluxio, formerly Tachyon, is the world’s first system that unifies data at memory speeds while achieving affordability through Alluxio’s innovative tiered storage functionality. This Samsung whitepaper shows how Alluxio’s storage can be used with the different storage media available in systems, including NVMe SSDs, while providing in-line performance consistent with the speed of the underlying storage media. Alluxio provides the capability to leverage all the storage that is available in a system.
Understand the benefits Alluxio brings to analytics on object storage: derive timely insights from data with memory-speed access, enable data sharing between applications without sacrificing performance, and reduce costs with efficient memory utilization.
Learn how Alluxio is used in clusters with co-located compute and storage to improve two key metrics of data analytics clusters: performance predictability, which allows SLAs to be met more easily, and performance itself, which improves by up to 10x.
In this article, we show that by saving RDDs in Alluxio, larger data sets can be kept in memory for faster Spark applications, and RDDs can be shared across separate Spark applications.
Learn how to deploy this architecture and the value Alluxio brings to the stack.
The exponential growth of raw computational power, communication bandwidth, and storage capacity results in continuous innovation in how data is processed and stored. To address the evolving nature of the compute and storage landscape, we are continuously advancing Alluxio, a state-of-the-art memory-centric virtual distributed storage system. This blog post highlights unified namespace, an …