Introduction: As the amount of data collected and analyzed by enterprises continues to grow unabated, more attention is being paid to managing the cost of storing that data relative to performance. Hadoop provides a scalable and fast way of storing and analyzing data; however, the cost of storing data in Hadoop is typically higher … Continued
Tag: aws s3
Learn how Intel uses Alluxio to accelerate big data analytics in the cloud, and about the new opportunities that persistent memory opens up in architectures with separated compute and storage.
See results showing 10x performance in Spark and Hive jobs running on AWS S3 when implementing the above. Plus, learn how real-world user Bazaarvoice implemented a tiered storage architecture to boost performance, enabling it to handle data at massive Internet scale to serve its customers.
Small (kilobyte-sized) objects are the bane of highly scalable cloud object stores. Larger (at least megabyte-sized) objects not only improve performance but also cost orders of magnitude less, due to the current operation-based pricing model of commodity cloud object stores. For example, in Amazon S3’s current pricing scheme, uploading 1 GiB of data by issuing … Continued
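The effect of operation-based pricing can be sketched with simple arithmetic. The snippet below is an illustration only: the per-request price, the object sizes, and the function names are assumptions made for this example, not figures taken from the excerpt, so check the provider's current price list before relying on the numbers.

```python
GIB = 1024 ** 3

# Assumed price in USD per 1,000 upload (PUT) requests. Illustrative only.
ASSUMED_PRICE_PER_1000_PUTS = 0.005


def put_requests(total_bytes: int, object_bytes: int) -> int:
    """Number of upload requests needed when every object has object_bytes bytes."""
    # Ceiling division without going through floating point.
    return -(-total_bytes // object_bytes)


def request_cost(total_bytes: int, object_bytes: int,
                 price_per_1000: float = ASSUMED_PRICE_PER_1000_PUTS) -> float:
    """Request charges only; storage and transfer charges are ignored."""
    return put_requests(total_bytes, object_bytes) * price_per_1000 / 1000


# Hypothetical object sizes: 4 KiB versus 4 MiB.
small_cost = request_cost(GIB, 4 * 1024)
large_cost = request_cost(GIB, 4 * 1024 ** 2)

print(f"4 KiB objects: {put_requests(GIB, 4 * 1024)} requests, ${small_cost:.5f}")
print(f"4 MiB objects: {put_requests(GIB, 4 * 1024 ** 2)} requests, ${large_cost:.5f}")
```

Because the charge is per request, the cost ratio between the two cases equals the ratio of the object sizes, whatever per-request price is assumed.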
Myntra, a division of Flipkart, is a leading fashion retailer in India offering customers a wide range of merchandise through a mobile application. An analytics pipeline in Amazon Web Services (AWS) cloud processes customer data to make recommendations, present ads, and deliver other aspects of a tailored experience. Myntra deployed Alluxio to provide a virtual … Continued
Highlights: improved customer responsiveness and increased revenue; interactive analytics/reporting and faster time to insight. Download or print the case study here. Myntra, a division of Flipkart, is a leading Indian e-commerce fashion retailer offering customers a wide range of clothing and other merchandise through a mobile application. Mobile devices drive 95 percent of the traffic to … Continued
Enabling Decoupled Compute and Storage with Alluxio. This blog explores the benefits Alluxio brings to data platforms, including: the trends behind the rise of decoupled compute-storage architectures; how Alluxio addresses data access issues for decoupled compute-storage architectures; and an example of Alluxio’s benefits using a SparkSQL workload. Motivation: The primary appeal of a coupled compute-storage architecture, … Continued
Author: Shaofeng Shi, Senior Architect, Kyligence Inc. OLAP (online analytical processing) technology has been widely adopted by enterprises since the last century. Enterprises rely on OLAP to analyze their huge amounts of data and generate reports that help business people make decisions. Today, in the era of big data, OLAP becomes more important and … Continued
Organizations commonly use Apache Spark to gain actionable insight from their large amounts of data. Often, these analytics take the form of data processing pipelines: a series of processing stages in which each stage performs a particular function and the output of one stage is the input of the next. There … Continued