From time to time, a question pops up on the user mailing list about job failures with the error message “java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found”. This post explains why the failure occurs and how to resolve it. Why does this happen? This error indicates the Alluxio client is not available at runtime. … Continued
Tag: apache spark
Learn how Intel uses Alluxio to accelerate big data analytics in the cloud, and explore new opportunities for persistent memory in architectures with separated compute and storage.
See how Spark and Hive jobs running on AWS S3 achieved 10x performance by implementing the above. Plus, learn how real-world user Bazaarvoice implemented a tiered storage architecture to boost performance, enabling it to handle data at massive Internet scale to serve its customers.
This is a guest blog from our friends at TalkingData. Download or print the case study here. TalkingData is China’s largest data broker, reaching more than 600 million smart devices on a monthly basis. TalkingData processes over 20 terabytes of data and more than one billion session requests every day. TalkingData products are powered by its massive … Continued
Highlights: improved customer responsiveness and increased revenue; interactive analytics/reporting and faster time to insight. Download or print the case study here. Myntra, a division of Flipkart, is a leading Indian e-commerce fashion retailer offering customers a wide range of clothing and other merchandise through a mobile application. Mobile devices drive 95 percent of the traffic to … Continued
This post is guest authored by our friends at Tencent: Can He. Download or print the case study here. Tencent is one of the largest technology companies in the world and a leader in multiple sectors such as social networking, gaming, e-commerce, mobile and web portals. Tencent News, one of Tencent’s many offerings, strives to create a … Continued
This post is guest authored by our friends at MOMO: Haojun (Reid) Chan and Wenchun Xu. Data Analysis Trends: The Hadoop ecosystem makes many distributed systems and algorithms easier to use and generally lowers the cost of operations. However, enterprises and vendors are never satisfied with that, so higher performance becomes the next issue. We considered several options … Continued
From our friends at MOMO: The Hadoop ecosystem makes many distributed systems and algorithms easier to use and generally lowers the cost of operations. However, enterprises and vendors are never satisfied with that, so higher performance becomes the next issue. We considered several options to address our performance needs and focused our efforts on Alluxio, which improves performance … Continued
Enabling Decoupled Compute and Storage with Alluxio. This blog explores the benefits Alluxio brings to data platforms, including: the trends behind the rise of decoupled compute-storage architectures; how Alluxio addresses data access issues for decoupled compute-storage architectures; and an example of Alluxio’s benefits using a SparkSQL workload. Motivation: The primary appeal of a coupled compute-storage architecture, … Continued