Cloud storage brings great flexibility in management and cost-efficiency to data scientists, but also introduces new challenges related to data accessibility and data locality for machine learning applications. For instance, when the input data is stored in a remote cloud storage like AWS S3 or Azure Blob Storage, direct data access is often slow and …
Tag: compute storage separation
Learn why leading companies are moving towards a decoupled compute and storage architecture, and the associated challenges and requirements. Hear how Spark and Alluxio together can solve these challenges.
Learn more about Bazaarvoice’s use case leveraging Apache Spark, Hive, and Alluxio on S3, along with how to set up Hive with Alluxio so that Hive jobs can seamlessly read from and write to S3.
In this online meetup, we will present the benefits of the fast analytics stack of Spark on Alluxio, and dive into China Unicom’s use case of leveraging Spark and Alluxio to process massive amounts of mobile data.
The current pace of innovation is hindered by the need to reinvent the wheel for applications to access data efficiently. When an engineer or scientist wants to write an application to solve a problem, he or she needs to spend significant effort on getting the application to access the data efficiently and effectively, rather than focusing on the algorithms and the application’s logic.
This Bay Area meetup includes presentations on the architecture of Presto, its separation of compute and storage, cloud-readiness, and recent advancements in the project such as the Cost-Based Optimizer and Kubernetes support, as well as Presto and Alluxio production use cases and more.
This meetup presents an overview of the motivations and design decisions behind the major changes in the Alluxio 2.0 release, along with a talk on real-time data processing for sales attribution analysis with Alluxio, Spark, and Hive at VIPShop.
Alluxio is a proud sponsor and exhibitor at the AWS Summit in New York. If you weren’t able to attend, here are the highlights.
A jointly hosted Alluxio New York meetup, with talks including: Embracing hybrid cloud for data-intensive analytic workloads and Alluxio on AWS EMR (fast storage access and sharing for Spark).