Hear about the challenges and evolution of data orchestration in Rakuten's data system, built in collaboration with Alluxio.
Slides from our latest talks
In this session, we share common design patterns AWS customers are applying as part of their Data and AI journey.
Learn more about Alluxio’s structured data management, a developer preview in Alluxio 2.1.0, and catch the demo.
Learn about Alibaba’s use case in deep learning and gene computing acceleration using Alluxio in Kubernetes.
This talk covers why Netflix needed to build Iceberg and the project’s high-level design, and highlights the details that unblock better query performance.
This talk gives an overview of the project and highlights best practices for creating performant input pipelines.
ODSC WEST 2019: Cloud storage brings great flexibility in management and cost-efficiency to data scientists, but also introduces new challenges related to data accessibility and data locality for machine learning applications. For instance, when the input data is stored in remote cloud storage such as AWS S3 or Azure Blob Storage, direct data access is …
Learn why leading companies are moving towards a decoupled compute and storage architecture, and the associated challenges and requirements. Hear how Spark and Alluxio together can solve these challenges.
Want to leverage your existing investments in Hadoop with your data on-premises and still benefit from the elasticity of the cloud?
Like other Hadoop users, you most likely experience very large and busy Hadoop clusters, particularly when it comes to compute capacity. Bursting HDFS data to the cloud can bring challenges: network latency impacts performance, copying data via DistCp means maintaining duplicate data, and you may have to make application changes to accommodate the use of S3.
“Zero-copy” hybrid bursting with Alluxio keeps your data on-prem and syncs data to compute in the cloud so you can expand compute capacity, particularly for ephemeral Spark jobs.