On-Demand Videos
Nilesh Agarwal, Co-founder & CTO at Inferless, shares insights on accelerating LLM inference in the cloud using Alluxio, tackling key bottlenecks like slow model weight loading from S3 and lengthy container startup times. Inferless uses Alluxio as a three-tier cache system that cuts model load times by 10x.
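As a loose illustration of the read-through idea behind such a cache (not Inferless's implementation and not Alluxio's API), the sketch below prefers a locally mounted cache path and falls back to S3; the mount point, bucket, and key are hypothetical:

```python
import os
import shutil

import boto3  # assumes boto3 is installed; the bucket and key below are hypothetical

CACHE_ROOT = "/mnt/model-cache"            # assumed locally mounted cache tier
S3_BUCKET = "example-model-weights"        # hypothetical bucket
S3_KEY = "llama-7b/model.safetensors"      # hypothetical object key


def load_weights_path(bucket: str, key: str) -> str:
    """Return a local path to the model weights, preferring the cache tier."""
    cached = os.path.join(CACHE_ROOT, bucket, key)
    if os.path.exists(cached):
        return cached  # cache hit: no S3 round trip at inference startup

    # Cache miss: fall back to a direct S3 download, then populate the cache.
    os.makedirs(os.path.dirname(cached), exist_ok=True)
    tmp = cached + ".tmp"
    boto3.client("s3").download_file(bucket, key, tmp)
    shutil.move(tmp, cached)
    return cached


weights_path = load_weights_path(S3_BUCKET, S3_KEY)
print("loading weights from", weights_path)
```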

In this talk, Jingwen Ouyang, Senior Product Manager at Alluxio, will share how Alluxio makes it easy to share and manage data from any storage to any compute engine in any environment, with high performance and low cost, for your model training, model inference, and model distribution workloads.

Storing data as Parquet files on cloud object storage, such as AWS S3, has become prevalent not only for large-scale data lakes but also as lightweight feature stores for training and inference, or as document stores for Retrieval-Augmented Generation (RAG). However, querying petabyte-to-exabyte-scale data lakes directly from S3 remains notoriously slow, with latencies typically ranging from hundreds of milliseconds to several seconds.
In this webinar, David Zhu, Software Engineering Manager at Alluxio, will present the results of a joint collaboration between Alluxio and a leading SaaS and data infrastructure enterprise that explored leveraging Alluxio as a high-performance caching and acceleration layer atop AWS S3 for ultra-fast querying of Parquet files at PB scale.
David will share:
- How Alluxio delivers sub-millisecond Time-to-First-Byte (TTFB) for Parquet queries, comparable to S3 Express One Zone, without requiring specialized hardware, data format changes, or data migration from your existing data lake.
- The architecture that enables Alluxio’s throughput to scale linearly with cluster size, achieving one million queries per second on a modest 50-node deployment, surpassing S3 Express single-account throughput by 50x without latency degradation.
- Specifics on how Alluxio offloads partial Parquet read operations and reduces overhead, enabling direct, ultra-low-latency point queries in hundreds of microseconds and achieving a 1,000x performance gain over traditional S3 querying methods.
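As a rough, hedged illustration of the access pattern described above (not the benchmark setup itself), the sketch below points a standard Parquet reader at an S3-compatible endpoint and issues a column-pruned point query; the endpoint, credentials, bucket, path, and column names are all placeholders:

```python
import pyarrow.fs as pafs
import pyarrow.parquet as pq

# Placeholder S3-compatible endpoint assumed to front the caching layer.
fs = pafs.S3FileSystem(
    endpoint_override="http://cache-endpoint.example.internal:9000",
    access_key="EXAMPLE_ACCESS_KEY",
    secret_key="EXAMPLE_SECRET_KEY",
)

# A point query: read only the columns and rows needed for a single lookup,
# so most of the Parquet file is never fetched.
table = pq.read_table(
    "example-bucket/features/day=2024-01-01/part-00000.parquet",
    filesystem=fs,
    columns=["user_id", "feature_vec"],
    filters=[("user_id", "=", 12345)],
)
print(table.to_pandas())
```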
Speaker: David Zhu
David Zhu is a Software Engineering Manager at Alluxio. At Alluxio, David focuses on metadata management and end-to-end performance benchmarking and optimizations. Prior to that, David completed his Ph.D. at UC Berkeley, with a focus on distributed data management systems and operating systems for the data center. David also holds a Bachelor of Software Engineering from the University of Waterloo.
At Ryte, we analyze unstructured, semi-structured, and structured data for more than one million users worldwide. The whole Ryte platform is built on a scalable architecture to support our heavy load and make it possible for our customers to drill down from a high-level overview into the last byte of their websites. In this talk, "Presto + Alluxio on Steroids: A Romantic Drama in Production with a Happy End," we share how Presto and Alluxio power this platform in production.
In this Alluxio 2 Community Update, Alluxio core maintainers and founding engineers share the latest innovations in Alluxio 2.
Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Proven at scale in a variety of use cases at Airbnb, Comcast, GrubHub, Facebook, FINRA, LinkedIn, Lyft, Netflix, Twitter, and Uber, Presto has seen unprecedented growth in popularity over the last few years in both on-premises and cloud deployments over object stores, HDFS, NoSQL, and RDBMS data stores.
This talk will discuss the best use cases for Presto from the data engineer's perspective. In addition, we will present recent Presto advancements such as the Cost-Based Optimizer and Kubernetes-native deployment, as well as the project roadmap going forward.
Today, one can easily launch or terminate services with hundreds or thousands of compute instances in just a few seconds on cloud services such as AWS. However, operating, monitoring, and maintaining those resources can easily become a nightmare if the corresponding systems are not designed in a cloud-native way.
In this talk, we share our lessons in building and rebuilding our monitoring systems and data platforms at Electronic Arts (EA). In the first generation of the monitoring system, configurations were manually created for many individual software components and spread over all the resources. As services were started and terminated rapidly over time, it was extremely difficult to keep all configurations up to date. Consequently, on average we received over 1,000 alerts from thousands of machines on a daily basis, which stressed the operations team. We redesigned the system in late 2018 in a project called Monitoring As Code (MAC), emphasizing version control and automation. MAC manages all the configurations in a Git repository, in the same way as software code. Moreover, it establishes standards so that the configurations are automatically generated and deployed to keep everything in sync. As a result, it reduced the daily average number of alerts by two orders of magnitude.
In the first generation of the data platform, we used HDFS as a cache layer between ETL jobs and the underlying AWS storage service, S3. However, HDFS is not a special-purpose cache service, so custom code is needed to make it work like a cache: we have to run a backup workflow in every ETL job to back up data to S3 and to keep the metadata store of the ETL jobs running on HDFS in sync with that of the interactive analytic queries running directly on S3. Moreover, we rely on complex and fragile mechanisms for purging datasets when the clusters are under heavy load. The use of HDFS also makes it a challenge to rapidly scale up the YARN cluster during peak hours and scale it down during off-hours. We are currently redesigning the data platform, mainly by replacing HDFS with a special-purpose data orchestration service called Alluxio. In our initial evaluation, Alluxio not only provides better performance than HDFS but also significantly simplifies the architecture of our data platform, makes it easier to scale up and down, and paves the way to a cloud-native ETL processing stack.
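For illustration only (not EA's actual pipeline), here is a minimal sketch of what an ETL job might look like once it reads and writes through a cache layer that fronts S3, so a separate backup workflow is no longer needed. It assumes the S3 bucket is mounted into the Alluxio namespace and the Alluxio client jar is on the Spark classpath; all host names and paths are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("etl-through-cache-example").getOrCreate()

# Read raw events through the cache layer; the path is assumed to map to an
# S3 prefix mounted into the Alluxio namespace (placeholder host and paths).
events = spark.read.parquet("alluxio://alluxio-master:19998/datasets/raw_events/")

# A toy aggregation standing in for the real ETL logic.
daily = events.groupBy("event_date", "event_type").count()

# Writing back through the same namespace persists results to S3 via the
# cache, so no separate backup workflow is required.
daily.write.mode("overwrite").parquet(
    "alluxio://alluxio-master:19998/datasets/daily_event_counts/"
)
```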
This Data Orchestration Summit session discusses the challenges of querying diverse data sources at Walmart and how they are tackled using Presto and Alluxio. It covers how Alluxio caching was leveraged to provide consistent, optimized query performance within and across clouds, and also highlights the implementation of critical components of the enterprise acceleration offering, such as security integration for fine-grained access control, auto-scaling, and automated deployment in GCP.
In this panel, creators of open source projects share their stories from why they started the project to the challenges they encountered on the way.
Spark is a widely adopted open source framework that provides a unified interface for analytics and machine learning workloads. Alluxio, which originated in the UC Berkeley AMPLab (the same lab as Spark), is an open source data orchestration platform that empowers compute frameworks like Spark by providing stateful caching for efficient data sharing between jobs, improving resilience against job failures, and bringing data together from many different sources, whether remote HDFS or cloud object stores.
Alluxio partnered with IBM to deliver a Spark-based solution for fast data analytics. With the integration of IBM Spectrum Conductor, an advanced workload and resource management platform that maximizes hardware utilization to speed results and cut infrastructure costs, Alluxio and IBM delivered a solution that powers a leading telecom company's applications supporting 320 million subscribers. In this online meetup, we will present the benefits of the fast analytics stack of Spark on Alluxio and IBM and dive into the telecom's use case of leveraging Spark and Alluxio to process massive amounts of mobile data.
In this online meetup, you will learn about:
- Why leading companies are moving toward a decoupled compute and storage architecture, and the associated challenges and requirements.
- Why Spark and Alluxio together can solve the challenges and fulfill the requirements
- How a leading telecom leverages Spark with Alluxio for fast data processing at scale on top of object stores and HDFS
Using “zero-copy” hybrid bursting with Spark to solve capacity problems
Want to leverage your existing investments in Hadoop with your data on-premise and still benefit from the elasticity of the cloud?
Like other Hadoop users, you most likely operate very large and busy Hadoop clusters, particularly when it comes to compute capacity. Bursting HDFS data to the cloud can bring challenges – network latency impacts performance, copying data via DistCP means maintaining duplicate data, and you may have to make application changes to accommodate the use of S3.
“Zero-copy” hybrid bursting with Alluxio keeps your data on-prem and syncs data to compute in the cloud so you can expand compute capacity, particularly for ephemeral Spark jobs.
In this tech talk, we’ll discuss:
- Approaches to burst data to the cloud
- How Alluxio can enable “zero-copy” bursting of Spark workloads to cloud data services like EMR and Dataproc
- How DBS Bank uses Alluxio to solve for limited on-prem compute capacity by zero-copy bursting Spark workloads to AWS EMR
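A hedged sketch of the zero-copy bursting pattern (not DBS Bank's actual setup): on-prem HDFS is assumed to be mounted into the namespace of an Alluxio cluster running alongside the cloud compute, so an ephemeral Spark job in the cloud reads on-prem data without a DistCP copy. Host names and paths are placeholders:

```python
from pyspark.sql import SparkSession

# Assumed one-time setup on the cloud-side Alluxio cluster (placeholder paths):
#   alluxio fs mount /onprem hdfs://onprem-namenode:8020/warehouse
# After the mount, on-prem data appears under /onprem and blocks are cached in
# the cloud as they are read, instead of being bulk-copied with DistCP.

spark = SparkSession.builder.appName("ephemeral-burst-job").getOrCreate()

# The ephemeral cloud job reads the on-prem dataset through the mount.
orders = spark.read.parquet("alluxio://alluxio-master:19998/onprem/orders/")

# Results can be written back through the same namespace, or to cloud storage.
recent = orders.filter("order_date >= '2020-01-01'")
recent.write.mode("overwrite").parquet(
    "alluxio://alluxio-master:19998/onprem/orders_recent/"
)
```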
The data ecosystem has evolved heavily over the past two decades. There’s been an explosion of data-driven frameworks, such as Presto, Hive, and Spark to run analytics and ETL queries, and TensorFlow and PyTorch to train and serve models. On the data side, the approach to managing and storing data has evolved from HDFS to cheaper, more scalable storage services separated from compute, typified by cloud stores like AWS S3. As a result, data engineering has become increasingly complex and inefficient, particularly in hybrid and cloud environments.
Haoyuan Li offers an overview of a data orchestration layer that provides a unified data access and caching layer for single cloud, hybrid, and multicloud deployments. It enables distributed compute engines like Presto, TensorFlow, and PyTorch to transparently access data from various storage systems (including S3, HDFS, and Azure) while actively leveraging an in-memory cache to accelerate data access.
Many organizations are leveraging Hive to run big data analytics on public cloud. However, reading and writing data to S3 directly can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud, and in this use case it caches data for S3, ensuring high and predictable performance as well as reduced network traffic.
In this Office Hour we’ll go over:
- Bazaarvoice’s use case leveraging Apache Spark, Hive, and Alluxio on S3
- How to set up Hive with Alluxio such that Hive jobs can seamlessly read from and write to S3
- Open Session for discussion on any topics such as solving the separation of compute and storage problem, and more
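A minimal, hedged sketch of the kind of Hive-on-Alluxio setup mentioned above (not Bazaarvoice's actual schema), using Spark SQL against the Hive metastore with a table whose location points at the Alluxio namespace. The table, columns, host, and paths are hypothetical, and the S3 bucket is assumed to be mounted into Alluxio:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-on-alluxio-example")
    .enableHiveSupport()          # assumes a Hive metastore is configured
    .getOrCreate()
)

# The table location points at Alluxio instead of s3://, so queries are served
# from the cache while the underlying data continues to live in S3.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS reviews (
        product_id STRING,
        rating     INT,
        review     STRING
    )
    STORED AS PARQUET
    LOCATION 'alluxio://alluxio-master:19998/warehouse/reviews/'
""")

spark.sql(
    "SELECT product_id, avg(rating) AS avg_rating FROM reviews GROUP BY product_id"
).show()
```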
ING Bank is a multinational financial services company headquartered in Amsterdam with over $1 trillion in assets. As a leading bank, we place a great emphasis on cybersecurity. One aspect of this is security information and event management (SIEM), the process of identifying, monitoring, recording, and analyzing security events or incidents within a real-time IT environment. SIEM requires our data platform to have high and consistent performance, so we use the open source technologies Presto and Alluxio for fast SQL analytics in the cloud.
In this online presentation, we are going to present how ING is leveraging Presto (interactive query), Alluxio (data orchestration & acceleration), S3 (massive storage), and DC/OS (container orchestration) to build and operate our modern Security Analytics & Machine Learning platform. We will share the challenges we encountered and how we solved them. Today we run this platform in several different data centers, and we have reduced our 10+ minute queries to under 10 seconds!
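For illustration only (not ING's actual schema or queries), here is a small sketch of issuing an interactive query from Python against a Presto coordinator whose Hive tables are assumed to point at S3 data accelerated through Alluxio. It assumes the presto-python-client package, and the host, catalog, schema, and table names are placeholders:

```python
import prestodb  # assumes the presto-python-client package is installed

# Placeholder connection details.
conn = prestodb.dbapi.connect(
    host="presto-coordinator.example.internal",
    port=8080,
    user="analyst",
    catalog="hive",
    schema="security_events",
)

cur = conn.cursor()
# A hypothetical SIEM-style query over the last hour of login events.
cur.execute("""
    SELECT event_type, count(*) AS n
    FROM login_events
    WHERE event_time > date_add('hour', -1, now())
    GROUP BY event_type
    ORDER BY n DESC
""")
for event_type, n in cur.fetchall():
    print(event_type, n)
```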
EMR has become a widely used service for running big data analytics in the public cloud. But slow and inconsistent EMR performance on S3 data lakes creates challenges for organizations.
Alluxio is a data orchestration layer for the cloud that increases the performance of analytic workloads running on AWS EMR with S3 as the storage layer.
Join us for this webinar where we will show you how to set up EMR Spark and Hive with Alluxio so jobs can seamlessly read from and write to your S3 data lake. You’ll see the performance gains with Alluxio in your EMR/S3 stack.
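As a hedged configuration sketch (not the exact setup shown in the webinar), the snippet below outlines how an EMR Spark job might be pointed at an Alluxio namespace backed by an S3 data lake; the jar path, host name, and dataset paths are placeholders:

```python
from pyspark.sql import SparkSession

# Assumed submission: the Alluxio client jar must be on the driver and executor
# classpaths, e.g. (placeholder jar path)
#   spark-submit \
#     --conf spark.driver.extraClassPath=/opt/alluxio/client/alluxio-client.jar \
#     --conf spark.executor.extraClassPath=/opt/alluxio/client/alluxio-client.jar \
#     emr_alluxio_example.py

spark = SparkSession.builder.appName("emr-alluxio-example").getOrCreate()

# The S3 data lake is assumed to be mounted into the Alluxio namespace, so the
# job reads and writes alluxio:// paths while the data itself stays in S3.
clicks = spark.read.json("alluxio://alluxio-master:19998/datalake/clickstream/")
(clicks.groupBy("page").count()
       .write.mode("overwrite")
       .parquet("alluxio://alluxio-master:19998/datalake/page_counts/"))
```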