As the AI landscape rapidly evolves, advances in generative AI technologies such as ChatGPT are driving the need for robust data infrastructure tailored to large language model (LLM) training and inference in the cloud. To take full advantage of breakthroughs in LLMs, organizations must ensure low latency, high concurrency, and scalability in production environments.
In this Alluxio-hosted webinar, Shouwei will present the design and implementation of a distributed caching system that addresses the I/O challenges of LLM training and inference. He will explore the data access patterns unique to these workloads and offer practical best practices for optimizing the data pipeline through distributed caching in the cloud. The session will feature insights from real-world examples at organizations such as Microsoft, Tencent, and Zhihu, as well as from the open-source community. Attendees will leave with a deeper understanding of how to build scalable, efficient, and robust data infrastructure for LLM training and inference.
Dr. Shouwei Chen is a core maintainer and product manager of open-source Alluxio. Before joining Alluxio, Shouwei received his Ph.D. from Rutgers University. Shouwei’s research focuses on the co-design of memory-centric computing frameworks with in-memory distributed file systems in large-scale environments.