At PayPal, as at any other data-driven enterprise, data users and applications work with a variety of data sources (RDBMS, NoSQL, Messaging, Documents, Big Data, Time Series Databases), compute engines (Spark, Flink, Beam, Hive), languages (Scala, Python, SQL), and execution models (stream, batch, interactive) to process petabytes of data. Because of this complex matrix of technologies and thousands of datasets, engineers spend considerable time learning different data sources, formats, programming models, APIs, and optimizations, which lengthens time-to-market (TTM).
To solve this problem and make product development more effective, PayPal Data Platforms developed "Gimel", an open-source, unified analytics data platform that provides access to any storage through a single unified data API and SQL, both powered by a centralized data catalog.
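To make the idea of a catalog-backed unified data API concrete, here is a minimal sketch in Python. It is not Gimel's actual API: the `CatalogEntry` and `UnifiedDataAPI` classes, the reader functions, and the dataset names are all hypothetical and exist only for illustration. The sketch assumes the catalog maps a logical dataset name to a storage type plus connection properties, so the caller never deals with storage-specific code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass
class CatalogEntry:
    """Hypothetical catalog record: where a logical dataset lives."""
    storage_type: str
    properties: Dict[str, str] = field(default_factory=dict)


class UnifiedDataAPI:
    """Hypothetical facade: one read() call for every storage type."""

    def __init__(self, catalog: Dict[str, CatalogEntry]) -> None:
        self._catalog = catalog
        self._readers: Dict[str, Callable[[Dict[str, str]], List[Row]]] = {}

    def register_reader(
        self, storage_type: str, reader: Callable[[Dict[str, str]], List[Row]]
    ) -> None:
        self._readers[storage_type] = reader

    def read(self, dataset_name: str) -> List[Row]:
        # Look up the dataset in the catalog, then dispatch to the
        # reader registered for its storage type.
        entry = self._catalog[dataset_name]
        return self._readers[entry.storage_type](entry.properties)


def read_csv_text(props: Dict[str, str]) -> List[Row]:
    """Toy reader standing in for a file or table source."""
    header, *lines = props["text"].splitlines()
    columns = header.split(",")
    return [dict(zip(columns, line.split(","))) for line in lines]


def read_kv_text(props: Dict[str, str]) -> List[Row]:
    """Toy reader standing in for a key-value or messaging source."""
    return [
        dict(pair.split("=") for pair in line.split(";"))
        for line in props["text"].splitlines()
    ]


# Hypothetical catalog with two datasets held in different "storages".
catalog = {
    "sales.orders": CatalogEntry("csv", {"text": "id,amount\n1,10\n2,25"}),
    "events.clicks": CatalogEntry("kv", {"text": "id=7;page=home\nid=8;page=cart"}),
}

api = UnifiedDataAPI(catalog)
api.register_reader("csv", read_csv_text)
api.register_reader("kv", read_kv_text)

# The caller uses the same call regardless of the underlying storage.
orders = api.read("sales.orders")
clicks = api.read("events.clicks")
```

In this sketch, supporting a new storage type means registering one more reader and adding catalog entries; code that calls `read` stays unchanged.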
Videos
In this talk, Ojus Save walks you through a demo of how to build AI applications on Zoom. The demo shows an AI agent that receives transcript data from RTMS and then decides whether to create action items based on the transcripts it receives.
In this talk, Sandeep Joshi, Senior Manager at NVIDIA, shares how to accelerate data access between GPU and storage for AI. Sandeep will dive into two options: CPU-initiated GPUDirect Storage and GPU-initiated SCADA.
Bin Fan, VP of Technology at Alluxio, introduces how Alluxio, a software layer that sits transparently between applications and S3 (or other object stores), provides a sub-millisecond time-to-first-byte (TTFB) solution with up to 45x lower latency.