In this online presentation, we show how ING is leveraging Presto (interactive query), Alluxio (data orchestration & acceleration), S3 (massive storage), and DC/OS (container orchestration) to build and operate our modern Security Analytics & Machine Learning platform. We will share the challenges we encountered and how we solved them.
This event features a user story from ING Bank, a leading financial services company, on how it leverages open source technologies such as Presto and Alluxio with S3.
The goal is to make Alluxio accessible to an even wider set of users through a focus on security, new language bindings, and further stability improvements. In addition, the team is working on new APIs that allow applications to access data more efficiently and to manage data across different under storage systems.
In this presentation, William Callaghan will focus on the challenges faced and lessons learned in building a human-in-the-loop cyber threat analytics pipeline. They will discuss the topic of analytics in cybersecurity and highlight the use of technologies such as Spark Streaming/SQL, Cassandra, Kafka and Alluxio in creating an analytics architecture with mission-critical response times.
Impersonation is the ability for one user to act on behalf of another user. For example, say user ‘yarn’ has the credentials to connect to a service, but user ‘foo’ does not. On its own, user ‘foo’ would never be able to access the service. However, user ‘yarn’ can connect to the service and impersonate (act on behalf of) user ‘foo’, giving user ‘foo’ access. In this way, impersonation enables one user to access a service on behalf of another user.
The impersonation feature defines how users can act on behalf of other users. Therefore, it is important to know who the users are.
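The ‘yarn’ and ‘foo’ example above can be sketched in a few lines of Python. This is only an illustrative model, not Alluxio's implementation or configuration format: the `ImpersonationPolicy` and `Service` classes, their fields, and the `*` wildcard are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ImpersonationPolicy:
    """Hypothetical policy: maps a connecting user to the users it may impersonate."""
    # Example: {"yarn": {"foo"}} means 'yarn' may act on behalf of 'foo'.
    # The "*" wildcard (an assumption of this sketch) means "any user".
    allowed: Dict[str, Set[str]] = field(default_factory=dict)

    def can_impersonate(self, connecting_user: str, target_user: str) -> bool:
        targets = self.allowed.get(connecting_user, set())
        return "*" in targets or target_user in targets


@dataclass
class Service:
    """Hypothetical service that only credentialed users can connect to."""
    credentialed_users: Set[str]
    policy: ImpersonationPolicy

    def effective_user(self, connecting_user: str,
                       impersonate: Optional[str] = None) -> str:
        """Return the user the request runs as, or raise PermissionError."""
        # Step 1: the connecting user must hold credentials for the service.
        if connecting_user not in self.credentialed_users:
            raise PermissionError(f"{connecting_user} cannot connect")
        # Step 2: without impersonation, the request runs as the connecting user.
        if impersonate is None or impersonate == connecting_user:
            return connecting_user
        # Step 3: impersonation must be explicitly allowed by the policy.
        if not self.policy.can_impersonate(connecting_user, impersonate):
            raise PermissionError(
                f"{connecting_user} may not impersonate {impersonate}")
        return impersonate


service = Service(
    credentialed_users={"yarn"},
    policy=ImpersonationPolicy(allowed={"yarn": {"foo"}}),
)
print(service.effective_user("yarn", impersonate="foo"))
```

In this sketch, ‘foo’ connecting directly is rejected at step 1, while ‘yarn’ connecting with `impersonate="foo"` runs as ‘foo’. The policy lookup is keyed by user name, which is why the service has to know who the users are before it can decide anything.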
Lenovo is an Alluxio customer with a problem and use case that are common in the world of data analytics. They have petabytes of data in multiple data centers in different geographic locations. Analyzing it requires an ETL process to get all of the data in the right place. This is both slow, because data has to be transferred across the network, and costly, because multiple copies of the data need to be stored. Freshness and quality of the data can also suffer: the data may be out of date, and it may be incomplete because regulatory issues prevent certain data from being transferred.