Alluxio Featured Speaker and Bronze Sponsor at Data + AI Summit

AITechPark

Session will present data caching strategies for data analytics and AI; attendees can see a demonstration of Alluxio's open source data platform in booth #29

Alluxio, the developer of the open source data platform that simplifies the management of large-scale analytics and AI/ML applications, today announced its participation at Data + AI Summit, taking place June 26 – 29, 2023 at Moscone Center, San Francisco, CA and virtually. Alluxio will also showcase its open source data platform in booth #29 during the event.

Navigating Cloud Costs and Egress: Insights on Enterprise Cloud Conversations

Cloud Data Insights

Cloud conversations are evolving. Where there was once optimism that the cloud offered “the solution” to digital transformation challenges, more companies now consider it to be merely one tool in the toolbox. As such, we’re seeing more willingness to dive into hard conversations about cloud costs and the types of cloud architectures best for enterprise needs. In this interview, Adit Madan, Director of Product Management at Alluxio, discusses with Elisabeth Strenger of CDInsights the changing dynamics of cloud conversations at the enterprise level, with a focus on the increased attention given to cloud costs and spending predictability.

Controlling Cloud Egress Fees With Hybrid Cloud Data Access | Alluxio

TFiR

Storing data in the cloud is cheap. Cloud providers keep storage prices low to incentivize enterprises to move all their data to the cloud so that they can use the providers' compute services. However, every time data moves across cloud regions or out of the cloud (when accessed by on-premises data centers or by a different cloud), cloud providers charge an egress fee based on the amount of traffic that moves across the network.

DBTA 100 2023: The Companies That Matter Most in Data

Database Trends & Applications

The need to balance data safety with new data initiatives, deliver business value, and change company culture around data tops this year’s list of data and analytics management challenges.

How to Orchestrate Data for Machine Learning Pipelines

Hackernoon

Machine learning (ML) workloads require efficient infrastructure to yield rapid results. Model training relies heavily on large data sets. Funneling this data from storage to the training cluster is the first step of any ML workflow, and it significantly impacts the efficiency of model training. This article will discuss a new solution for orchestrating data for end-to-end machine learning pipelines. I will outline common challenges and pitfalls, then propose a new technique, data orchestration, to optimize the data pipeline for machine learning.

Heard on the Street – 5/15/2023

InsideBigData

Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!

Storage news ticker – May 12

Blocks & Files

Alluxio has published a Presto Optimization Handbook; Presto is a distributed query engine for data analytics. For customers using Trino (formerly PrestoSQL), there is also The Trino Optimization Handbook.