Major new enhancements in Alluxio 2.9 include scaling out with tenant-level isolation, simplifying DevOps on Kubernetes, and strengthening the security of the S3 API
SAN MATEO, CA – November 16, 2022 – Alluxio, the developer of the open source data orchestration platform for data-driven workloads such as large-scale analytics and AI/ML, today announced the immediate availability of version 2.9 of its Data Orchestration Platform. The new release strengthens Alluxio's position as the key layer between compute engines and storage systems in three ways: support for a scale-out, multi-tenant architecture through a new cross-environment synchronization feature; enhanced manageability through significantly improved tooling and guidelines for deploying Alluxio on Kubernetes; and improved security and performance through a strengthened S3 API and POSIX API.
“We are running one thousand nodes of Alluxio to optimize model training jobs and interactive queries,” said Peng Chen, Engineer Manager in the Big Data Team, Tencent. “Alluxio has become the de-facto choice for large internet companies to accelerate the development of their data analytics and AI applications. We are excited about the enhanced Kubernetes feature of the new release, which will make managing Alluxio even easier.”
“We have been using Alluxio as the data cache layer on top of multiple data centers to speed up the data access performance,” said Luo Li, Director of Data Infrastructure, Shopee. “Alluxio’s architecture enables us to support data ‘servitization.’ Furthermore, Alluxio has reduced our data infrastructure team’s management overhead, especially for data distributed in multiple data centers, or even across countries.”
“Tenant-dedicated satellite clusters have become more common while architecting data platforms,” said Adit Madan, Director of Product Management, Alluxio. “Alluxio’s ability to actively synchronize metadata across multiple environments is significant, making the adoption of such an architecture easier than ever.”
Tenant isolation provides the scale and economic benefits of a multi-tenant architecture while rigorously preventing different teams from competing for access to shared data lake storage. With the new cross-environment synchronization feature, Alluxio evolves its architecture with significantly improved scalability and manageability, enabling data platform teams to deploy multiple per-tenant Alluxio clusters between compute and storage clusters across any environment, based on workload capacity.
Running Alluxio on Kubernetes helps standardize deployment methodologies across cloud, multi-cloud, hybrid-cloud, and on-premises environments. This new release introduces the Alluxio operator, which simplifies deploying, configuring, provisioning, and managing multiple Alluxio clusters, reducing DevOps complexity. Alluxio on Kubernetes also makes the data stack portable to any environment, preventing vendor lock-in.
Lastly, in Alluxio 2.9, authentication and access policies are centralized in the communication between compute engines and Alluxio via the S3 API. As a result, Alluxio provides a unified security experience across heterogeneous storage, whether on-premises or in the cloud.
“Alluxio’s data orchestration platform aims to simplify, secure, and accelerate data access in heterogeneous analytics environments,” said Kevin Petrie, VP of Research, Eckerson Group. “These v2.9 enhancements seek to give new analytics users, applications, and projects the resources they need, with less effort and higher confidence in meeting SLAs. Alluxio does this by helping enterprises manage metadata, containerized deployments, and the security of its APIs more effectively.”
Alluxio 2.9 Community and Enterprise Editions feature new capabilities, including:
Multi-Environment Cluster Synchronization
Alluxio 2.9 introduces the new cross-environment synchronization feature, which makes one Alluxio cluster aware of another by automatically syncing metadata between them. Alluxio clusters deployed across any environment can therefore achieve tenant-level isolation while keeping their metadata in sync at scale. The feature is particularly useful when adopting a satellite architecture in which compute clusters are segregated across team-level tenants for isolation. With it, a multi-tenant architecture built on Alluxio allows the platform to scale out and onboard new use cases without a central resource bottleneck, ensuring SLAs and simplifying metadata management operations.
Extended Manageability for Kubernetes
Alluxio 2.9 adds the Alluxio operator for Kubernetes. Administrators can now deploy and manage Alluxio on Kubernetes through the operator and its custom resource definitions (CRDs). The operator offers configuration management for deployment, connections to under storage, configuration updates, and uninstallation. Using the Alluxio operator removes the burden of deploying Alluxio in different environments, greatly reduces the amount of manual work, and simplifies DevOps when managing multiple instances of Alluxio.
Enhanced S3 API Security with Better User Experience
Alluxio 2.9 further strengthens its S3 API, providing applications with a unified security model and a better user experience. By adopting the open authentication protocol for the S3 API, Alluxio verifies users before their requests are processed. This new feature allows data platform teams to connect to more advanced identity management systems (such as PingFederate) and leverage single sign-on (SSO) to enhance the user experience. With a uniform authentication and authorization model, applications connected to Alluxio are portable across on-premises, hybrid, and multi-cloud environments.
Free downloads of Alluxio 2.9 open source Community Edition and trials of Alluxio Enterprise Edition are immediately available here: https://www.alluxio.io/download/
- Check out these case studies to learn more about the multi-environment data platform architecture:
  - Expedia Group has implemented Alluxio to federate cross-region data lakes in AWS. Alluxio unifies geo-distributed data silos without replication, enabling consistent, high performance while reducing costs by about 50%: https://www.alluxio.io/blog/unifying-cross-region-access-in-the-cloud-at-expedia-group-the-path-toward-data-mesh-in-the-brand-world/
  - A Fortune 50 technology company has successfully implemented Alluxio to achieve a hybrid-cloud strategy, become multi-cloud ready, cut costs, and boost agility: https://www.alluxio.io/app/uploads/2022/10/Fortune-50-Case-Study-2-pager.pdf
- Join the webinar to get a deep dive into Alluxio 2.9: https://us06web.zoom.us/webinar/register/WN_OhwAKADTQbenc9AZIErXFA
- Come visit Alluxio at PrestoCon:
  - Register now: https://events.linuxfoundation.org/prestocon/register/
  - Join the session from 1:45 pm to 2:20 pm PT on Thursday, December 8: https://sched.co/1CzYl
Tweet this: @AlluxioIO re-imagines architecture for multi-tenant environments at scale #cloud #opensource #analytics #AI https://bit.ly/3NqQIub
Proven at global web scale in production for modern data services, Alluxio is the developer of open source data orchestration software for the cloud. Alluxio moves data closer to data analytics and machine learning compute frameworks across clusters, regions, and clouds, providing memory-speed data access to files and objects. Intelligent data tiering and caching deliver greater performance and reliability to customers in financial services, high tech, retail, and telecommunications. Alluxio is in production use today at eight out of the top ten internet companies. Alluxio is venture-backed by Andreessen Horowitz, Seven Seas Partners, Volcanic Ventures, and Hillhouse Capital, and was founded at UC Berkeley’s AMPLab by the creators of the Tachyon open source project. For more information, contact email@example.com or follow us on LinkedIn or Twitter.
Winkowski Public Relations, LLC for Alluxio