Small (kilobyte-sized) objects are the bane of highly scalable cloud object stores. Larger (at least megabyte-sized) objects not only improve performance but also cost orders of magnitude less, because commodity cloud object stores currently price by operation. For example, in Amazon S3’s current pricing scheme, uploading 1 GiB of data by issuing … Continued
TalkingData, China’s largest data broker, provides data intelligence solutions and processes over 20 terabytes of data and more than one billion session requests per day. TalkingData deployed Alluxio to unify disparate cloud, on-premises, and hybrid data sources for a range of analytics applications. The architecture provides self-service data access for data scientists and engineers, eliminating the … Continued
Alluxio presents a set of disparate data stores as a single file system, greatly reducing the complexity of the storage APIs and semantics exposed to applications. Alluxio is designed with a memory-centric architecture, so applications get memory-speed I/O simply by using Alluxio. Alluxio has been deployed at hundreds of leading companies in production, … Continued
Tencent, based in China, is one of the largest technology companies in the world and a leader in sectors such as social networking, gaming, e-commerce, mobile, and web portals. Tencent News provides a rich, tailored news experience to over 100 million monthly active users. In order to meet the strict Service Level Agreements (SLAs) required … Continued
Enterprises are adopting big data technologies to analyze and derive insight from their growing volumes of structured and unstructured data. A familiar problem is the need to analyze data from multiple independent storage silos concurrently. To consolidate the data, large enterprises typically use custom solutions or build a data lake. These approaches present additional challenges and can be costly and time-consuming.
Alluxio, formerly Tachyon, is a memory-speed virtual distributed storage system that uses memory to store data and to accelerate access to data in different storage systems. Many organizations and deployments use Alluxio with Apache Spark, and some of them scale out to petabytes of data. Alluxio can enable Spark to be even more effective, … Continued
In a real development environment, our customers use ArcGIS to read and write geospatial data to a wide range of distributed data stores, such as Amazon S3, HDFS, or OpenStack Swift, and some of these data stores are not natively supported by the ArcGIS platform…