Which EC2 instance types are recommended for running Alluxio with applications like Presto or Spark? Does it make a big difference to use EBS volumes with provisioned IOPS?
Some compute frameworks, such as Apache Spark, have single-node caching. How is Alluxio different from single-node caching?
How does Cloudera’s hybrid cloud approach work and how does it compare with Alluxio’s “zero-copy” bursting approach?
If you have a hybrid cloud architecture that uses either a VPN or a dedicated high-speed circuit, does network speed become a bottleneck in the hybrid data use case?
How does replication in Alluxio happen across worker nodes? Is the unit of replication a file or a block?