Tachyon 0.5.0 Release

We are happy to announce Tachyon v0.5.0, a major release with lots of stability improvements, new features, and a number of bug fixes!

Downloads for the release can be found on the Downloads page.

Separated Jars and Improved Application Testing Support

A top priority for Tachyon is better support for the frameworks and applications that run on top of it. This release makes two major changes toward that goal. First, we separated out a tachyon-client jar, which has fewer dependencies than the whole Tachyon project. This can significantly reduce the risk of dependency conflicts ("jar hell") in an application. Second, we separated out a tachyon-test jar so that applications can run a local Tachyon cluster in their unit tests.

UnderFS Integration Testing Infrastructure

Tachyon aims to accelerate different kinds of under file systems. To provide solid integration, we added infrastructure for testing the system against different under file systems. For example, a user can run mvn -Dtest.profile=hdfs -Dhadoop.version=2.4.0 test to test Tachyon with Apache HDFS 2.4.0 as its under file system, or mvn -Dtest.profile=local test to test Tachyon with the local file system as its under file system.

Pin/UnPin Feature

Pinning and unpinning files or folders is an important usage scenario. For example, a workload's frequently accessed datasets may change from day to day. In this release, we added pin and unpin operations. Applications can use this API to explicitly keep files or folders in memory.
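
To make the idea concrete, here is a minimal sketch of how pinning can interact with eviction in an in-memory cache. It is an illustration only, not Tachyon's implementation or API: the class PinnedLruCache, its methods, and the choice of least-recently-used eviction are all assumptions made for this example, and capacity is counted in entries rather than bytes to keep it short.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is NOT Tachyon's implementation or API.
class PinnedLruCache {
    private final int capacity; // maximum number of cached files (assumption: count, not bytes)
    // accessOrder=true makes iteration run from least to most recently used.
    private final LinkedHashMap<String, byte[]> entries =
        new LinkedHashMap<>(16, 0.75f, true);
    private final Set<String> pinned = new HashSet<>();

    PinnedLruCache(int capacity) {
        this.capacity = capacity;
    }

    void pin(String path) {
        pinned.add(path);
    }

    void unpin(String path) {
        pinned.remove(path);
    }

    boolean contains(String path) {
        return entries.containsKey(path);
    }

    byte[] get(String path) {
        return entries.get(path); // also marks the entry as recently used
    }

    // Caches a file, evicting least recently used unpinned files when over capacity.
    // Returns false (and does not cache the file) if no room can be made
    // because every other entry is pinned.
    boolean put(String path, byte[] data) {
        entries.put(path, data);
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (entries.size() > capacity && it.hasNext()) {
            String candidate = it.next().getKey();
            if (!pinned.contains(candidate) && !candidate.equals(path)) {
                it.remove(); // pinned entries and the new entry are never evicted here
            }
        }
        if (entries.size() > capacity) {
            entries.remove(path);
            return false;
        }
        return true;
    }
}
```

In this sketch, a pinned path survives any number of later insertions, and once it is unpinned it becomes an ordinary eviction candidate again.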

GlusterFS Support

Users run Tachyon with different under file systems, such as Apache HDFS and S3, and people have also asked about running Tachyon on other under file systems. Starting with this release, Tachyon runs out of the box on top of GlusterFS. Support for other file systems is on the way. Please let us know your needs on the user list.

Readable Journal Log

The Tachyon write-ahead journal is vital to the reliability of the system. Starting with version 0.5.0, Tachyon uses JSON in its journaling system. This significantly improves the readability of the journal log and therefore makes the system easier to reason about. It also makes future features such as transparent upgrade easier to implement and less error prone. In addition, code optimizations improved journaling performance.
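
As a rough illustration of why a JSON journal is easy to read, the sketch below serializes one journal entry as a single JSON line. The class JsonJournalEntry and the field names transactionId, operation, and parameters are hypothetical and chosen for this example; they are not Tachyon's actual journal format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; field names and layout are NOT Tachyon's actual journal format.
class JsonJournalEntry {
    private final long transactionId;
    private final String operation;
    // LinkedHashMap keeps parameters in insertion order, so output is deterministic.
    private final Map<String, String> parameters = new LinkedHashMap<>();

    JsonJournalEntry(long transactionId, String operation) {
        this.transactionId = transactionId;
        this.operation = operation;
    }

    JsonJournalEntry param(String key, String value) {
        parameters.put(key, value);
        return this;
    }

    // Escapes characters that must not appear raw inside a JSON string.
    private static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Renders the entry as one line of JSON, suitable for appending to a log file.
    String toJson() {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"transactionId\":").append(transactionId)
          .append(",\"operation\":\"").append(escape(operation)).append('"')
          .append(",\"parameters\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : parameters.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append("}}").toString();
    }
}
```

For example, new JsonJournalEntry(7, "CREATE_FILE").param("path", "/data/a").toJson() yields {"transactionId":7,"operation":"CREATE_FILE","parameters":{"path":"/data/a"}}, which can be inspected with any text tool.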

Other Improvements

  • Documentation and Javadoc improvements
  • GetLocalFileName API for advanced users to access in-memory data
  • Load UnderFS tool improvement
  • CLI improvements
  • Various bug fixes
  • Metadata operation performance improvement
  • Scripts improvements

Looking forward: What’s coming in 0.6.0?

  • More UnderFS Integration (Red Hat)
  • Hierarchical Local Storage (Intel)
  • Performance improvement (Yahoo)
  • Many more exciting features from AMPLab and industry collaborators

Recent News

  • Tachyon has a new logo!
  • Apache Spark 1.0.0 uses Tachyon as its default off-heap storage solution.
  • More than 40 people from over 15 companies have contributed to the project! Various companies have committed full-time employees to work on Tachyon.
  • Besides being deployed at different companies, Tachyon also runs in the xPatterns Big Data Analytics Platform, where it is used to share data between Apache Spark and H2O.
  • The project is featured on AMPLab’s homepage.
  • We formed a Tachyon meetup group in the Bay Area to facilitate communication among Tachyon users and developers. Stay tuned for our first event!

Acknowledgements

We would like to thank David Capwell, Chao Chen, Cheng Chang, Huamin Chen, Timothy St. Clair, Aaron Davidson, Qianhao Dong, Ali Ghodsi, Manu Goyal, Rong Gu, Calvin Jia, Cheng Hao, Grace Huang, Lukasz Jastrzebski, Anurag Khandelwal, Nick Lanham, Du Li, Haoyuan Li, Raymond Liu, Colin Patrick McCabe, Robert Metzger, Henry Saputra, Joseph Tang, Fei Wang, Tao Wang, Pengfei Xuan, and Gerald Zhang for their contributions to this release.