Facebook is pushing forward with open sourcing the applications it uses, and today released the code for its RocksDB low-latency embedded persistent key-value database.
RocksDB attaches directly to application servers, which reduces latency compared with accessing data stored on a networked database, according to Facebook Hadoop engineer Dhruba Borthakur.
While RocksDB is based on Google's open source LevelDB, Facebook found that the latter did not perform well when the entire database couldn't fit into main memory.
Facebook modified LevelDB and improved the write rate ten-fold through multi-threaded data compaction that uses all available processor cores.
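The idea behind multi-threaded compaction can be illustrated with a minimal Python sketch. This is not RocksDB's actual C++ implementation: it assumes a hypothetical layout in which the key space is split into disjoint ranges, each holding several sorted runs ordered oldest to newest, so that every range can be compacted by its own worker. The names `compact_range` and `parallel_compact` are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def compact_range(runs):
    """Merge sorted runs (oldest first) into one sorted run; the newest value wins."""
    merged = {}
    for run in runs:  # later (newer) runs overwrite earlier ones
        for key, value in run:
            merged[key] = value
    return sorted(merged.items())


def parallel_compact(ranges, workers=4):
    """Compact each disjoint key range on its own worker thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compact_range, ranges))


# Hypothetical data: two key ranges, each with an older and a newer run.
ranges = [
    [[("a", 1), ("c", 1)], [("a", 2), ("b", 2)]],
    [[("x", 1)], [("x", 3), ("z", 3)]],
]
compacted = parallel_compact(ranges)
```

In CPython the threads mostly illustrate the structure, since the interpreter lock limits CPU-bound parallelism; a native library can run such workers on separate cores.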
Thread-aware compaction reduced application stalls, in which latencies spiked to tens of seconds, and Facebook also substantially reduced read and write amplification in RocksDB.
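Write amplification is commonly defined as the ratio of bytes physically written to storage to the bytes the application asked to write, and read amplification is the analogous ratio for reads. The short Python sketch below shows the arithmetic. The byte counts are made-up illustrative figures, not measurements from RocksDB or LevelDB, and the function name `amplification` is hypothetical.

```python
def amplification(physical_bytes, logical_bytes):
    """Ratio of bytes moved at the storage layer to bytes requested by the application."""
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return physical_bytes / logical_bytes


# Hypothetical figures: the application writes 100 MiB, but repeated
# rewriting during compaction causes 1000 MiB to reach storage.
app_written = 100 * 1024**2
storage_written = 1000 * 1024**2
write_amp = amplification(storage_written, app_written)
```

A lower ratio means less storage bandwidth spent per byte of useful work.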
Coders at the social network also broke RocksDB up into a pluggable architecture so that, for instance, different data compression modules can be used without any changes to the code itself, Borthakur wrote.
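A pluggable compression design of this kind might look roughly like the following Python sketch. It is an illustration under stated assumptions, not RocksDB's interface: the `CODECS` registry, the `Store` class and the `round_trip` helper are hypothetical names, and the codecs come from Python's standard library rather than from RocksDB.

```python
import gzip
import zlib

# Hypothetical registry: codec name -> (compress, decompress) pair.
CODECS = {
    "none": (lambda data: data, lambda data: data),
    "zlib": (zlib.compress, zlib.decompress),
    "gzip": (gzip.compress, gzip.decompress),
}


class Store:
    """Toy in-memory key-value store whose compression codec is chosen at construction."""

    def __init__(self, codec="none"):
        self._compress, self._decompress = CODECS[codec]
        self._data = {}

    def put(self, key, value):
        # Values are compressed on the way in.
        self._data[key] = self._compress(value)

    def get(self, key):
        # Values are decompressed on the way out.
        return self._decompress(self._data[key])


def round_trip(codec):
    """Write and read back one value using the named codec."""
    store = Store(codec)
    store.put(b"k", b"value " * 100)
    return store.get(b"k")
```

Swapping the codec only changes the name passed to `Store`; the `put` and `get` code paths stay the same.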
Thanks to the code changes, RocksDB scales linearly with the number of processors and can utilise 64 or more cores. It also scales linearly with increased storage IOPS (input/output operations per second) and can take advantage of fast Flash memory as well as dynamic and non-volatile RAM.
Uses for RocksDB include caching data from Hadoop for real-time queries and serving applications that need fast access to big data sets.
Facebook warned that, as a machine-attached database, RocksDB is not distributed and has no failover mechanism or high-availability features. If the system it is attached to goes down, RocksDB will lose the data stored in it.
The code for RocksDB is available on GitHub as a C++ library.