Rocker - rocksdb driver for Erlang

Introduction


The Internet is full of information and debate about choosing between the SQL and NoSQL approaches, as well as about the pros and cons of one KV store or another. What you are reading now is neither a guide to rocksdb nor a pitch for this storage engine or my driver for it. I would like to share an intermediate result of my work on optimizing the NIF development process for Erlang. This article presents a working driver for rocksdb, developed over a couple of evenings.


In one of our projects, we faced the problem of reliably processing a large volume of events. Each event takes from 50 to 350 bytes, and more than 80 million events are generated per node per day. Note that the resiliency of message delivery to the nodes is out of scope here. One more processing constraint is that a group of events must be changed atomically and consistently.
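To get a feel for these numbers, here is a small sketch of the arithmetic. The module and function names are made up for illustration, and the result covers only the raw event payload, without keys or storage overhead.

```erlang
-module(event_load).
-export([estimate/3]).

%% Estimate the average event rate and the daily payload volume for one node.
%% EventsPerDay, MinBytes and MaxBytes come from the figures quoted in the text.
estimate(EventsPerDay, MinBytes, MaxBytes) ->
    SecondsPerDay = 24 * 60 * 60,
    #{events_per_second => EventsPerDay / SecondsPerDay,
      min_bytes_per_day => EventsPerDay * MinBytes,
      max_bytes_per_day => EventsPerDay * MaxBytes}.
```

Calling event_load:estimate(80000000, 50, 350) gives an average of roughly 926 events per second and between 4,000,000,000 and 28,000,000,000 bytes (4 to 28 GB) of payload per node per day.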


Thus, the main requirements for the driver are:


  1. Reliability
  2. Performance
  3. Security (in a canonical sense)
  4. Functionality:
    • All basic kv functions
    • Column families
    • Transactions
    • Data compression
    • Support for flexible storage configuration
  5. Minimum code base

Review of existing solutions



After a cursory analysis of the existing Erlang drivers for rocksdb, it became clear that none of them fully met the project requirements. erlang-rocksdb could have been used, but a couple of free evenings came up, and after successfully developing and deploying a Bloom filter in Rust, I got curious: is it possible to meet all the requirements of the current project and implement most of the functions in a NIF in a short period of time?


Rocker


Rocker is a NIF for Erlang that uses the Rust wrapper for rocksdb. Its key features are security, performance, and a minimal code base. Keys and data are stored in binary form, which imposes no restrictions on the storage format. At the moment, the project is suitable for use in third-party solutions.
The source code is in the project repository.


API Overview


Opening the database


The database can be used in two modes:


  1. Common key space. In this mode, all your keys are placed in one set. Rocksdb lets you flexibly configure storage options for the task at hand. Depending on them, the database can be opened in two ways:


    • using the standard set of options


      rocker:open_default(<<"/project/priv/db_default_path">>) -> {ok, Db}. 

      The result of this operation is a handle for working with the database, and the database is locked against any other attempts to open it. The lock is released automatically as soon as this handle is cleared.


    • or set options for the task

       {ok, Db} = rocker:open(<<"/project/priv/db_path">>, #{
           create_if_missing => true,
           set_max_open_files => 1000,
           set_use_fsync => false,
           set_bytes_per_sync => 8388608,
           optimize_for_point_lookup => 1024,
           set_table_cache_num_shard_bits => 6,
           set_max_write_buffer_number => 32,
           set_write_buffer_size => 536870912,
           set_target_file_size_base => 1073741824,
           set_min_write_buffer_number_to_merge => 4,
           set_level_zero_stop_writes_trigger => 2000,
           set_level_zero_slowdown_writes_trigger => 0,
           set_max_background_compactions => 4,
           set_max_background_flushes => 4,
           set_disable_auto_compactions => true,
           set_compaction_style => universal
       }).

  2. Split into several key spaces. Keys are stored in so-called column families, and each column family can have its own options. Consider an example of opening a database with the standard options for all column families:

     {ok, Db} = case rocker:list_cf(BookDbPath) of
         {ok, CfList} ->
             rocker:open_cf_default(BookDbPath, CfList);
         _Else ->
             CfList = [],
             rocker:open_default(BookDbPath)
     end.

Removing the database


To remove the database correctly, call rocker:destroy(Path). The database must not be in use at that moment.


Recovering the database after a failure


In the event of a system failure, the database can be restored using the rocker:repair(Path) method. This process consists of 4 steps:


  1. finding the files
  2. restoring tables by replaying the WAL
  3. extracting the metadata
  4. writing the descriptor

Creating a column family


 Cf = <<"testcf1">>, rocker:create_cf_default(Db, Cf) -> ok. 

Removing a column family


 Cf = <<"testcf1">>, rocker:drop_cf(Db, Cf) -> ok. 

CRUD operations


Writing data by key

 rocker:put(Db, <<"key">>, <<"value">>) -> ok. 

Reading data by key

 rocker:get(Db, <<"key">>) -> {ok, <<"value">>} | notfound 

Deleting data by key

 rocker:delete(Db, <<"key">>) -> ok. 

Writing data by key within CF

 rocker:put_cf(Db, <<"testcf">>, <<"key">>, <<"value">>) -> ok. 

Reading data by key within CF

 rocker:get_cf(Db, <<"testcf">>, <<"key">>) -> {ok, <<"value">>} | notfound 

Deleting data by key within CF

 rocker:delete_cf(Db, <<"testcf">>, <<"key">>) -> ok 

Iterators


As you know, one of the basic principles of rocksdb is that keys are stored in sorted order. This feature is very useful in real-world tasks. To use it, we need data iterators. Rocksdb offers several modes of walking through the data; detailed code examples can be found in the tests.



It is worth noting that these modes also work for iterating over data stored in column families.
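To illustrate why sorted keys are useful, here is a minimal sketch that uses a standard ets ordered_set table as a stand-in for rocksdb. It does not call Rocker at all; the module name and sample keys are made up. Since keys are sorted, all keys sharing a prefix sit next to each other, so a scan can start at the prefix and stop at the first mismatch.

```erlang
-module(ordered_keys_demo).
-export([prefix_scan/2, demo/0]).

%% Collect all {Key, Value} pairs whose binary key starts with Prefix.
%% Tab must be an ets ordered_set table with binary keys.
prefix_scan(Tab, Prefix) ->
    %% Start at the prefix itself if it is stored as a key,
    %% otherwise at the first key that sorts after it.
    First = case ets:member(Tab, Prefix) of
                true  -> Prefix;
                false -> ets:next(Tab, Prefix)
            end,
    scan(Tab, First, Prefix, byte_size(Prefix), []).

scan(_Tab, '$end_of_table', _Prefix, _Len, Acc) ->
    lists:reverse(Acc);
scan(Tab, Key, Prefix, Len, Acc) ->
    case Key of
        <<Prefix:Len/binary, _/binary>> ->
            [{Key, Value}] = ets:lookup(Tab, Key),
            scan(Tab, ets:next(Tab, Key), Prefix, Len, [{Key, Value} | Acc]);
        _ ->
            %% Keys are sorted, so the first mismatch ends the scan.
            lists:reverse(Acc)
    end.

%% Insert keys in arbitrary order and read back those with prefix "aaa".
demo() ->
    Tab = ets:new(demo, [ordered_set]),
    true = ets:insert(Tab, [{<<"aab1">>, 1}, {<<"aaa2">>, 2},
                            {<<"abc1">>, 3}, {<<"aaa1">>, 4}]),
    Result = prefix_scan(Tab, <<"aaa">>),
    ets:delete(Tab),
    Result.
```

ordered_keys_demo:demo() returns the two "aaa" keys in sorted order, regardless of the order in which they were inserted.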


Create an iterator

 rocker:iterator(Db, {'start'}) -> {ok, Iter}. 

Iterator check

 rocker:iterator_valid(Iter) -> {ok, true} | {ok, false}. 

Create an iterator for CF

 rocker:iterator_cf(Db, Cf, {'start'}) -> {ok, Iter}. 

Create a prefix iterator

The prefix iterator requires the prefix length to be specified explicitly when the database is opened.


 {ok, Db} = rocker:open(Path, #{ prefix_length => 3 }). 

An example of creating an iterator with the prefix “aaa”:


 {ok, Iter} = rocker:prefix_iterator(Db, <<"aaa">>). 

Create a prefix iterator for CF

As with the previous prefix iterator, an explicit prefix_length is required for the column family.


 {ok, Iter} = rocker:prefix_iterator_cf(Db, Cf, <<"aaa">>). 

Getting the next item

The method returns the next key/value pair, or ok if the iterator is exhausted.


 rocker:next(Iter) -> {ok, <<"key">>, <<"value">>} | ok 
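A typical way to consume such an iterator is a small recursive loop. The sketch below is not part of Rocker; the module name is made up. It takes a zero-arity fun with the same return shape as rocker:next/1, so the loop itself has no dependency on the driver.

```erlang
-module(iter_drain).
-export([drain/1]).

%% Call Next() repeatedly and collect {Key, Value} pairs until it returns ok.
%% Next is a zero-arity fun with the same return shape as rocker:next/1.
drain(Next) when is_function(Next, 0) ->
    drain(Next, []).

drain(Next, Acc) ->
    case Next() of
        {ok, Key, Value} -> drain(Next, [{Key, Value} | Acc]);
        ok               -> lists:reverse(Acc)
    end.
```

With Rocker this could be used as Pairs = iter_drain:drain(fun() -> rocker:next(Iter) end), where Iter comes from rocker:iterator(Db, {'start'}). Note that the sketch keeps every pair in memory, so for a large database it is better to process each pair inside the loop instead of accumulating them.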

Transactions


It is quite common to need changes to a group of keys to be written simultaneously. Rocker lets you combine CRUD operations both within the common key set and within CFs.
This example illustrates working with transactions:


 {ok, 6} = rocker:tx(Db, [
     {put, <<"k1">>, <<"v1">>},
     {put, <<"k2">>, <<"v2">>},
     {delete, <<"k0">>, <<"v0">>},
     {put_cf, Cf, <<"k1">>, <<"v1">>},
     {put_cf, Cf, <<"k2">>, <<"v2">>},
     {delete_cf, Cf, <<"k0">>, <<"v0">>}
 ]).
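In practice the operation list is usually built from a batch of events rather than written by hand. Here is a minimal sketch of such helpers; the module and function names are made up, and only the put and put_cf tuple shapes from the example above are used.

```erlang
-module(tx_ops).
-export([put_ops/1, put_cf_ops/2]).

%% Turn a list of {Key, Value} pairs into put operations for rocker:tx/2.
put_ops(Pairs) ->
    [{put, Key, Value} || {Key, Value} <- Pairs].

%% Same, but targeting the column family Cf.
put_cf_ops(Cf, Pairs) ->
    [{put_cf, Cf, Key, Value} || {Key, Value} <- Pairs].
```

The resulting lists can be concatenated and passed on, for example rocker:tx(Db, tx_ops:put_ops(Events) ++ tx_ops:put_cf_ops(Cf, Events)). In the example above the number in the result matches the length of the operation list, which suggests it is the count of applied operations; treat that as an assumption and check the tests.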

Performance


The test suite includes a performance test. It shows about 30k RPS for writes and 200k RPS for reads on my machine. In real conditions, you can expect 15-20k RPS for writes and about 120k RPS for reads, with an average data size of about 1 KB per key and more than 1 billion keys in total.
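If you want to take a rough measurement on your own hardware, a sketch along these lines may help. It is not the test from the Rocker test suite; the module name is made up, and it simply times N sequential calls with timer:tc/1.

```erlang
-module(rps_bench).
-export([measure/2]).

%% Run Fun(I) for I = 1..N sequentially and return the calls per second.
measure(Fun, N) when is_function(Fun, 1), is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> lists:foreach(Fun, lists:seq(1, N)) end),
    %% Guard against a zero reading on very fast runs.
    N * 1000000 / max(Micros, 1).
```

For writes this could be called as rps_bench:measure(fun(I) -> ok = rocker:put(Db, integer_to_binary(I), <<"value">>) end, 100000). A single sequential loop like this ignores concurrency and warm-up, so the result is only a ballpark figure.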


Conclusion


The development and application of Rocker in our project allowed us to reduce the response time of the system, increase reliability, and reduce the restart time. These advantages were obtained with minimal development and implementation costs.


Personally, I concluded that Rust is an optimal fit for Erlang projects that require optimization. In Erlang, 95% of the code can be implemented quickly and efficiently, while the 5% that slows the system down can be rewritten or added in Rust without reducing overall system reliability.


PS I also have positive experience developing a NIF for arbitrary-precision arithmetic in Erlang, which could become a separate article. I would like to ask: is the topic of NIFs interesting to the Rust community?

Source: https://habr.com/ru/post/413353/

