Hello,
As a follow-up to this post, I was curious about comparing the performance of the StableBTreeMap and SQLite used as pure key-value stores. Essentially, I am comparing StableBTreeMap against the indexing mechanism of SQLite, which is also a BTree.
In this test setup I was filling up some simple data: a u64 ID is mapping to a user name (just a payload of a 100 bytes String).
My benchmark setup was to first insert 1M elements with the increasing IDs, then benchmark the sequential insertion of more elements in the middle of that set.
The table shows the corresponding instruction costs, I have also added estimations for the in-memory journal and the journaling switched off entirely:
| New Elements | StableBTree | SQLite | SQLite In-Memory J | SQLite No Journaling |
|---|---|---|---|---|
| 1 | 49359 | 407462 | 202686 | 149520 |
| 10 | 49359 | 407462 | 202686 | 149520 |
| 50 | 256551 | 453089 | 248313 | 195171 |
| 100 | 548249 | 529542 | 324840 | 271660 |
| 1000 | 5706594 | 2334600 | 2064459 | 1993575 |
| 10000 | 57301658 | 20441029 | 18909213 | 18612903 |
Observations:
- BTreeMap works faster for inserting a small number of new elements. SQLite is slower at first, but then it scales a bit better if you need to insert more than a 100 elements in one go.
- SQLite setup with the In Memory journal is faster, but that benefit is not as pronounced as the number of inserted elements grows.
- If you are always inserting at the end of the BTree (the ID is always bigger than any other ID used before), BTreeMap slows down, while SQLite speeds up for some reason.
- It was interesting to see how much time we can win without journaling. In general, switching off journaling is not a safe or recommended setting, it also disallows undoing your transactions.
For reference, see the benchmark repository.
Setup Summary:
BTreeMap setup:
static MAP: RefCell<BTreeMap<u64, String, VirtualMemory<DefaultMemoryImpl>>> = RefCell::new(
BTreeMap::init(
MEMORY_MANAGER.with(|m| m.borrow().get(MemoryId::new(0))),
)
);.
SQLite setup:
CREATE TABLE users ( id INTEGER PRIMARY KEY AUTOINCREMENT, username TEXT NOT NULL)
The insertion logic is implemented as follows:
fn add_users_btree(offset: u64, increment: u64, count: u64) {
PAYLOAD.with_borrow(|payload| {
MAP.with_borrow_mut(|map| {
for_each_user(offset, increment, count, |id| {
map.insert(id, payload.clone());
});
})
})
}
fn add_users_sqlite(offset: u64, increment: u64, count: u64) {
PAYLOAD.with_borrow(|payload| {
with_connection(|mut conn| {
let tx = conn.transaction().unwrap();
let sql = String::from("insert into users (id, username) values (?, ?);");
{
let mut stmt = tx.prepare_cached(&sql).unwrap();
for_each_user(offset, increment, count, |id| {
stmt.execute((id, payload))
.expect("insert of a user failed!");
});
}
tx.commit().expect("COMMIT USER INSERTION FAILED!");
})
})
}
…
I have also tried benchmarking read operations. Both BTreeMap and SQLite were quite fast and took about 40K instructions for reading a single element. But it is more difficult to get an objective comparison here. SQLite reads data in pages, it you are lucky, the page is already in the heap, then reading is faster. If your multiple rows of data are scattered across the database pages, reading will work slower (but you can still use VACUUM to make it work faster). If you try to read an element after closing DB connection, the instruction count will bump to 2M instructions.
So, if you think I can improve the comparison, check out the repository and/or leave feedback. ![]()
