ic-stable-memory Rust library

@senior.joinu I tried it and ran into this bug.

[screenshot of the bug]

Opened an issue for it here

There’s a minimal reproduction in the issue. Let me know if there’s any other detail I can provide :slight_smile:

1 Like

Hey there!

Fixed. Check the issue for more details.

1 Like

Hi @senior.joinu, I ran into another issue with SPrincipal. I created a separate issue for it here with a minimal reproduction.

Let me know if there are any further details I can provide :slight_smile:

1 Like

Hello again!

A new version of the library with this issue fixed is now available.
Thanks!

1 Like

Thank you. Works fine as far as I’ve tested. :slight_smile:

1 Like

0.4.0-rc1

Hey there! Huge stability, performance & other updates here.
ic-stable-memory 0.4.0 release candidate #1 is now published.

The goal for this version was to improve the library enough that it would be reasonable to use it instead of the standard collections. Previous versions had a lot of performance problems (some collections were up to 1.5k times slower than the standard ones), but this time everything is much better.

Cons

First, let’s talk about stuff that some might find upsetting.

  1. Collections from 0.4.0-rc1 are not backwards compatible with older ones. This is because they now internally use a custom super-fast serialization engine (up to 5 times faster than Speedy), and because of that they store data differently. If you need a migration guide, please let me know (but I hope that most people simply don't use ic-stable-memory in production).

  2. ic-stable-memory is now a nightly-only library, because it strongly relies on a couple of unstable features, namely generic_const_exprs and thread_local. So now, in order to use the library, you have to install the Rust 1.66 nightly toolchain and append these lines to the beginning of your lib.rs:

#![feature(thread_local)]
#![feature(generic_const_exprs)]
  3. ic-stable-memory no longer takes care of horizontal scaling. I found that developers themselves can do a much better job of it for their exact use case, so the max_allocation_pages and max_grow_pages settings are removed, as is the on_low_stable_memory hook. See this thread for more info on how to enable horizontal scaling for your canister.

  4. Most collections (except SBTreeMap and SBTreeSet) are no longer "infinite"; they are now limited to 2**32 elements total. I'm planning to add an "infinite" SVec back in the near future.

Pros

Performance improvements

Vec

Before

"Classic vec push" 1000000 iterations: 463 ms
"Stable vec push" 1000000 iterations: 22606 ms (x49 slower)

"Classic vec pop" 1000000 iterations: 406 ms
"Stable vec pop" 1000000 iterations: 11338 ms (x28 slower)

"Classic vec search" 1000000 iterations: 127 ms
"Stable vec search" 1000000 iterations: 2926 ms (x23 slower)

After

"Classic vec push" 1000000 iterations: 46 ms
"Stable vec push" 1000000 iterations: 212 ms (x4.6 slower)

"Classic vec search" 1000000 iterations: 102 ms
"Stable vec search" 1000000 iterations: 151 ms (x1.4 slower)

"Classic vec pop" 1000000 iterations: 48 ms
"Stable vec pop" 1000000 iterations: 148 ms (x3 slower)

Hash map

Before

"Classic hash map insert" 100000 iterations: 224 ms
"Stable hash map insert" 100000 iterations: 7199 ms (x32 slower)

"Classic hash map remove" 100000 iterations: 123 ms
"Stable hash map remove" 100000 iterations: 3618 ms (x29 slower)

"Classic hash map search" 100000 iterations: 69 ms
"Stable hash map search" 100000 iterations: 2325 ms (x34 slower)

After

"Classic hash map insert" 100000 iterations: 96 ms
"Stable hash map insert" 100000 iterations: 387 ms (x4 slower)

"Classic hash map search" 100000 iterations: 47 ms
"Stable hash map search" 100000 iterations: 113 ms (x2.4 slower)

"Classic hash map remove" 100000 iterations: 60 ms
"Stable hash map remove" 100000 iterations: 99 ms (x1.6 slower)

Hash set

Before

"Classic hash set insert" 100000 iterations: 209 ms
"Stable hash set insert" 100000 iterations: 5977 ms (x28 slower)

"Classic hash set remove" 100000 iterations: 180 ms
"Stable hash set remove" 100000 iterations: 2724 ms (x15 slower)

"Classic hash set search" 100000 iterations: 125 ms
"Stable hash set search" 100000 iterations: 2007 ms (x16 slower)

After

"Classic hash set insert" 100000 iterations: 79 ms
"Stable hash set insert" 100000 iterations: 394 ms (x4.9 slower)

"Classic hash set search" 100000 iterations: 54 ms
"Stable hash set search" 100000 iterations: 97 ms (x1.8 slower)

"Classic hash set remove" 100000 iterations: 56 ms
"Stable hash set remove" 100000 iterations: 99 ms (x1.7 slower)

BTree map

Before

"Classic btree map insert" 10000 iterations: 31 ms
"Stable btree map insert" 10000 iterations: 8981 ms (x298 slower)

"Classic btree map remove" 10000 iterations: 17 ms
"Stable btree map remove" 10000 iterations: 19831 ms (x1166 slower)

"Classic btree map search" 10000 iterations: 15 ms
"Stable btree map search" 10000 iterations: 20710 ms (x1380 slower)

After

"Classic btree map insert" 100000 iterations: 267 ms
"Stable btree map insert" 100000 iterations: 17050 ms (x63 slower)

"Classic btree map search" 100000 iterations: 138 ms
"Stable btree map search" 100000 iterations: 566 ms (x4.1 slower)

"Classic btree map remove" 100000 iterations: 147 ms
"Stable btree map remove" 100000 iterations: 1349 ms (x9.1 slower)

BTree set

Before

"Classic btree set insert" 10000 iterations: 26 ms
"Stable btree set insert" 10000 iterations: 8920 ms (x343 slower)

"Classic btree set remove" 10000 iterations: 13 ms
"Stable btree set remove" 10000 iterations: 19601 ms (x1507 slower)

"Classic btree set search" 10000 iterations: 16 ms
"Stable btree set search" 10000 iterations: 20569 ms (x1285 slower)

After

"Classic btree set insert" 100000 iterations: 312 ms
"Stable btree set insert" 100000 iterations: 1771 ms (x5.6 slower)

"Classic btree set search" 100000 iterations: 170 ms
"Stable btree set search" 100000 iterations: 600 ms (x3.5 slower)

"Classic btree set remove" 100000 iterations: 134 ms
"Stable btree set remove" 100000 iterations: 1317 ms (x9.8 slower)

Binary heap

Before

"Classic binary heap push" 1000000 iterations: 995 ms
"Stable binary heap push" 1000000 iterations: 29578 ms (x29 slower)

"Classic binary heap pop" 1000000 iterations: 4453 ms
"Stable binary heap pop" 1000000 iterations: 27159 ms (x6 slower)

"Classic binary heap peek" 1000000 iterations: 133 ms
"Stable binary heap peek" 1000000 iterations: 3314 ms (x25 slower)

After

"Classic binary heap push" 1000000 iterations: 461 ms
"Stable binary heap push" 1000000 iterations: 11668 ms (x25 slower)

"Classic binary heap peek" 1000000 iterations: 62 ms
"Stable binary heap peek" 1000000 iterations: 144 ms (x2.3 slower)

"Classic binary heap pop" 1000000 iterations: 715 ms
"Stable binary heap pop" 1000000 iterations: 16524 ms (x23 slower)

As you can see, in some cases the performance gains are obvious, but for others (like SBinaryHeap) not much has changed. This is because some algorithms are sensitive to indirect data and some aren't. Let me explain what I mean.

Direct & indirect storage

Previously, all collections in ic-stable-memory were indirect: for each element they allocated a new portion of stable memory to store it, and internally they stored only a pointer to that stable memory. This is what allowed these collections to store any kind of dynamically sized data and to migrate that data to a new version (to add new fields) on the fly. But it was also a big problem from a performance perspective, since such a flow requires (de)allocating and (de)serializing each element.

This is why all stable collections are now optionally indirect. There is now a special trait, StableAllocated. This trait should only be implemented for fixed-size data (anything that could easily implement Copy). If your data implements this trait, you can pass it straight to a stable collection like so:

let vec = SVec::<MyData>::new();

and the collection will work in high-performance direct mode, storing the data inline, right inside the collection. In this case your data is serialized with the custom serialization engine I mentioned before. This engine serializes everything without leaving the stack, straight into a fixed-size byte array, which makes it a lot faster than any other serialization engine, since all of them serialize into a heap-allocated buffer.
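For illustration, here is a rough sketch of direct mode (Point is made up, the StableAllocated impl is elided since there is no derive macro for it yet, and the exact method signatures are my assumption):

// A made-up fixed-size struct; anything that could be Copy qualifies.
#[derive(Copy, Clone)]
struct Point {
  x: u64,
  y: u64,
}

// impl StableAllocated for Point { ... } // hand-written, fixed-size layout

let mut points = SVec::<Point>::new();
points.push(Point { x: 1, y: 2 }); // stored inline, no extra stable-memory allocation
let p = points.get_copy(0).unwrap(); // read back by copy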

But if your data is dynamically sized (or you plan to update it to a new version one day, adding new fields), then you have to store it indirectly. For that there are a couple of new smart pointers, SBox and SBoxMut:

let vec = SVec::<SBox<MyData>>::new();

In that case the collection will, as before, allocate a new portion of stable memory somewhere else and internally store only a pointer to that data. You can wrap any data that implements speedy::Readable and speedy::Writable in an SBox or SBoxMut; speedy is responsible for the serialization in this case.
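For example (MyData and its fields are made up, and I'm assuming SBox::new as the constructor here):

use speedy::{Readable, Writable};

#[derive(Readable, Writable)]
struct MyData {
  name: String, // dynamically sized, so it has to go through an SBox
  tags: Vec<u64>,
}

let mut vec = SVec::<SBox<MyData>>::new();
// The box allocates stable memory for the serialized value;
// the vec itself stores only the pointer inline.
vec.push(SBox::new(MyData { name: "alice".into(), tags: vec![1, 2, 3] }));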

This indirection-polymorphism feature took the longest to implement, which is why the performance gains might not yet look like a big deal to some of you. Internally, the collections have changed drastically. This will allow for better optimizations in the future, so don't get upset: from now on the library will only get faster and faster.

Stability

Test coverage has increased to 95%, and in reality it is a couple of percent higher, since cargo tarpaulin often reports some trivial lines as uncovered when they are definitely covered (try it out yourself). I found and fixed a lot of bugs during this work, and I hope the library is now much safer to use than it was before.

But it is still young software. Please don't use it in production yet!

Memory fragmentation

The allocator now also works a lot faster and can reallocate memory in place, which makes it much easier to resist fragmentation. Not much else to say here: I ran some tests internally, and it appears that fragmentation on a random dataset is not a big issue (around 5%).

QoL improvements

All collections now also have an .iter() method, which returns an iterator over the collection. For most of them this is a very efficient, zero-cost abstraction, but not for the BTree-based collections: their iterators are pretty heavy and should only be used on small collections.

SVec now has a lot of additional methods, such as insert(), remove(), from::<Vec<_>>() and binary_search_by().
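A quick sketch of these (the method names are from this post; the exact signatures are my assumption, modeled on std):

let mut vec = SVec::<u64>::new();
for i in 0..10u64 {
  vec.push(i);
}

// Efficient for SVec; heavy for BTree-based collections.
let sum: u64 = vec.iter().sum();

// Assuming std-like semantics for these.
let idx = vec.binary_search_by(|x| x.cmp(&7)); // vec is sorted at this point
vec.insert(0, 42);
let removed = vec.remove(0);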

There is also an additional macro for "un-declaring" a stable variable: s_remove!(Variable). It removes the variable from the registry and returns its contents. If the data stored there is droppable (i.e. it is a stable collection), you then have to manually release the memory by calling its .stable_drop() method:

let vec = s_remove!(MyStableVector);
unsafe { vec.stable_drop() }; 

Refactors

Once you try it out, you'll notice that some methods have changed their names (.drop() is now .stable_drop(), .get_cloned() is now .get_copy(), and so on). Some methods that were previously safe (.from_ptr(), .drop()) are now marked unsafe. This is because I found the new names and signatures more descriptive and decided to change them while it is still possible to do so without hurting too many people.

What’s next?

I decided to add the -rc1 suffix to focus your attention once again on the fact that this library is not yet ready for production. My goal is to make version 0.4 as good as possible before actually releasing it: to make it a suitable replacement for the standard collections.

Release candidate #2 will be focused on an augmented binary Merkle tree (probably AVL-based) that will allow users of ic-stable-memory to easily certify the data they store.

I also plan to add an infinite SVec back (probably named SRope), and to figure out a way to improve the BTree-based collections and the binary heap, since their performance still seems poor to me. That amount of work looks like another release-candidate milestone.

After that there may be more release candidates, with the main goal of stabilizing the library as much as possible before telling everyone: "Yea, you can use it in your canister".


Thanks for reading this far. Go try it out and tell me what you think!

By the way, the code for 0.4.0-rc1 is in develop, not in master.

10 Likes

Thank you so much for the amazing updates.

A migration path for canisters using ic-stable-memory v0.2.* would be amazing :slight_smile:

1 Like

I am not completely sure how this works. An example would be really helpful.

If I have a struct Profile with name and then want to add another field age, how would already-existing Profile structs co-exist with the new two-field Profile in the same stable collection? If I get the value of an older struct, what would be returned in the age field?

I imagine any data structure containing a String type can’t utilize this, right?

In that case, looking at the scope, shouldn’t the dynamically sized data option be the default?

Thanks! Sure, you can expect such a guide in a couple of days.

You just have to declare your structures a little differently. Imagine that initially you have something like this:

#[derive(Readable, Writable)]
enum User {
  V1(UserV1)
}

#[derive(Readable, Writable)]
struct UserV1 {
  name: String,
  age: u8,
}

And you store those structs in some kind of stable collection:

type UsersMap = SHashMap<SPrincipal, SBox<User>>;
...
let mut users = s!(UsersMap);
let user_box: SBox<User> = users.get_copy(user_principal).unwrap();

// SBox and SBoxMut implement Deref, so you can match on the dereferenced value
match &*user_box {
  User::V1(user) => {
    // do stuff
  }
}

Since you’re storing users indirectly, you can simply declare a new version like this:

#[derive(Readable, Writable)]
enum User {
  V1(UserV1),
  V2(UserV2),
}

...

#[derive(Readable, Writable)]
struct UserV2 {
  name: String,
  age: u8,
  phone: String,
}

And use it in your code like this:

match &*user_box {
  User::V1(user) => {
    // do stuff
  },
  User::V2(user) => {
    // do stuff with phone number also
  }
}

This is both safe and pretty easy to do.

It depends. First of all, if your string is limited in size (for example, it can't be bigger than 128 bytes), then you can simply convert it to a [u8; 128] and store it that way. StableAllocated is implemented by default for all primitive numeric types and for [u8; N] arrays with N equal to a power of two up to 4096 bytes (0, 1, 2, 4, 8, 16, 32, 64 ... 4096). Once generic_const_exprs gets more stable, it will be possible to implement it generically for byte arrays of any size.
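For example, a pair of plain helpers for that conversion (assuming the string itself never contains zero bytes):

// Pack a size-bounded string into a fixed array, zero-padding the tail.
fn str_to_fixed(s: &str) -> [u8; 128] {
  assert!(s.len() <= 128, "string exceeds fixed capacity");
  let mut buf = [0u8; 128];
  buf[..s.len()].copy_from_slice(s.as_bytes());
  buf
}

// Read it back by trimming the zero padding.
fn fixed_to_str(buf: &[u8; 128]) -> &str {
  let end = buf.iter().position(|&b| b == 0).unwrap_or(buf.len());
  std::str::from_utf8(&buf[..end]).unwrap()
}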

Second, you can still split your struct into a couple of structs: detach the fixed-size part from the dynamically sized one. Imagine you have these structs:

struct TransactionHistoryEntryBase {
  from: SPrincipal, // SPrincipal also implements StableAllocated
  to: SPrincipal,
  amount: SNat, // SNat also implements StableAllocated
}

// there is no derive macro for this yet 
impl StableAllocated for TransactionHistoryEntryBase {
  ...
}

#[derive(Readable, Writable)]
struct TransactionHistoryEntryDetails {
  memo: String,
  refs: Vec<u64>,
}

Together they represent a single business entity, TransactionHistoryEntry, but you can store the parts in different locations:

type THEBaseLedger = SVec<TransactionHistoryEntryBase>;
type THEDetailsLedger = SVec<SBox<TransactionHistoryEntryDetails>>;

and you can access them separately. You'll still be able to update this data to a new version (by modifying the dynamically sized part as I showed earlier), but at least some part of your data will be optimized.
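For example, if you keep the two ledgers index-aligned (a convention you'd maintain yourself; the library doesn't enforce it), a read could look like this:

let bases = s!(THEBaseLedger);
let details = s!(THEDetailsLedger);

let i = 42; // some entry index
// Entry i in one ledger corresponds to entry i in the other (by convention).
let base = bases.get_copy(i).unwrap(); // cheap: fixed-size, stored inline
let details_box = details.get_copy(i).unwrap(); // extra hop: pointer + speedy deserialization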

And the third option is, as you said, to store everything indirectly using SBox-es.

About making the indirect flow the default: yes, maybe you're right, but I'm not sure yet. What I didn't like about 0.2.0 is that this indirection of storage was hidden from the user. I find that very misleading; developers should know what their code is doing and how it works internally, preferably without having to read through the documentation. This is what I've always liked about Rust itself: by default it always does things in the most optimal way.

In Rust there is also a notion of indirection. By default, when you allocate some data, if that data is sized it is allocated on the stack (directly). Only if you explicitly ask to allocate it on the heap (to switch to indirect mode) by putting it into a Box will it do so. My motivation was to replicate this behavior and make the library more idiomatic this way.

4 Likes

Thank you for the explanations.

Makes perfect sense :slight_smile:

1 Like

@senior.joinu this looks amazing!

I’m keen to move all of the data in OpenStorage (OpenChat’s storage service) into stable memory.

One option is to use the StableStructures built by Dfinity, but that doesn't work nicely when your values vary massively in size. I'd need to create buckets of increasing size, which I want to avoid due to the added complexity and wasted memory.

Our keys are all 256-bit hashes and our values are Vecs of bytes of varying sizes.
It looks as if your SHashMap fits our use case perfectly.

Would you recommend using this in production yet?

1 Like

Hey, great to hear that!

Yea, a hashmap would work, but if this collection can grow without bound, a btreemap would be better in the long run.

Unfortunately, not yet. It needs a bit more time to stabilize before I could recommend it for production.

1 Like

I’m guessing you recommend BTreeMap because HashMap has to rehash all of the values every so often as items are inserted? Or is there another reason?

Yes, and it is also limited to a maximum of 2/3 * 2**32 elements. So if you plan to use all 32 GBs of stable memory, this collection won't help you.

Hashmaps in general are good when you know how many elements they will hold. If this number is unbounded, then a BTreeMap is a better alternative.

2 Likes

I think forcing people to use speedy isn't ideal.
You could instead expose a trait with to_bytes (&T -> &[u8]) and from_bytes (&[u8] -> T).
People can then implement the trait however they want.
For example, in my use case both my keys and values are already bytes.
In the current version there would be an unnecessary copy of the bytes here.
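Something like this (a minimal sketch of the trait I mean; names and signatures are approximate):

trait AsBytes {
  // Borrow the raw bytes without copying.
  fn to_bytes(&self) -> &[u8];
  // Reconstruct the value from raw bytes.
  fn from_bytes(bytes: &[u8]) -> Self;
}

// For keys/values that are already bytes, to_bytes is a free view.
impl AsBytes for Vec<u8> {
  fn to_bytes(&self) -> &[u8] {
    self
  }
  fn from_bytes(bytes: &[u8]) -> Self {
    bytes.to_vec()
  }
}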

1 Like

Yea, I agree with you. Going to iterate on that one day.

2 Likes

Oh and by the way, once the OpenChat SNS is launched, there will be a bucket of CHAT tokens allocated for dev bounties.

So if you get this lib ready for production so that we can use it, we'll definitely send you a nice chunk of CHAT! :grinning:

6 Likes

Super excited by this!
I can't wait to use it in production here at Distrikt!

2 Likes

Hi there, any updates? Or can we use ic-stable-memory in production now?