ic-stable-memory Rust library

Yes, and it is also limited to a maximum of 2/3 * 2**32 elements. So if you plan to use all 32 GB of stable memory, this collection won’t help you.

Hash maps in general are good when you know how many elements they will hold. If this number is unbounded, then a BTreeMap is a better alternative.


I think forcing people to use speedy isn’t ideal.
You could instead expose a trait with to_bytes (&T -> &[u8]) and from_bytes (&[u8] -> T)
People can then implement the trait however they want.
For example in my use case both my keys and values are already bytes.
In the current version there would be an unnecessary copy of the bytes here.
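Such a trait could look roughly like this. This is only a sketch of the suggestion; the name `AsBytes` and the method shapes are illustrative, not an actual API. Returning a `Cow` lets types that are already bytes borrow instead of copy:

```rust
use std::borrow::Cow;

// Hypothetical trait sketch for the suggestion above (names are
// illustrative, not the library's API).
pub trait AsBytes: Sized {
    fn to_bytes(&self) -> Cow<'_, [u8]>;
    fn from_bytes(bytes: &[u8]) -> Self;
}

// When keys/values are already bytes, `to_bytes` borrows and avoids a copy.
impl AsBytes for Vec<u8> {
    fn to_bytes(&self) -> Cow<'_, [u8]> {
        Cow::Borrowed(self)
    }

    fn from_bytes(bytes: &[u8]) -> Self {
        bytes.to_vec()
    }
}

fn main() {
    let key: Vec<u8> = vec![1, 2, 3];
    // The write path borrows the existing bytes instead of copying them.
    assert!(matches!(key.to_bytes(), Cow::Borrowed(_)));
    assert_eq!(Vec::<u8>::from_bytes(&[4, 5]), vec![4, 5]);
}
```

Users whose data is already in byte form (as in the use case above) would then pay no extra serialization cost.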


Yea, I agree with you. Gonna revisit that one day.


Oh and by the way, once the OpenChat SNS is launched, there will be a bucket of CHAT tokens allocated for dev bounties.

So if you get this lib ready for production so that we can use it we’ll definitely send you a nice chunk of CHAT! :grinning:


Super excited by this!
I can’t wait to use it in production here at Distrikt!


Hi there, any updates? Can we use ic-stable-memory in production now?

I’m finishing the current release at the moment.
Planning to release in a week or two.

A lot of exciting news, please be patient :slight_smile:


Oh man, I did not want to pressure you, but I’m super excited by this!!


I’m glad to hear that, when the new version comes out, I can use it in my own project :grinning:


Hi there, are there any updates on your project? :grinning:

Hello there!
Yes, I think I would need another week to release.

I found a way to finally make the library completely sound, so I had to spend some extra time on the implementation.


Please take the time you need, this is a very important building block so it needs to have the best quality that it can get.
And quality takes time.


Hi, I’m here again, are there any updates this week? :grinning:


Hey there!
Sorry for taking so long and not standing by my own words.

Once I implemented this new complete-soundness feature, it revealed a lot of problems I was previously unaware of. These problems are now successfully solved.

But there are still a couple of things I want to do, before releasing:

  1. The new certified data structure has already received a lot of testing, but I want to double-check it by running it on mainnet.
  2. I want to update the documentation to reflect the latest changes and to provide a more in-depth view of the library.

Hope to finish with this soon! Stay tuned.

And thank you all for your support. Knowing that there are people who really want to use something you’ve made is an unbelievable motivation!


It will never be perfect, but now it’s good :slight_smile:

:exclamation: ic-stable-memory library has been promoted to v0.4.1 :exclamation:

This is the first release that I consider safe to use. A lot of work was done to ensure it works properly and the code is sound. I’ve spent a lot of time trying to eliminate any possibility of memory leaks or unexpected panics. In fact, I’m so sure about it that I’m going to start a new project: a dapp that will use this library to store everything in stable memory.

The library has changed a lot, so let me walk you through the update.

1. Enforced Rust’s ownership rules

This is a set of changes that I came up with very recently; nobody was aware I was working on it, but it is so cool that I simply can’t start with anything else.

If you remember, previously each data structure in ic-stable-memory had a special stable_drop() method that was used to manually release stable memory when it was no longer needed. In other words, stable collections worked as if they were written in C/C++ and not in Rust. This was a serious source of errors for both library users and me. In fact, the only team that was using ic-stable-memory previously lost all of their data because of a memory leak.

This is no longer an issue though. I’ve managed to “connect” stable-allocated values with Rust’s borrower, so now they follow the same ownership rules as regular values. Let’s see an example:

    let mut vec = SVec::<u64>::new();

    for i in 0..100 {
        vec.push(i).expect("out of memory"); // <- more on this below
    } // <- `vec` gets stable-dropped, releasing all the memory automatically

Another example:

    thread_local! {
        static STATE: RefCell<Option<SVec<SBox<String>>>> = RefCell::default();
    }

    fn push_string(str: String) {
        STATE.with(|s| {
            let boxed_str = SBox::new(str).expect("out of memory");

            s.borrow_mut()
                .as_mut()
                .unwrap()
                .push(boxed_str)          // <- `boxed_str` will NOT get stable-dropped,
                .expect("out of memory"); // because it is now owned by STATE
        });
    }

I know this may not sound as impressive as I’m trying to convince you it is, but this is actually a game-changer in my opinion.

This new set of rules is enforced at runtime. Yes, you lose a tiny bit of performance, but in exchange you can’t leak stable memory anymore. That’s it. The library now provides a guarantee: as long as you only use functions that are not marked unsafe, you can’t leak memory.

Also, have you noticed in the example above how you could easily swap SVec for a plain Vec and it would still be a valid program? This is what following the rules also gives you for free - the library now looks and feels like something from Rust’s ecosystem. It is a continuation of the language, not an appendix.
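The guarantee described above rests on plain RAII: deallocation is tied to Rust’s `Drop`, so memory is released when a value leaves scope. Here is a self-contained toy illustration of that mechanism (no ic-stable-memory involved; `FakeStableVec` and the `ALLOCATED` counter are made up for the demonstration):

```rust
use std::cell::Cell;

thread_local! {
    // Counts outstanding "stable allocations" in this toy model.
    static ALLOCATED: Cell<usize> = Cell::new(0);
}

// A made-up stand-in for a stable collection, NOT the library's SVec.
struct FakeStableVec;

impl FakeStableVec {
    fn new() -> Self {
        ALLOCATED.with(|a| a.set(a.get() + 1)); // "allocate" a block
        FakeStableVec
    }
}

impl Drop for FakeStableVec {
    fn drop(&mut self) {
        ALLOCATED.with(|a| a.set(a.get() - 1)); // released automatically
    }
}

fn main() {
    {
        let _v = FakeStableVec::new();
        assert_eq!(ALLOCATED.with(|a| a.get()), 1);
    } // <- `_v` leaves scope here; Drop runs without any manual call
    assert_eq!(ALLOCATED.with(|a| a.get()), 0);
}
```

The same scope-exit rule is what replaces the old manual stable_drop() calls.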

To contribute even more to this vibe, I also introduced a notion of stable references. Each stable collection now returns a smart pointer to the data it owns, instead of a copy of that data.

See this example:

    let mut map_of_stuff = SBTreeMap::<u64, u64>::new();
    map_of_stuff.insert(10, 20).expect("out of memory");

    let mut value: SRefMut<u64> = map_of_stuff.get_mut(&10).unwrap();

    *value = 15; // <- this is valid code

These stable references are lifetime-aware, so it is no longer possible to mutate the same data structure twice concurrently - the compiler simply won’t let that happen.

Also, I removed stable variables from the library. You no longer need the ugly s! {} syntax; you can simply use the same techniques you use with std collections, like thread_local!.

Together these changes make it feel like you’re just using common data structures in common Rust, despite the fact that a whole custom memory allocator works under the hood and the data actually lives neither on the stack nor on the heap, but in some other place that only exposes read_bytes and write_bytes methods.

2. New stable memory allocator

By the way, about the allocator. I rewrote it completely from scratch to make it more flexible and efficient.

The old one was O(n) for allocations and O(1) for deallocations, where n is the number of free blocks in the free-list. The new one is O(log n) for both allocations and deallocations. This should help with the performance of SBox-ed values. Also, the new allocator can allocate stable memory blocks up to u64::MAX bytes long, which is important for some optimizations.
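For intuition about the O(log n) claim, here is a minimal self-contained sketch of a size-indexed free-list built on an ordered map, where both freeing and best-fit allocation are ordered-map operations. This illustrates the complexity argument only; it is not the library’s actual allocator:

```rust
use std::collections::BTreeMap;

// Toy free-list: free blocks indexed by size in an ordered map.
struct FreeList {
    // size -> offsets of free blocks of that size
    by_size: BTreeMap<u64, Vec<u64>>,
}

impl FreeList {
    fn new() -> Self {
        FreeList { by_size: BTreeMap::new() }
    }

    // O(log n): insert the freed block into the ordered index.
    fn free(&mut self, offset: u64, size: u64) {
        self.by_size.entry(size).or_default().push(offset);
    }

    // O(log n): best fit = smallest block whose size >= `size`,
    // found with a single range query.
    fn alloc(&mut self, size: u64) -> Option<(u64, u64)> {
        let (&block_size, _) = self.by_size.range(size..).next()?;
        let offsets = self.by_size.get_mut(&block_size).unwrap();
        let offset = offsets.pop().unwrap();
        if offsets.is_empty() {
            self.by_size.remove(&block_size);
        }
        Some((offset, block_size))
    }
}

fn main() {
    let mut fl = FreeList::new();
    fl.free(0, 64);
    fl.free(64, 16);
    // Best fit for a 10-byte request is the 16-byte block at offset 64.
    assert_eq!(fl.alloc(10), Some((64, 16)));
    // The next request falls back to the 64-byte block at offset 0.
    assert_eq!(fl.alloc(10), Some((0, 64)));
    assert_eq!(fl.alloc(128), None);
}
```

A linear free-list scan would make the same `alloc` O(n); the ordered index is what buys the logarithmic bound.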

3. Ready for horizontal scaling

The two previous changes allowed me to re-introduce programmatic handling of OutOfMemory errors. Now any method of the API that can potentially allocate stable memory returns a Result, where the Err variant means that your subnet is out of stable memory and you have to do something about it.

This feature was in the library from day 0 (and it is actually one of the main reasons why I wanted to build it), but it was removed when it became clear that it was impossible to use in complex projects. This is because, when you catch this error and don’t panic, you have to manually undo the changes you have already made to your canister. This was very hard, mainly because of the stable_drop() call you had to make on each thing you had created so far. But now that values stable-drop themselves automatically, this is no longer an issue.

This means that you can actually program your canister to scale horizontally automatically when it occupies all the available stable memory!

    async fn push_string(str: String) {
        let boxed_str = SBox::new(str).expect("out of memory");

        let res = STATE.with(|s| s.borrow_mut().as_mut().unwrap().push(boxed_str));

        if res.is_err() {
            scale_horizontally().await; // <- like this; `boxed_str` will stable-drop automatically
        }
    }

4. Stable certified map

ic-stable-memory now also provides a certified stable collection, SCertifiedBTreeMap. This is a Merkle tree built on top of a B+ tree. What’s more important, its proofs are compatible with the proofs you get from Dfinity’s RBTree.

I’ve prepared a demo project that is a fork of the certified assets canister, but using only data structures from ic-stable-memory. You can find it here and play with it yourself. I tried it on mainnet and was able to render the frontend with valid certificates and everything.

It supports the following proofs:

  • set (in this case - map) membership proof
  • absence proof
  • range proof

Also, it is written in such a way that map mutation and Merkle tree recalculation are separated. This allows performing multiple update operations in a batch while recomputing the underlying Merkle tree only once, which can greatly increase performance in some cases.
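To see why batching pays off, here is a self-contained mock where recomputing the root hash costs O(n) and is deferred to an explicit commit. The names (`MockCertifiedMap`, `commit`) are made up for the illustration, not the library’s API:

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy "certified map": mutations are cheap; certification is recomputed
// only on commit, once per batch.
struct MockCertifiedMap {
    entries: BTreeMap<u64, u64>,
    root_hash: u64,
    recomputations: usize,
}

impl MockCertifiedMap {
    fn new() -> Self {
        MockCertifiedMap { entries: BTreeMap::new(), root_hash: 0, recomputations: 0 }
    }

    // Mutate without touching the certification.
    fn insert(&mut self, k: u64, v: u64) {
        self.entries.insert(k, v);
    }

    // Recompute the root once for the whole batch (O(n) here).
    fn commit(&mut self) {
        let mut h = DefaultHasher::new();
        for (k, v) in &self.entries {
            (k, v).hash(&mut h);
        }
        self.root_hash = h.finish();
        self.recomputations += 1;
    }
}

fn main() {
    let mut m = MockCertifiedMap::new();
    for batch in 0..10u64 {
        for i in 0..10u64 {
            m.insert(batch * 10 + i, i);
        }
        m.commit(); // one recomputation per 10 inserts, not per insert
    }
    assert_eq!(m.entries.len(), 100);
    assert_eq!(m.recomputations, 10); // 10 instead of 100
}
```

Recomputing on every insert would have cost 100 recomputations here; the batch-of-10 numbers below show the same effect in the real collection.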

By the way, here are benchmark metrics from a real canister using the performance counter API, compared to RBTree:

Insert 5_000 entries

rbtree -> `5627092211`
scertifiedbtreemap -> `9108725043` - x1.6 slower
scertifiedbtreemap (in batches of 10) -> `1354608056` - x4.1 faster

Witness 5_000 entries

rbtree -> `3273570622`
scertifiedbtreemap -> `3541619761` - x1.08 slower

Remove 5_000 entries

rbtree -> `9359364040`
scertifiedbtreemap -> `6693095737` - x1.4 faster
scertifiedbtreemap (in batches of 10) -> `731156025` - x12.8 faster

As we can see, batching is totally killing it.

5. Stable data structures tweaks

A lot of work was done to improve the stability and performance of the existing data structures. You can find actual performance benchmarks here. Some short outlines:

  • Completely reworked SBTreeMap. It is now only x2-x3 slower than the std one.
  • SHashMap now uses zwohash and performs deletions without tombstones, which keeps its performance in shape instead of degrading over time. Also, it is now dangerously close in performance to the std HashMap.
  • Added a new SLog collection, an unbounded analog of SVec that is well optimized for the most recent entries. It should be very handy for storing logs and histories.
  • Removed SBinaryHeap, since SBTreeMap is now simply way faster than my previous binary heap implementation was, and I don’t have the capacity at the moment to keep it on par with the rest of the collections.
  • All keyed collections (maps and sets) now accept Borrow-ed keys for searching. For example, if you have a map like SHashMap<SBox<String>, u64>, you can search it simply by &String and not by &SBox<String>.
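The std collections use the same Borrow mechanism, so the lookup pattern described in the last bullet can be demonstrated without the library itself:

```rust
use std::collections::HashMap;

fn main() {
    // A map keyed by an owning type (String)...
    let mut map: HashMap<String, u64> = HashMap::new();
    map.insert("alice".to_string(), 1);

    // ...can be searched with a borrowed form (&str), without constructing
    // a full String key, because String: Borrow<str>.
    assert_eq!(map.get("alice"), Some(&1));
    assert_eq!(map.get("bob"), None);
}
```

For the stable collections this means a lookup doesn’t have to allocate a wrapper like SBox just to search.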

6. Goodbye Speedy

Yes, Speedy is no longer used, and you’re no longer forced to use any particular serialization engine. In fact, the library now uses two types of serialization. You can find out more in this article.

7. Documentation

Yes, the library is finally covered with documentation. There is complete API documentation on docs.rs, and there are also a bunch of tutorials on important topics like upgradability, horizontal scaling, and performance. You can find all of them in the GitHub repo.

8. Tests

I have never in my life written so many tests. There are currently around 120 tests, many of which are randomized. Some of them are essentially fuzz tests, randomizing even which kind of action they perform on the data structure next. The whole test suite runs for about 20-25 minutes on my machine.
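The randomized style described above can be sketched in a self-contained way: drive a subject with random actions and compare it against a std model after every step. Here both sides are std maps purely to show the shape of such a test (a real run would put a stable collection in the subject’s place), and the tiny xorshift PRNG is a dependency-free stand-in for a proper random source:

```rust
use std::collections::BTreeMap;

// Minimal deterministic xorshift PRNG, just for the illustration.
struct Rng(u64);
impl Rng {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

fn main() {
    let mut rng = Rng(0xdead_beef);
    let mut model: BTreeMap<u64, u64> = BTreeMap::new();
    let mut subject: BTreeMap<u64, u64> = BTreeMap::new(); // stand-in for a stable map

    for _ in 0..10_000 {
        let k = rng.next() % 64;
        // The action itself is randomized, as in the fuzz-like tests above.
        match rng.next() % 3 {
            0 => { model.insert(k, k); subject.insert(k, k); }   // random insert
            1 => { model.remove(&k); subject.remove(&k); }       // random remove
            _ => assert_eq!(model.get(&k), subject.get(&k)),     // random lookup
        }
    }
    // After any interleaving of actions, the subject must match the model.
    assert_eq!(model, subject);
}
```

The value of this style is that it explores action interleavings no hand-written test would think of.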

In other words, I did my best, guys. I really want it to be as safe to use as possible. But I also want to be clear that I’m the only one who has tested it, and I’m just a human, so bugs may appear anyway. Please let me know if you find one.

What’s next

As I said earlier, I consider this library a finished product that can be used as-is. That doesn’t mean I won’t add anything new to it or fix bugs - I will. But from now on my work on this library will be purely reactive: if you need something, let me know and together we’ll make it happen. If you have questions or want me to create a tutorial on how to handle a particular situation, let me know and we’ll figure it out together.

In other words, in order to improve this library, I now really need your feedback, guys.

Hope you’re having a great day!
Reach out to me here, in this forum thread, or on GitHub issues.

@domwoe @dymayday @saikatdas0790 @cymqqqq @hpeebles


Hi, I’m very excited to hear the amazing news!!
And I’m sorry to bother you. :grinning:
I haven’t read your source code completely yet, but I saw this demo code:
    fn init() {
        STATE.with(|s| {
            *s.borrow_mut() = Some(SVec::new());
        });
    }

    fn pre_upgrade() {
        let state: State = STATE.with(|s| s.borrow_mut().take().unwrap());
        let boxed_state = SBox::new(state).expect("Out of memory");

        store_custom_data(0, boxed_state);

        stable_memory_pre_upgrade().expect("Out of memory");
    }

    fn post_upgrade() {
        stable_memory_post_upgrade();

        let state = retrieve_custom_data::<State>(0).unwrap().into_inner();
        STATE.with(|s| {
            *s.borrow_mut() = Some(state);
        });
    }
I have a question about pre_upgrade and post_upgrade here: do you mean it supports automatic memory recycling now? Like Motoko, which recycles memory automatically, with no need to manually recycle it after upgrading the canister?


Hey there!

I’m not quite sure what you mean by that, but let me try to explain.

These functions, store_custom_data and retrieve_custom_data, change nothing - the data was in stable memory all along. What you do with these functions is store pointers to this data in a known place in stable memory, so you can find the data after the upgrade is done.

And yes, with ic-stable-memory you no longer need to manually recycle any memory. Everything gets recycled automatically, both heap and stable memory, but only when the value leaves its scope.

The idea is that, to an observer, any code written with ic-stable-memory now looks and feels like code using regular data structures that store data on the heap. But under the hood, all the data is stored in stable memory.

For you there is no difference now whether you use Vec or SVec, HashMap or SHashMap - the API is the same, and the ownership rules are the same.

In this particular example that you’ve mentioned, nothing actually leaves the scope. In the pre_upgrade function you take your state (the part of it that lives on the stack, not in stable memory) from a static variable and move it into the custom data persistent storage, which now owns this data. Then you call stable_memory_pre_upgrade, which also persists the allocator itself in a known location. Nothing gets released.
Then in post_upgrade you restore the allocator with stable_memory_post_upgrade (under the hood the allocator will put itself back into the same thread_local! static variable), then you retrieve your state from the custom data persistent storage and put it back into your own thread_local! static variable. Nothing gets released, nothing leaves the scope. The data that was in your state never leaves stable memory and doesn’t move inside it. Only pointers move back and forth.

Hope I made it clear.


Thx for your answer :grinning:
I am ready to dive into your code and may have some questions for you.
And I plan to develop a dapp based on your project in the coming weeks.


Also, everyone, please note that the old tutorials (from papy.rs) are no longer up to date and shouldn’t be relied on.

I tried to remove the links from my previous posts, but the forum engine doesn’t let me do it.

I think this is fantastic