Stable Structures: Removing the Bounded Size Requirement

Hey everyone,

Stable structures is a Rust library built by DFINITY that allows canister developers to store tens of GiBs of data without the need for upgrade hooks.

We’ve received feedback from users that having to put an upper bound on the size of whatever can be stored in a StableBTreeMap makes the library difficult to use, and in many cases is a blocker for adoption altogether. This limitation is also shared by other stable structures, like StableVec.

Currently, to store a type in a StableBTreeMap, the developer needs to implement both the Storable and BoundedStorable traits:

pub trait Storable {
    fn to_bytes(&self) -> Cow<[u8]>;
    fn from_bytes(bytes: Cow<[u8]>) -> Self;
}

pub trait BoundedStorable: Storable {
    const MAX_SIZE: u32;
    const IS_FIXED_SIZE: bool;
}

Implementing the BoundedStorable trait requires specifying a MAX_SIZE - an upper bound on the number of bytes this type can use when serialized. This was initially introduced because it simplifies memory management.

To improve the usability of this library, we’re now working on removing the requirement of having types be bounded. We’ll still allow developers to specify bounds, as that can be useful in making memory optimizations internally, but it would be optional.

There are two solutions we have in mind:

Solution #1: Introduce “Unbounded” types

  • The BoundedStorable and Storable traits are kept as-is.
  • Rust doesn’t support specializations based on generics, so to support unbounded types, we’d need to introduce new stable structures: UnboundedStableBTreeMap, UnboundedStableVec, etc.
  • These unbounded types, because of how Rust generics work, will not have access to the bounds of the types if they exist, and can therefore not make optimizations based on that information. For example, if the structure can know that the type is a u8 (a small, fixed-size type), it can store it in a way that’s more efficient than an unbounded type, but it won’t be able to make that optimization with this approach as it can’t have access to this information.

Solution #2: Merge the BoundedStorable trait into the Storable trait

  • This solution works around Rust’s limitations with generic specializations and is implemented in this PR.
  • The BoundedStorable trait is removed, and the Storable trait is extended to include a BOUND field.
// ----------- Before -------------
pub trait Storable {
    fn to_bytes(&self) -> Cow<[u8]>;
    fn from_bytes(bytes: Cow<[u8]>) -> Self;
}

pub trait BoundedStorable: Storable {
    const MAX_SIZE: u32;
    const IS_FIXED_SIZE: bool;
}

// ----------- After -------------
pub trait Storable {
    fn to_bytes(&self) -> Cow<[u8]>;
    fn from_bytes(bytes: Cow<[u8]>) -> Self;

    /// The size bounds the type.
    const BOUND: Bound;
}

/// States whether the type's size is bounded or unbounded.
pub enum Bound {
    /// The type has no size bounds.
    Unbounded,

    /// The type has size bounds.
    Bounded {
        /// The maximum size, in bytes, of the type when serialized.
        max_size: u32,

        /// True if all the values of this type have fixed-width encoding.
        /// Some data structures, such as stable vector, can take
        /// advantage of fixed size to avoid storing an explicit entry
        /// size.
        ///
        /// Examples: little-/big-endian encoding of u16/u32/u64, tuples
        /// and arrays of fixed-size types.
        is_fixed_size: bool,
    },
}
  • Stable structures such as StableBTreeMap can then be modified to support both bounded and unbounded types, and performance optimizations can be made in case types are bounded.
  • The downside of this approach is that checking whether or not a bound exists now happens at run-time. Example:
    • If we have a stable structure that only supports bounded types, and you try to store an unbounded type, the code will compile, and only at run-time would you get an error that the type you’re trying to store is unbounded.
    • With solution #1, such errors would be detected at compile-time, since you’d need to implement BoundedStorable for that type.

I’d like to get your feedback on which approach you’d prefer and/or if you have other solutions in mind.

17 Likes

Hey, Islam!

In ic-stable-memory we have a SBox smart-pointer. This pointer is fixed in size and it points to a location on stable memory where some data lies.

You can pass any data, that can be serialized, into an SBox like this:

let s = String::from("Hello, World!");

// boxed_s is now allocated on stable memory
let mut boxed_s = SBox::new(s).unwrap(); 

// when a value of SBox is mutated, it may reallocate the data, so it still fits
boxed_s
  .with(|it| {
    it.push("a long string to force it to reallocate");
  })
  .unwrap();

This smart-pointer makes it possible to store unbound data inside any stable collection (all of which are fixed size by default), by introducing an extra layer of indirection.

// stores a mapping of stable mem allocated strings to other stable mem allocated strings
let example = SHashMap::<SBox<String>, SBox<String>>::new();

I believe, you could also do something like this in stable-structures. Seems like this solution won’t require a lot of change. And it seems like the most rusty way of doing things like that.

Hope you’ll find it useful!

3 Likes

This is how Dragginz currently implements the trait :

image

We don’t really care about the compiler time checks as it’s all validated within our schema anyway.

#2 would make life easier for us, but only marginally.

Thanks for all your work on this, it has been a little frustrating trying to code around the max size.

3 Likes

I plan to introduce the stable structures in Juno (PR here). So, I am part of the users who’s having to put an upper bound on the size is an issue because I cannot know what will the developers effectivelly store in it.

I’m far away of being a Rust expert, so whatever the solution, I will be super happy. However, based on what I understand, I also like the second solution.

2 Likes

With just these two choices I lean towards the second solution as well.

@senior.joinu
does these 2 library serves same feature (allowing access to stable memory)?
if yes: what’s the difference between this and ic-stable-memory?

Thank you all for your feedback. It seems that there’s a slight (but not very strong) preference towards the second solution.

@senior.joinu Thanks for sharing! I remember looking at the Box idea a while back. I can’t remember exactly why I didn’t pursue the idea further, but I’ll take a second look. One difference in our architectures is that ic-stable-structures does not have a global memory allocator - each structure manages its memory separately to reduce the blast radius of any bugs, so we’ll need to take that into account.

@borovan IIUC, in your code you’re setting a max size of 2KiB, correct? A large bound like that does give you a lot of wiggle room for you to not worry about bounds, but it also comes at the expense of high memory usage. Has that not been an issue for you?

@pramitgaha These two libraries do share the same goals of allowing you to store data directly to stable memory and scale your canister to 10s of GiBs. There are different trade-offs between the two in terms of speed, safety, architecture, etc, so I suggest you read up on both of the libraries and choose the one that serves you best.

No we’re just using a small subset of test data right now. We’ll just switch over to the new dynamic size when it’s ready.

Looks like solution #1 is a pure benefit with the compile-time checks that unbounded types are being stored in unbounded stable structures. If the type is a u8 it can be used within a bounded stable structure type.

Is there any benefit to #2?

#2 will be better.
Developers can decide whether to specify the size according to their needs. Especially in some scenarios, the data grows dynamically, and the size cannot be specified at all. #2 is well adapted to the needs of this scenario

1 Like

At Bitfinity, we’ve had unbounded size key value stable structures for a while now. Please do check it out and let us know your thoughts.

https://github.com/bitfinity-network/canister-sdk/blob/main/ic-stable-structures/src/storage/structures/unbounded.rs

1 Like
  1. Any update on this? Will this be merged chore: merge `BoundedStorable` into `Storable` by ielashi · Pull Request #94 · dfinity/stable-structures · GitHub?
  2. With the new change above, if I change my code to use unbounded from bounded, will the structure that was stored as bounded be serialized correctly?
1 Like

@Maxfinity I think that’s a very reasonable approach, and we would’ve likely considered the same approach within stable-structures if the only requirement was to support unbounded values. We’d like to also support unbounded keys, and hence the additional complexity.

@tcpim Yes, the community feedback has heavily favored this solution, so that will be merged. There are also additional changes that are being worked in in the BTreeMap itself to support unbounded values. We plan to release a beta version of this work this month.

Yes, the structure will continue to be serialized correctly.

4 Likes

Hi @ielashi , when will this feature be released in prod? It seems the current latest version doesn’t have this https://crates.io/crates/ic-stable-structures

2 Likes

Timely question :slight_smile: I just submitted a PR that adds support for the new BTreeMap with unbounded types. There are still performance optimizations to be made, but once this is merged we have enough to do a beta release and get feedback from the community.

5 Likes

Any thoughts on how soon you think the Bitcoin canister or II starts using this upcoming version?

I would definitely feel much more confident adopting this knowing it’s being validated by other large use cases built by Dfinity engineers

chill dude, islam’s got this

2 Likes

For what it’s worth, I’ll be using the upcoming version in Juno.

2 Likes

That’s a very good question. Before this version is released I do plan to test this out on the Bitcoin state, at least by computing the Bitcoin canister’s state offline. I think we’d want to upgrade the Bitcoin canister to use this version in production as well, but I don’t think we can communicate a timeline for that at the moment.

2 Likes

Hey everyone, we just released a beta version of stable-structures, version 0.6.0-beta.0, that contains the new V2 BTreeMap along with the API changes proposed in the initial post of this thread.

You can see an example of using the V2 BTreeMap here. Note that in this release the developer needs to explicitly request using the V2 BTreeMap by using BTreeMap::init_v2. We currently do not support migrating a V1 BTreeMap to V2, but in the real release this will be supported, so the migration then can be done seamlessly.

As this is still a beta release, we recommend against using this in production and we may make breaking changes as needed. Your feedback would be very much appreciated :slight_smile:

cc @peterparker @hpeebles @lastmjs who expressed interest in this.

5 Likes