Hey everyone,
Stable structures is a Rust library built by DFINITY that allows canister developers to store tens of GiBs of data without the need for upgrade hooks.
We’ve received feedback from users that having to put an upper bound on the size of whatever can be stored in a StableBTreeMap
makes the library difficult to use, and in many cases is a blocker for adoption altogether. This limitation is also shared by other stable structures, like StableVec
.
Currently, to store a type in a StableBTreeMap
, the developer needs to implement both the Storable and BoundedStorable traits:
pub trait Storable {
fn to_bytes(&self) -> Cow<[u8]>;
fn from_bytes(bytes: Cow<[u8]>) -> Self;
}
pub trait BoundedStorable: Storable {
const MAX_SIZE: u32;
const IS_FIXED_SIZE: bool;
}
Implementing the BoundedStorable
trait requires specifying a MAX_SIZE
- an upper bound on the number of bytes this type can use when serialized. This was initially introduced because it simplifies memory management.
To improve the usability of this library, we’re now working on removing the requirement of having types be bounded. We’ll still allow developers to specify bounds, as that can be useful in making memory optimizations internally, but it would be optional.
There are two solutions we have in mind:
Solution #1: Introduce “Unbounded” types
- The
BoundedStorable
andStorable
traits are kept as-is. - Rust doesn’t support specializations based on generics, so to support unbounded types, we’d need to introduce new stable structures:
UnboundedStableBTreeMap
,UnboundedStableVec
, etc. - These unbounded types, because of how Rust generics work, will not have access to the bounds of the types if they exist, and can therefore not make optimizations based on that information. For example, if the structure can know that the type is a
u8
(a small, fixed-size type), it can store it in a way that’s more efficient than an unbounded type, but it won’t be able to make that optimization with this approach as it can’t have access to this information.
Solution #2: Merge the BoundedStorable
trait into the Storable
trait
- This solution works around Rust’s limitations with generic specializations and is implemented in this PR.
- The
BoundedStorable
trait is removed, and theStorable
trait is extended to include aBOUND
field.
// ----------- Before -------------
pub trait Storable {
fn to_bytes(&self) -> Cow<[u8]>;
fn from_bytes(bytes: Cow<[u8]>) -> Self;
}
pub trait BoundedStorable: Storable {
const MAX_SIZE: u32;
const IS_FIXED_SIZE: bool;
}
// ----------- After -------------
pub trait Storable {
fn to_bytes(&self) -> Cow<[u8]>;
fn from_bytes(bytes: Cow<[u8]>) -> Self;
/// The size bounds the type.
const BOUND: Bound;
}
/// States whether the type's size is bounded or unbounded.
pub enum Bound {
/// The type has no size bounds.
Unbounded,
/// The type has size bounds.
Bounded {
/// The maximum size, in bytes, of the type when serialized.
max_size: u32,
/// True if all the values of this type have fixed-width encoding.
/// Some data structures, such as stable vector, can take
/// advantage of fixed size to avoid storing an explicit entry
/// size.
///
/// Examples: little-/big-endian encoding of u16/u32/u64, tuples
/// and arrays of fixed-size types.
is_fixed_size: bool,
},
}
- Stable structures such as
StableBTreeMap
can then be modified to support both bounded and unbounded types, and performance optimizations can be made in case types are bounded. - The downside of this approach is that checking whether or not a bound exists now happens at run-time. Example:
- If we have a stable structure that only supports bounded types, and you try to store an unbounded type, the code will compile, and only at run-time would you get an error that the type you’re trying to store is unbounded.
- With solution #1, such errors would be detected at compile-time, since you’d need to implement
BoundedStorable
for that type.
I’d like to get your feedback on which approach you’d prefer and/or if you have other solutions in mind.