Motoko stable memory in 2022

Just a suggestion to not give up on the dream of orthogonal persistence, where developers can store very large amounts of data in data structures existing in the same memory their code runs in without hassle. I hope we can work towards the linear memory or heap being increased greatly and abstracting away the concept of stable memory. I don’t want to have to think about all of that as a developer (even a developer of libraries).

Is there an in-depth explanation of how memory is working on the IC? I don’t quite understand all of the limitations and how/why stable memory is necessary, the serialization required, etc. I’d love to dig in more to try and come up with solutions.

Fully agree with this. I recently wrote about my thoughts in designing the data schema for tipjar. Essentially I used mutually recursive data types instead of normalized tables like in a database.

Basically I don’t want to keep upgrading my canisters. In almost all other blockchains, smart contracts are designed to be immutable, and people seem to be fine with that. So give me linear memory (that can grow as needed) and I’d be fine not having to think about stable memory & upgrades.

@PaulLiu, if you never want to upgrade, then you indeed don’t have to bother with stable memory, even now.

I totally agree with this sentiment. Orthogonal persistence was one of the main things that attracted me to the IC (from a purely programming perspective, not even blockchain related).

Fully agree with this. I recently wrote about my thoughts in designing the data schema for tipjar. Essentially I used mutually recursive data types instead of normalized tables like in a database.

Interesting, thanks for sharing. By “direct object references”, do you just mean nested objects? Like a User record directly contains an array of Allocation records? I wasn’t aware that could be mutually recursive.

Do you mean this would work?

type A = {
    b : B;
};

type B = {
    a : A;
};

(I would try this out but I’m on a phone right now.)
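
For anyone wanting to try this later, here is a minimal sketch. As far as I know, the two type definitions above do compile as written, but you cannot build a finite value of either one, because every A needs a B that needs an A. Making one side a mutable (or optional) field breaks the cycle. The User and Allocation names and fields below are made up for illustration and are not taken from the tipjar code.

```motoko
// Two record types that refer to each other.
type User = {
  name : Text;
  // Mutable, so allocations can be attached after the user exists.
  var allocations : [Allocation];
};

type Allocation = {
  owner : User; // direct reference back to the owning user
  amount : Nat;
};

// Build the cycle in two steps: create the user first, then link.
let alice : User = { name = "alice"; var allocations = [] };
let tip : Allocation = { owner = alice; amount = 10 };
alice.allocations := [tip];
```

After the last line, alice.allocations[0].owner refers to the same alice object, with no ID lookup in between.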

Just to follow up on this, I believe I will hit memory limits using Motoko stable variables very soon. There are workarounds, but I wanted to ask…

What is your plan with ExperimentalStableMemory? Can I build on it? Will the interface change in the near future?

(@Manu asked someone to follow up on this)

The library is marked experimental because it’s rather easy to shoot yourself in the foot. In particular, without coordination, separate libraries that import ExperimentalStableMemory can easily wind up trashing each other’s memory.

That said, you can build on this if you are careful. The library will at most be replaced by something roughly similar but with better isolation guarantees.

Regarding the recent extension of stable memory limits from 8GB to 32GB:

The library already uses 64-bit addresses, so it supports the 32GB stable memory limit out of the box.
However, to take advantage of that, you do need to tell the compiler how many stable memory pages (at most) to dedicate to ExperimentalStableMemory.mo using the --max-stable-pages <n> compiler flag. (With dfx, you can set this with the optional “args” string property of a Motoko canister in the dfx.json file. This property contains additional command line arguments to pass to the moc compiler during a build.)

By default, the compiler allows at most 65536 (64K) pages (4GB), reserving the remainder (previously 4GB, but now 28GB) for Motoko stable variable storage.
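
As a rough illustration of the dfx.json setting described above, a canister entry might look like the fragment below. The canister name and source path are placeholders, and 131072 is just an example value (131072 pages × 64KiB = 8GiB); pick whatever leaves enough room for your stable variables.

```json
{
  "canisters": {
    "my_canister": {
      "type": "motoko",
      "main": "src/my_canister/main.mo",
      "args": "--max-stable-pages 131072"
    }
  }
}
```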

From https://github.com/dfinity/motoko-base/blob/master/src/ExperimentalStableMemory.mo:

Memory is allocated, using grow(pages), sequentially and on demand, in units of 64KiB pages, starting with 0 allocated pages. New pages are zero initialized. Growth is capped by a soft limit on page count controlled by compile-time flag --max-stable-pages <n> (the default is 65536, or 4GiB).

NB: The IC’s actual stable memory size (ic0.stable_size) may exceed the page size reported by Motoko function size(). This (and the cap on growth) are to accommodate Motoko’s stable variables. Applications that plan to use Motoko stable variables sparingly or not at all can increase --max-stable-pages as desired, approaching the IC maximum (currently 8GiB). All applications should reserve at least one page for stable variable data, even when no stable variables are used.
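
To make the grow/size description concrete, here is a small sketch of an append-only store on top of ExperimentalStableMemory. The actor name, the nextOffset variable, and the ensureCapacity helper are invented for this example; only the StableMemory and Nat64 calls come from the base library. It assumes nothing else in the canister writes to the same stable memory.

```motoko
import Nat64 "mo:base/Nat64";
import StableMemory "mo:base/ExperimentalStableMemory";

actor Notes {
  // Next free byte offset; a stable variable so it survives upgrades.
  stable var nextOffset : Nat64 = 0;

  let pageSize : Nat64 = 65536; // 64KiB per page

  // Grow stable memory until the byte range [0, limit) is allocated.
  func ensureCapacity(limit : Nat64) {
    let havePages = StableMemory.size();
    let needPages = (limit + pageSize - 1) / pageSize;
    if (needPages > havePages) {
      let previous = StableMemory.grow(needPages - havePages);
      // grow returns 0xFFFF_FFFF_FFFF_FFFF when it cannot allocate,
      // for example when the --max-stable-pages cap is reached.
      assert (previous != 0xFFFF_FFFF_FFFF_FFFF);
    };
  };

  // Append a blob and return the offset it was written at.
  public func append(data : Blob) : async Nat64 {
    let offset = nextOffset;
    let limit = offset + Nat64.fromNat(data.size());
    ensureCapacity(limit);
    StableMemory.storeBlob(offset, data);
    nextOffset := limit;
    offset
  };

  // Read back size bytes starting at offset.
  public query func read(offset : Nat64, size : Nat) : async Blob {
    StableMemory.loadBlob(offset, size)
  };
};
```

The caller has to remember the offset and length of each blob, which is exactly the kind of bookkeeping that makes it easy for two libraries to step on each other.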

A tutorial sample using ExperimentalStableMemory is here (though it doesn’t mention the compiler flag):

(An example of passing (other) command-line args to moc via dfx.json is in the thread “Stable-types build error moving to 0.9.3 and later versions of the SDK”, post #35 by claudio.)

@claudio Thanks for making this post! It’s very helpful.
I have a question about setting the --max-stable-pages of canisters created dynamically (from a parent canister). Do they take the max pages from their parent? Or is there a different way to set them?

Do you mean: do the instances of an imported actor class receive the same --max-stable-pages setting as the importing actor?

Yes, I think that will be the case.

Yes, that’s what I mean, creating an instance of another canister.
Thank you!

Is there any hope of stable variables in Motoko being tied to the stable memory we are talking about here? I’m guessing the stable variable use case is still limited to the 8GB heap (and now we can use most of it because of the better streaming upgrade code)?

Stable variables reside in the 4GB main heap in flight, and get copied to/from stable memory only during upgrade, so the 4GB limit still applies, I’m afraid.

You can now access up to 32GB (was 8GB) using stable memory, though you should reserve up to 4GB for any stable variables you may use.
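
A tiny sketch of the distinction, for readers new to it. The actor and variable names are made up. The point is only that both variables live on the ordinary heap while the canister runs, and the stable keyword decides which one is carried across an upgrade.

```motoko
actor Counter {
  // Lives on the main heap while the canister runs. The runtime copies
  // it to stable memory before an upgrade and back again afterwards.
  stable var count : Nat = 0;

  // Not stable: reinitialized to 0 every time the canister is upgraded.
  var callsSinceUpgrade : Nat = 0;

  public func increment() : async Nat {
    count += 1;
    callsSinceUpgrade += 1;
    count
  };

  public query func stats() : async (Nat, Nat) {
    (count, callsSinceUpgrade)
  };
};
```

Because count is an ordinary heap value in between upgrades, everything reachable from stable variables has to fit in the 4GB heap alongside the rest of the program's data.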

I wonder if it’s possible to make stable variables directly store their bytes in stable memory, given the new performance improvements in using the System API.

That way, memory type (i.e. heap, stable memory) is completely transparent to the developer.

The overhead of storing stable data in stable memory is not just because stable memory itself is more costly; it is also because a stable data layout is necessarily less efficient and flexible. In particular, storing something in a stable variable would then require not just moving a pointer, but a deep copy of the entire data each time, which can have a highly opaque cost.

Moreover, stable data layout can never be changed, so, it would never be possible to tune memory layout, GC, etc. for it.

So, while it is possible in principle to store stable data in stable memory, in practice it would likely still be many times more costly, with no chance of future improvement.

Dumb question, but why can stable data layout not be changed? And why can’t pointers be used?

I thought each canister had their own stable memory. Or is it because the heap can be wiped during canister upgrade but stable memory cannot? I think I’m missing something.

The pointer cannot be used when it points into non-stable memory, because then the data wouldn’t be stable.

The layout cannot be changed because the whole point of stable memory is that it survives upgrades and remains usable afterwards. Since the upgraded code may have been compiled with an arbitrary future version of the Motoko compiler, that arbitrary future version needs to remain compatible with the old stable memory layout.

And we are still waiting on Wasm to add 64-bit support so the heap can grow to more than 4GB, right? Any movement on that front?

Quick question. Let’s say I have a canister with 2GB of user data. I change the data type and now have to copy all the data in postupgrade in order to upgrade the canister without wiping the data. Would I be able to upgrade? What if I have 3GB of user data?
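
To make the scenario concrete, this is roughly the kind of migration being asked about. All names here (UserV1, UserV2, usersV1, usersV2, migrate) are hypothetical. It is a sketch of the pattern, not an answer to whether a given data size will fit.

```motoko
import Array "mo:base/Array";

actor Users {
  type UserV1 = { name : Text };
  type UserV2 = { name : Text; tips : Nat };

  // Old data, kept declared so the upgrade can still read it.
  stable var usersV1 : [UserV1] = [];
  // New representation, empty until the first upgrade to this version.
  stable var usersV2 : [UserV2] = [];

  // Convert one record to the new shape.
  func migrate(u : UserV1) : UserV2 {
    let upgraded : UserV2 = { name = u.name; tips = 0 };
    upgraded
  };

  system func postupgrade() {
    if (usersV1.size() > 0) {
      // Both arrays are on the heap at this point.
      usersV2 := Array.map<UserV1, UserV2>(usersV1, migrate);
      usersV1 := []; // release the old copy for the GC
    };
  };
};
```

Note that during postupgrade the old and new arrays are live at the same time, so peak heap use is roughly the sum of the two.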

As someone who is writing documentation for Motoko, I would love to know how to present the use of stable vars and stable memory to programmers.

Right now it isn’t clear what best practices should be.

One idea I had while tinkering with a stable memory lib for my project, is to implement some kind of ‘Memory Boot Record’ in stable memory and initiate code from there.

I’m not sure

It is a great question. The “stable” keyword is currently confusing as it only has tangential relevance to “stable memory”. My understanding is that @matthewhammer is working on generalizing “stable memory” in a similar manner to “stable”, including garbage collection. I’m not sure how this will affect the vocabulary, but it is unlikely we go backward. “stable” should probably have been “managed” or something else. Perhaps we get a breaking change at some point.

The MVP will only be a small generalization over what is now called ExperimentalStableMemory, giving independent, dynamically allocatable Regions (there is now only one global stable memory region that can be grown, like a pre-allocated, shared Region in the new terminology).

In particular, the new API will still be:

  • low level, with operations like grow and loadBlob and storeBlob taking offsets into a Region’s big array of bytes.
  • compatible with stable vars, another more “high level” way to represent stable data that is orthogonal, but uses the same IC-level machinery under the hood (namely, stable memory space).

Eventually, we want the Region type to be collected by GC, but that’s not part of the MVP.
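
For a feel of what that looks like, here is a sketch written against the Region module as it later appeared in motoko-base, which may differ from the MVP being described in this post. The actor and variable names are invented for the example.

```motoko
import Nat64 "mo:base/Nat64";
import Region "mo:base/Region";

actor Log {
  // An independent, growable byte array in stable memory.
  // A second library could call Region.new() and get its own.
  stable let entries = Region.new();
  // Bytes written to this region so far.
  stable var used : Nat64 = 0;

  let pageSize : Nat64 = 65536; // 64KiB per page

  // Append a blob to this region and return its offset.
  public func append(data : Blob) : async Nat64 {
    let offset = used;
    let limit = offset + Nat64.fromNat(data.size());
    let needPages = (limit + pageSize - 1) / pageSize;
    let havePages = Region.size(entries);
    if (needPages > havePages) {
      let previous = Region.grow(entries, needPages - havePages);
      // As with ExperimentalStableMemory, all ones signals failure.
      assert (previous != 0xFFFF_FFFF_FFFF_FFFF);
    };
    Region.storeBlob(entries, offset, data);
    used := limit;
    offset
  };

  // Read back size bytes starting at offset within this region.
  public query func read(offset : Nat64, size : Nat) : async Blob {
    Region.loadBlob(entries, offset, size)
  };
};
```

Offsets are relative to the region, so two libraries that each hold their own region no longer need to coordinate on address ranges.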

In terms of the larger story about stable data in Motoko, there are expectations that eventually, the IC execution layer will generalize stable memory so that it can be as inexpensive and plentiful as ordinary canister memory is today.

If and when that shift happens, lots of things will likely change in the Motoko GC story, and the Rust stable memory story too. I don’t have a timeline for that shift, but it seems inevitable given enough time for things to evolve, as it seems like the most faithful way to realize the “orthogonal persistence” promise. cc @claudio @luc-blaeser
