Increased Canister Smart Contract Memory

stable64_grow will now let you increase stable memory up to 32 GiB

Please consider adding an API to retrieve the amount of free memory left in a subnet. 32 GiB is the soft limit, but if your subnet runs out of physical space earlier, your canister can still end up with only a few megabytes of available storage.

Something like stable64_pages_left() -> u64 should do the trick.

Horizontal scaling is the only way to build truly autonomous software, but right now this approach is blocked by the missing API.

I know this may not be the right place for this proposal, but anyway.

2 Likes

Yea, ic-stable-memory is in the middle of a huge update, one part of which is a stable collection for data certification. You can read more about this collection that I want to implement here.

Once this new collection is implemented, it will be possible to build such a stable-memory-based asset canister - everything we need in order to compose it will be available as a library.

3 Likes

Can you say more precisely how you’d want to use such an API? I think it sounds reasonable, but we also already have the memory allocation, which you can use to ensure your canister doesn’t run out of memory.

For example, you could start with a memory allocation of 1 GB and increase it by 500 MB each time your canister’s free memory drops below 500 MB. If the subnet gets low on memory, you’ll notice because the call to increase the memory allocation will fail. This way you’ll still have some memory you can use when you notice the subnet is running out, whereas with the stable64_pages_left API you might see 32 GB available on one call and then have it immediately go to 0 on the next call.
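A rough sketch of that pattern in Rust (this assumes the canister is one of its own controllers; the exact module paths and struct fields of the ic-cdk management-canister bindings vary between CDK versions):

```rust
use candid::Nat;
use ic_cdk::api::management_canister::main::{
    update_settings, CanisterSettings, UpdateSettingsArgument,
};

/// Headroom we want to keep between used memory and the allocation (500 MB).
const RESERVE_BYTES: u64 = 500 * 1024 * 1024;

/// Bump this canister's own memory allocation so that at least RESERVE_BYTES
/// of it remain unused. A rejection of the call is the early warning that the
/// subnet is running out of memory.
async fn top_up_allocation(current_allocation: u64, used_bytes: u64) -> Result<u64, String> {
    if current_allocation.saturating_sub(used_bytes) >= RESERVE_BYTES {
        return Ok(current_allocation); // still enough headroom, nothing to do
    }
    let new_allocation = used_bytes + RESERVE_BYTES;
    update_settings(UpdateSettingsArgument {
        canister_id: ic_cdk::id(),
        settings: CanisterSettings {
            memory_allocation: Some(Nat::from(new_allocation)),
            ..Default::default()
        },
    })
    .await
    .map_err(|(_code, msg)| msg)?; // fails if the subnet can't satisfy the allocation
    Ok(new_allocation)
}
```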

1 Like

The actual reasoning takes a whole story, so here it goes.

This is all because of ic-stable-memory. In this library there is a stable memory allocator and a bunch of custom collections, like Vec or BTreeMap, which store data completely in stable memory.

The goal is to somehow have both:

  1. transactional execution - if there is not enough stable memory to complete message execution (say we want to allocate two memory blocks during the call, we allocated the first, but there is no memory for the second), the state of the canister should reset to what it was at the beginning of this execution;
  2. horizontal scaling opportunity - when your canister is close to running out of memory, you should somehow be able to react to this situation and run some code (for example, when your canister sees that there are only 10 MB left in the subnet, it may want to spin up a copy of itself on another subnet).

It turns out that you can have either one of these easily, but not both. Transactional execution can be achieved by simply trapping when there is no more memory. Horizontal scaling could also be triggered at that exact moment. But you can’t both trap and run some code afterwards.

Initially I was thinking: OK, I'll just grow() stable memory on demand and make all collections' methods transaction-like, so they would manually restore the state to how it was before the failure. All methods would also return an Error in that case, so developers could react to that error and do something to automatically scale their app.
But, unfortunately, this solution only works for simple use cases and makes your code very hard to read.
For example, it won’t work for BTreeMap, because it basically means that for each insert operation you have to allocate an additional log N + 1 nodes in advance (to check that there is enough memory), and if there is, you then have to pass these newly allocated nodes into your insertion code (which is heavily recursive) so they can be filled with the correct values and attached to the tree. It is both complex and slow.
User code also becomes a mess, since you have to react to every Error returned by every collection in order to undo all the previous operations you did during this transaction (and yes, everyone would have to do that manually).
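To make the mess concrete, here is a hypothetical sketch (these fallible collections are stand-ins for the design being discussed, not the real ic-stable-memory API): every caller has to hand-roll its own rollback whenever a later allocation fails.

```rust
use std::collections::BTreeMap;

// Hypothetical: every mutating method returns Err(OutOfStableMemory)
// instead of trapping, and the caller is responsible for rolling back.
#[derive(Debug)]
struct OutOfStableMemory;

struct FallibleVec<T>(Vec<T>);
impl<T> FallibleVec<T> {
    // A real stable collection could fail to allocate a new block here.
    fn push(&mut self, v: T) -> Result<u64, OutOfStableMemory> {
        self.0.push(v);
        Ok(self.0.len() as u64 - 1)
    }
    fn pop(&mut self) {
        self.0.pop();
    }
}

struct FallibleMap<K: Ord, V>(BTreeMap<K, V>);
impl<K: Ord, V> FallibleMap<K, V> {
    fn insert(&mut self, k: K, v: V) -> Result<(), OutOfStableMemory> {
        self.0.insert(k, v);
        Ok(())
    }
}

// The pain point: every caller has to undo its own earlier writes by hand
// whenever a later allocation fails part-way through the "transaction".
fn register_user(
    users: &mut FallibleVec<String>,
    index: &mut FallibleMap<u64, u64>,
    user_id: u64,
    name: String,
) -> Result<(), OutOfStableMemory> {
    let pos = users.push(name)?;
    if let Err(e) = index.insert(user_id, pos) {
        users.pop(); // manual rollback of the first step
        return Err(e);
    }
    Ok(())
}
```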

Then I was thinking: yea, this idea is bad, but what if I grow() some amount of stable memory in advance, to always keep the reserve above some level, and if I can't, I execute some user-specified canister method like "on_low_stable_memory()"? I wouldn't do any transaction-specific stuff inside collections - just trap, and the state is safe.
This solution sounds good and is pretty simple to implement (and I assume this is what you propose), but it doesn’t work in practice.
For it to work in practice we have to make sure that the reserve of pre-grown stable memory we keep is always greater than or equal to our maximum possible allocation during a single call. For example, if our canister has a single method that allocates exactly 1 MB each time it is called, then we only need to make sure that we have 1 MB of stable memory grown in advance. In this scenario, after each such call we consume 1 MB of the pre-grown stable memory and then grow() 1 MB more. If we can’t grow() any more, we just call the on_low_stable_memory() hook and everything is fine.
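A sketch of that “grow-in-advance” idea (the on_low_stable_memory hook and the 1 MB reserve are hypothetical; the stable64_* calls are from the pre-0.12 ic-cdk stable API, so the exact names depend on your CDK version):

```rust
use ic_cdk::api::stable::{stable64_grow, stable64_size};
use std::cell::Cell;

/// How many pages to keep pre-grown (1 MB = 16 Wasm pages of 64 KiB).
const RESERVE_PAGES: u64 = 16;

thread_local! {
    // Pages actually handed out by the allocator; in a real library this
    // would be tracked by the allocator itself.
    static USED_PAGES: Cell<u64> = Cell::new(0);
}

/// Hypothetical user-provided hook, called when the reserve can't be refilled.
fn on_low_stable_memory() {
    // e.g. spin up a copy of this canister on another subnet.
}

/// Call after every allocation: refill the reserve, or warn the user code.
fn ensure_reserve() {
    let grown = stable64_size();
    let used = USED_PAGES.with(|p| p.get());
    let free = grown.saturating_sub(used);
    if free < RESERVE_PAGES && stable64_grow(RESERVE_PAGES - free).is_err() {
        on_low_stable_memory(); // the subnet refused to give us more memory
    }
}
```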

But real collections do not work like that. For example, let’s imagine a canister that stores a history of transactions (a ledger) in a vector. Vectors work in such a way that when they reach their maximum capacity, they try to allocate twice as much memory in order to continue growing.
For example, say we had a vector with a capacity of 10 elements; once we insert the 11th element, this vector will reallocate into a new memory block that can hold 20 elements.
This means that in order for such a vector to work properly in our “grow-in-advance” setup, we always have to keep twice the size of this vector grown in advance. So if our transaction history occupies 2 GB of stable memory, then we have to keep 4 GB more grown in advance (which is, by the way, completely unused until you can’t grow any more).
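Rough arithmetic behind that claim (the element size and counts below are purely illustrative):

```rust
/// How much extra memory a doubling vector may demand on its next push:
/// nothing while there is spare capacity, and a whole new block of twice the
/// current size once it is full (the old block still exists during the copy).
fn reserve_needed_for_next_push(len: u64, capacity: u64, elem_size: u64) -> u64 {
    if len < capacity {
        0
    } else {
        2 * capacity * elem_size
    }
}

fn main() {
    const GIB: u64 = 1 << 30;
    // A full doubling vector holding a 2 GiB transaction history
    // (67,108,864 entries of 32 bytes each):
    let needed = reserve_needed_for_next_push(67_108_864, 67_108_864, 32);
    assert_eq!(needed, 4 * GIB);
    println!("must keep {} GiB grown in advance", needed / GIB);
}
```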

Of course, you can imagine special collections that won’t reallocate that way and will maybe work a little slower, consuming only a small portion of new memory in order to continue growing (and that is what I was initially going towards). But the main question stays the same: exactly how much stable memory do you have to allocate in advance in order to keep it both cheap (so you don’t pay much for unused storage) and fail-proof?

I’ve been building this library for almost half a year now (reinventing the whole university CS program for myself), and I don’t know how to answer this question.

So, my final thought (and this is what I propose here): let's completely decouple these two processes. Let's make transactions trap when they reach the memory limit, and let's give the user some way of understanding how likely their canister is to fail.
This concept of on_low_stable_memory() taught me one more interesting thing: if your canister can’t grow now, that doesn’t mean it won’t be able to grow a couple of minutes later! Memory on the IC is very dynamic: some canisters decrease the total amount of available memory in a subnet, while others increase it (when they are destroyed).

So, providing the user with some kind of on_low_stable_memory() system hook is actually a bad idea, because you won’t be able to answer the question: “if the subnet regains enough memory to allocate (after some canisters die), should this hook be called once again the next time memory runs out?” Everyone would have a different answer to it.

So it is better not to do that, and instead give everyone a tool to track the available memory and react however they like. For example, if we had a stable64_pages_left() method, we could use the heartbeat in order to achieve the same result as with on_low_stable_memory(), but with less effort and more freedom.
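For illustration only (stable64_pages_left() does not exist today - it is the proposed API - and the heartbeat attribute path depends on your CDK version), a heartbeat-based version could look like this:

```rust
const PAGE: u64 = 64 * 1024;
/// React once fewer than ~10 MB worth of pages are left on the subnet.
const LOW_WATERMARK_PAGES: u64 = (10 * 1024 * 1024) / PAGE;

/// The proposed (not yet existing) system API: how many stable pages the
/// subnet could still grant to this canister right now.
fn stable64_pages_left() -> u64 {
    unimplemented!("hypothetical system API")
}

#[ic_cdk_macros::heartbeat]
fn heartbeat() {
    if stable64_pages_left() < LOW_WATERMARK_PAGES {
        scale_out();
    }
}

fn scale_out() {
    // e.g. create a sibling canister on another subnet and start
    // forwarding new writes to it.
}
```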


This is it.

P.S. Actually, as far as I understand, all of these points are also valid for regular heap memory. You can easily run out of heap without even reaching 4 GB, because of how the system works. Maybe there should also be a method for that? Or maybe stable64_pages_left() would automatically resolve this issue too, because once serialized, the heap and stable memory are the same kind of memory, so if stable64_pages_left() shows 0, then you probably won’t be able to store more data on the heap either.

1 Like

This makes sense to me, but I’m not sure how much the proposed API would help with it because the returned value would only be reliable for the current execution. Taking your example, if a user calls stable64_pages_left during a heartbeat and it returns 32 GB, it’s still possible that a small stable64_grow call would fail on the very next message if some other canister took up the rest of the subnet’s memory in between the calls.

If a canister needs a reliable way of determining how much memory it has left, then it would make more sense to reserve a memory allocation and check how the current memory usage compares to that allocation.

Yea, and that’s fine. That means that the next time this canister’s heartbeat is invoked, stable64_pages_left() will return something close to 0, and the code can react to that by spawning a new canister and maybe offloading new requests there.

This is exactly what I want. The state stays correct and the dapp can keep scaling.

Do you mean a flow like this?

  1. Inside your heartbeat function, check whether the canister has at least 500 MB of free memory.
  2. If it doesn’t, try to grow additional pages.
  3. If you can’t grow, execute some other code to scale.

Okay, now I get it. Yes, this is indeed the same, with the exception that you’re forced to always pay for 500 MB more than you use. But I believe that even 10 MB will do the job in most cases.

UPD:
In any case, an API like stable64_pages_left() is more suitable for this kind of task, especially if it can also hint at the amount of heap memory left, for those who don’t use stable memory.

1 Like

The bump to 32 GiB stable memory was approved in Proposal 86279:

  • Runtime: Increase stable memory to 32GB

and will be rolled out this week. So you can try it out as soon as your canister’s subnet has been updated to replica version cbf9a3bb. As a reminder, you can follow the proposals to update subnets on the dashboard.

11 Likes

Hi everybody!

We’ve done more testing with canisters that use a lot of stable memory and have seen that we can actually increase the limit even further, so as of the release in Proposal: 92573 - IC Dashboard, canisters can now hold 48 GiB of stable memory. Hope you will build cool canisters with lots of memory!

14 Likes

If I create a canister, does it automatically reserve 48 GB of memory, or does it fill up as it goes? I’m thinking of DDoS cases against user-canister dApps: sending a fleet of bots that each create a canister with 48 GB of memory space?

No, not by default. You can reserve more space than you’re using with your canister’s settings (see dfx canister update-settings --help), but you’ll be charged for the reserved memory as if you were using it.

3 Likes

What’s the theoretical limit for a canister size?

–memory-allocation <MEMORY_ALLOCATION>
Specifies how much memory the canister is allowed to use in total. This should be a
value in the range [0…12 GiB]. A setting of 0 means the canister will have access to
memory on a “best-effort” basis: It will only be charged for the memory it uses, but
at any point in time may stop running if it tries to allocate more memory when there
isn’t space available on the subnet
This is what I can see. Does that mean that by allocating 0 we can use canister memory beyond 4 GB? Or did I misunderstand something?

I’m no expert on this, but that matches how I read it.

1 Like

You can use more than 4 GiB for the canister’s stable memory. The heap is still limited to 4 GiB; that’s a consequence of the fact that our current Wasm runtime supports only 32-bit native memory (there’s work underway to allow using 64-bit memory, though).

2 Likes

And what should I do to upgrade a canister to use more than 4 GiB of stable memory? The same? --memory-allocation 0?

1 Like

You shouldn’t have to do anything, really. By default, canisters use “best-effort” memory allocation when they are created (i.e. if you don’t specify anything with the memory-allocation option). As long as you use the 64-bit stable memory APIs (available through both Motoko and the Rust CDK) in your canister, you should be able to use more than 4 GiB of stable memory.
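For example, in Rust with the 64-bit stable API (the stable64_* names below come from pre-0.12 versions of ic-cdk, so check what your CDK version exposes) you can grow and write past the 4 GiB mark without touching the memory-allocation setting:

```rust
use ic_cdk::api::stable::{stable64_grow, stable64_size, stable64_write};

const PAGE: u64 = 64 * 1024; // Wasm page size in bytes

/// Write a blob at an offset beyond 4 GiB, growing stable memory as needed.
fn write_past_4gib(data: &[u8]) {
    let offset: u64 = 5 * (1u64 << 30); // 5 GiB, reachable only with the 64-bit API
    let needed_pages = (offset + data.len() as u64 + PAGE - 1) / PAGE;
    let current_pages = stable64_size();
    if needed_pages > current_pages {
        stable64_grow(needed_pages - current_pages).expect("subnet is out of memory");
    }
    stable64_write(offset, data);
}
```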

2 Likes

The State Manager runs on the Execution Layer, and the State Manager stores state on the SSD. I think it is very important to adopt 64-bit Wasm to increase capacity. However, “How it works - Internet Computer Execution Layer” states: “The replicated state that can be held by a single subnet is not bounded by the available RAM in the node machines, but rather by the available SSD storage.” In the short term, increasing memory is a very good approach. However, I believe the SSD should be utilized, because lower server costs lead to cheaper cycles. There may be a way to store large amounts of data without much memory. I would be glad to hear your opinion. Please let me know if my understanding is incorrect.

6 Likes

When using the stable modifier on variables in Motoko, can the 32 GB of stable memory be used? From what version of the IC is it supported?

The Motoko devs should be able to answer this better, but my understanding is that currently Motoko stable variables are actually held on the Wasm heap and only transferred to stable memory during upgrades. This would mean that they cannot use the full stable memory size yet.

cc @claudio

Hey everybody, a quick update on the stable memory limits: with the next replica version, DFINITY will propose to further increase the stable memory limit to 64 GiB (from the current 48 GiB). Our testing shows that this will work without issue, and in practice we’ve seen e.g. the Bitcoin canister use > 40 GiB of stable memory for some months now and run without problems, which makes us confident we can increase the limit further.

An increase to 64 GiB would also give the Bitcoin canister more headroom to store the entire UTXO set, which has been growing rapidly in recent months.

12 Likes