Increased Canister Smart Contract Memory

Absolutely.

Coolio. https://github.com/dfinity/stable-structures/pull/83

I struggle with the MAX_SIZE variable. On one hand, my lack of passion for (and confidence with) backend code makes me worry about calculating the wrong value (:sweat_smile:). On the other hand, as mentioned earlier, developers using my tool may upload data of unpredictable sizes due to the use of blobs. Additionally, I'm uncertain about the consequences of setting a value now and realizing, two months later, that it should be increased.

In summary, the MAX_SIZE option is not necessarily a deal-breaker, but it's definitely a big challenge for me, and one I would be happy to be spared.

So, is it acceptable to set an arbitrary, large value for everything, like 64 GB, or would this cause any issues?

1 Like

Dumb question, but what do you mean by "composite key approach"? Like a key that contains multiple pieces of information?

I'm interested in the question of handling vector datasets in StableBTreeMap because I would like to convert e.g. a HashMap of BTreeMaps.

// Basically (MyEntityKey, MyEntity, and CollectionKey are developer-defined types)
use std::collections::{BTreeMap, HashMap};

pub type Collection = BTreeMap<MyEntityKey, MyEntity>;
pub type State = HashMap<CollectionKey, Collection>;

I use such a pattern because developers can define their own keys; the canisters I provide are generic. Therefore, as you pointed out, serializing/deserializing the entire HashMap won't be performant.

So if I understand correctly, your advice would be to flatten the above into a single StableBTreeMap in which the key contains both CollectionKey and MyEntityKey, correct?

Side note: or is it actually possible to create StableBTreeMaps on the fly (at runtime)?

1 Like

I totally understand. Setting a reasonable MAX_SIZE is quite a significant design decision, and it's not very obvious what value to set in many cases.

No. The BTree always allocates the maximum size of the value, so if you hypothetically set it to 64GB, you'll run out of memory very quickly :slight_smile: The trade-off here is between memory usage and flexibility in the future.
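To put rough numbers on that (this is only a back-of-the-envelope sketch of the proportionality described above, not the library's actual allocation logic): if every stored value reserves MAX_SIZE bytes, total value storage grows roughly as entries × MAX_SIZE.

// Back-of-the-envelope only: assumes each value slot reserves MAX_SIZE bytes,
// as described above; node overhead and exact layout are ignored.
fn main() {
    let max_size: u64 = 64 * 1024 * 1024 * 1024; // a hypothetical 64 GiB bound
    let entries: u64 = 1_000;
    let reserved_gib = entries * max_size / (1024 * 1024 * 1024);
    // Prints "reserved ~ 64000 GiB": 1,000 entries alone would reserve ~62.5 TiB,
    // far more than a canister's stable memory.
    println!("reserved ~ {} GiB", reserved_gib);
}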

Yes, exactly, and I'd still recommend this pattern even when we remove the MAX_SIZE requirement from the BTreeMap in the near future.
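To illustrate what that flattening can look like, here is a minimal sketch with a plain std BTreeMap standing in for the stable structure; CollectionKey, MyEntityKey, and MyEntity are the placeholder types from your snippet, narrowed to concrete types just so the example compiles:

use std::collections::BTreeMap;

// Placeholder concrete types; in practice these are the developer-defined
// generic types from the snippet above.
type CollectionKey = String;
type MyEntityKey = String;
type MyEntity = Vec<u8>;

fn main() {
    // One flat map keyed by (CollectionKey, MyEntityKey) instead of a
    // HashMap<CollectionKey, BTreeMap<MyEntityKey, MyEntity>>.
    let mut state: BTreeMap<(CollectionKey, MyEntityKey), MyEntity> = BTreeMap::new();

    state.insert(("images".into(), "a".into()), vec![1, 2, 3]);
    state.insert(("images".into(), "b".into()), vec![4, 5]);
    state.insert(("notes".into(), "a".into()), vec![9]);

    // Point lookup: one entity in one collection.
    assert_eq!(state.get(&("images".into(), "a".into())), Some(&vec![1, 2, 3]));

    // "All entries of one collection" is a scan over keys that share the same
    // CollectionKey prefix, since those keys sort next to each other.
    let images: Vec<_> = state
        .iter()
        .filter(|((collection, _), _)| collection == "images")
        .collect();
    assert_eq!(images.len(), 2);
}

Because keys sharing the same CollectionKey sort together, per-collection access stays cheap without nesting maps inside maps.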

Given the current MAX_SIZE limitation though, there's a trick where you can split your unbounded data into chunks. For example, you can have the following BTree:

StableBTreeMap<(CollectionKey, MyEntityKey, ChunkIndex), Blob<CHUNK_SIZE>>

In the above BTree, we split MyEntity (the unbounded value) into chunks of size CHUNK_SIZE, where CHUNK_SIZE is some reasonable value of your choice.

For illustrative purposes, let's say you have a CHUNK_SIZE of 2, and you'd like to store the entities: (key_1, [1,2,3]) and (key_2, [4,5,6,7,8]).

In this case, the stable BTreeMap above will look like the following:

StableBTreeMap => {
  (collection, key_1, 0) => [1, 2],
  (collection, key_1, 1) => [3],
  (collection, key_2, 0) => [4, 5],
  (collection, key_2, 1) => [6, 7],
  (collection, key_2, 2) => [8],
}
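If it helps, here is a rough, non-authoritative sketch of that chunking logic in plain Rust, again with a std BTreeMap standing in for the stable structure; CHUNK_SIZE and the type aliases are placeholders:

use std::collections::BTreeMap;

// Placeholders standing in for the names used above.
type CollectionKey = String;
type MyEntityKey = String;
type ChunkIndex = u32;

const CHUNK_SIZE: usize = 2; // the illustrative chunk size from the example

// A std BTreeMap stands in for
// StableBTreeMap<(CollectionKey, MyEntityKey, ChunkIndex), Blob<CHUNK_SIZE>>.
type ChunkedStore = BTreeMap<(CollectionKey, MyEntityKey, ChunkIndex), Vec<u8>>;

// Split an unbounded value into CHUNK_SIZE pieces keyed by (collection, key, chunk index).
fn insert_chunked(store: &mut ChunkedStore, collection: &str, key: &str, value: &[u8]) {
    for (i, chunk) in value.chunks(CHUNK_SIZE).enumerate() {
        store.insert(
            (collection.to_string(), key.to_string(), i as ChunkIndex),
            chunk.to_vec(),
        );
    }
}

// Reassemble a value by walking its chunks in index order.
fn get_chunked(store: &ChunkedStore, collection: &str, key: &str) -> Vec<u8> {
    store
        .iter()
        .filter(|((c, k, _), _)| c == collection && k == key)
        .flat_map(|(_, chunk)| chunk.iter().copied())
        .collect()
}

fn main() {
    let mut store = ChunkedStore::new();
    insert_chunked(&mut store, "collection", "key_1", &[1, 2, 3]);
    insert_chunked(&mut store, "collection", "key_2", &[4, 5, 6, 7, 8]);

    // Reproduces the layout shown above: [1,2], [3] and [4,5], [6,7], [8].
    assert_eq!(get_chunked(&store, "collection", "key_1"), vec![1, 2, 3]);
    assert_eq!(get_chunked(&store, "collection", "key_2"), vec![4, 5, 6, 7, 8]);
}

With the real stable structure you would scan the (collection, key) prefix rather than filter the whole map, but the key layout is the same as in the table above.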

In theory, yes, but in practice, not currently: each StableBTreeMap requires its own virtual address space, so creating them on the fly incurs overhead, and I'd recommend against this approach.

2 Likes

Got it! Thank you for the detailed explanation, everything is clear now. Although I'm confident that someone smarter than me could solve my requirements using chunking as you suggested, I don't have any time constraints, so I'll wait for the MAX_SIZE removal in the near future.

1 Like

I've suggested this before, but I want to bring it up again. It would be nice if we could agree on a schedule for increasing the stable memory size, for example increasing it by X GiB per quarter or per year based on some agreed-upon criteria. Right now it seems to just increase when DFINITY has a need for it to increase. Having this schedule would help us plan out future capabilities for our users of Azle, Kybra, and Sudograph.

1 Like

We also received requests from the community to increase the stable memory. We need to do thorough testing and make sure there are no technical issues before committing to a fixed schedule. Since it is a non-trivial amount of work, I think we can allocate time for it in July/August. Would that work for you or is it more urgent?

2 Likes

It's not urgent for us; we just want to push to get on a schedule for our future selves.

2 Likes

An update on that front: DFINITY plans to propose in an upcoming replica version to further increase stable memory from 64GiB to 96GiB. So far our biggest canister on the IC, the Bitcoin canister, has been pushing close to 64GiB, and the growth in Bitcoin's UTXO set has been accelerating. Further testing shows that we can support a larger stable memory without issues and can continue increasing that storage limit.

11 Likes

Another update: we did more testing with 400GiB of stable memory and the tests were successful. DFINITY plans to propose in an upcoming replica version to increase the limit from 96GiB to 400GiB. More increases are planned later this year.

32 Likes

Really awesome! Thanks for pushing this one, DFINITY.

3 Likes

When can we also expect storage-specific canisters, so I can get my data off of Google Drive? At $5/GB/year, it's just too costly.

I get 200 GB at $30/year with Google Drive.

2 Likes

Would this be applicable to Motoko canisters as well? Is it usable there?

I can only comment from the engineering side, since that's where I work: we are working on increasing storage capacity more. I am not aware of any plans to decrease the cost.

I think it should be accessible to Motoko using the low-level API that works directly with the stable memory. Here is the Motoko-specific discussion: Motoko stable memory in 2022

1 Like

DFINITY plans to propose in an upcoming replica version to increase from 96GiB to 400GiB

The replica proposal with the change: Proposal: 127031 - ICP Dashboard

5 Likes

I wonder what the plan is for increasing heap memory. With DeAI, which I've been eyeing, the LLM upload seems to hang due to lack of heap memory.
Increasing heap memory beyond 4GB would require a move to 64-bit Wasm, which would be a daunting task, but a plan should be in place.

4 Likes

Thank you for bringing this up! Within the DeAI group we've indeed identified the heap memory size as a key limitation. Are there any plans for upgrading to 64-bit Wasm and increasing the heap memory with it that you (or someone else on the team) could share at this point, @ulan?

2 Likes

I don't think this is a fair comparison with Google Drive. First, Google Drive is not replicating your data 13 times. Second, Google undoubtedly profits from your data in other ways. An e2ee on-chain alternative to Google Drive will likely never be able to compete on price because it can't mine your data and profit from using that data to drive advertising revenue. If you want a cheaper way to store your data without giving it all to Google, then you need to host your own solution.

5 Likes

I mean, it's not just me who wants to replace legacy cloud. DFINITY and Dom urge everyone to ditch the traditional clouds of the world and start building on ICP. If we don't compete on cost, why would bottom-line-focused enterprises move to the IC? The truly large-scale operations would need to keep their costs to a minimum.

Unless we're building software which may be more expensive than normal but has different capabilities.

It's just that we might not get adoption as huge as AWS/GCP, since they compete on costs, which matters to 95% of businesses.

which may be more expensive than normal but has different capabilities.

Is this not what we're doing already? An equivalent to Google Drive that leverages the IC fully would be decentralized under an SNS and private via vetkeys. I'm personally willing to pay more for privacy and decentralization, but I can't speak for everyone.

You're right that there will be people and businesses that just want to spend as little money as possible, but why would they want to build on a blockchain? Money aside, it's quite a bit more complicated to build on the IC compared to traditional platforms, so you need benefits that will outweigh that increase in complexity.