Absolutely.
Coolio. https://github.com/dfinity/stable-structures/pull/83
I struggle with the MAX_SIZE variable. On one hand, my lack of passion for backend code and my limited confidence make me worry about calculating the wrong value. On the other hand, as mentioned earlier, developers using my tool may upload data of unpredictable sizes due to the use of blobs. Additionally, I'm uncertain about the consequences of setting a value now and realizing, after two months, that it should be increased.
In summary, the MAX_SIZE option is not necessarily a deal-breaker, but it's definitely a big challenge for me, one I would be happy to be spared.
So, is it acceptable to set an arbitrarily large value for everything, like 64 GB, or would this cause any issues?
Dumb question: what do you mean by "composite key approach"? A key that contains multiple pieces of information?
I'm interested in the question of handling vector datasets in StableBTreeMap because I would like to convert, e.g., a HashMap of BTreeMaps.
// Basically
pub type Collection = BTreeMap<MyEntityKey, MyEntity>;
pub type State = HashMap<CollectionKey, Collection>;
I use this pattern because developers can define their own keys; the canisters I provide are generic. Therefore, as you pointed out, serializing/deserializing the entire HashMap won't be performant.
So if I get it right, your advice would be to flatten the above into a single StableBTreeMap in which the key basically contains both CollectionKey and MyEntityKey, correct?
Side note: or is it actually possible to create a StableBTreeMap on the fly (at runtime)?
I totally understand. Setting a reasonable MAX_SIZE is quite a significant design decision, and it's not very obvious what value to set in many cases.
No. The BTree always allocates the maximum size of the value, so if you hypothetically set it to 64 GB, you'll run out of memory very quickly. The trade-off here is between memory usage and flexibility in the future.
Yes, exactly, and I'd still recommend this pattern even when we remove the MAX_SIZE requirement from the BTreeMap in the near future.
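For illustration only, here is a rough sketch of that flattening using an ordinary std::collections::BTreeMap as a stand-in for the StableBTreeMap; the concrete CollectionKey/MyEntityKey/MyEntity types are assumptions, and the Storable bounds a real StableBTreeMap key would need are omitted.
use std::collections::BTreeMap;

// Hypothetical stand-ins for the generic key/value types discussed above.
type CollectionKey = String;
type MyEntityKey = String;
type MyEntity = Vec<u8>;

// Flattened layout: a single map keyed by the composite (CollectionKey, MyEntityKey).
type FlatState = BTreeMap<(CollectionKey, MyEntityKey), MyEntity>;

fn main() {
    let mut state = FlatState::new();
    state.insert(("users".into(), "alice".into()), vec![1, 2, 3]);
    state.insert(("users".into(), "bob".into()), vec![4, 5]);
    state.insert(("posts".into(), "p1".into()), vec![6]);

    // Exact lookups use the full composite key.
    let key = ("users".to_string(), "alice".to_string());
    println!("users/alice => {:?}", state.get(&key));

    // Because the composite key sorts by CollectionKey first, all entities of
    // one collection are contiguous and can be scanned with a range query.
    for ((_, entity_key), entity) in state
        .range(("users".to_string(), String::new())..)
        .take_while(|(key, _)| key.0 == "users")
    {
        println!("users/{entity_key} => {entity:?}");
    }
}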
Given the current MAX_SIZE limitation though, there's a trick where you can split your unbounded data into chunks. For example, you can have the following BTree:
StableBTreeMap<(CollectionKey, MyEntityKey, ChunkIndex), Blob<CHUNK_SIZE>>
In the above BTree, we split MyEntity (the unbounded value) into chunks of size CHUNK_SIZE, where CHUNK_SIZE is some reasonable value of your choice.
For illustrative purposes, let's say you have a CHUNK_SIZE of 2, and you'd like to store the entities (key_1, [1,2,3]) and (key_2, [4,5,6,7,8]).
In this case, the stable BTreeMap above will look like the following:
StableBTreeMap => {
(collection, key_1, 0) => [1, 2],
(collection, key_1, 1) => [3],
(collection, key_2, 0) => [4, 5],
(collection, key_2, 1) => [6, 7],
(collection, key_2, 2) => [8],
}
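A minimal sketch of that chunking scheme, again with a std::collections::BTreeMap standing in for the StableBTreeMap; CHUNK_SIZE, the helper names, and the plain string key types are assumptions for illustration, not the library's API.
use std::collections::BTreeMap;

// Hypothetical composite key: (collection, entity key, chunk index).
type ChunkKey = (String, String, u32);
type ChunkedState = BTreeMap<ChunkKey, Vec<u8>>;

const CHUNK_SIZE: usize = 2;

// Split an entity's serialized bytes into CHUNK_SIZE-sized pieces and store
// each piece under its own chunk index.
fn store_entity(map: &mut ChunkedState, collection: &str, key: &str, bytes: &[u8]) {
    for (i, chunk) in bytes.chunks(CHUNK_SIZE).enumerate() {
        map.insert((collection.to_string(), key.to_string(), i as u32), chunk.to_vec());
    }
}

// Reassemble an entity by concatenating its chunks in index order.
fn load_entity(map: &ChunkedState, collection: &str, key: &str) -> Vec<u8> {
    let start = (collection.to_string(), key.to_string(), 0u32);
    let end = (collection.to_string(), key.to_string(), u32::MAX);
    map.range(start..=end)
        .flat_map(|(_, chunk)| chunk.iter().copied())
        .collect()
}

fn main() {
    let mut map = ChunkedState::new();
    store_entity(&mut map, "collection", "key_1", &[1, 2, 3]);
    store_entity(&mut map, "collection", "key_2", &[4, 5, 6, 7, 8]);
    assert_eq!(load_entity(&map, "collection", "key_2"), vec![4, 5, 6, 7, 8]);
}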
In theory, yes, but in practice, not currently: each StableBTreeMap requires its own virtual address space, so creating them incurs an overhead, and I'd recommend against this approach.
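For context, here is a rough sketch of the usual pattern, where each map is declared up front and bound to a fixed MemoryId via the crate's MemoryManager; the exact trait bounds and signatures vary between ic-stable-structures versions, so treat this as an approximation rather than the definitive API.
use ic_stable_structures::memory_manager::{MemoryId, MemoryManager, VirtualMemory};
use ic_stable_structures::{DefaultMemoryImpl, StableBTreeMap};
use std::cell::RefCell;

type Memory = VirtualMemory<DefaultMemoryImpl>;

thread_local! {
    // A single MemoryManager splits stable memory into virtual memories.
    static MEMORY_MANAGER: RefCell<MemoryManager<DefaultMemoryImpl>> =
        RefCell::new(MemoryManager::init(DefaultMemoryImpl::default()));

    // Each StableBTreeMap is tied to its own MemoryId at declaration time,
    // which is why spinning up new maps dynamically at runtime is discouraged.
    static MAP: RefCell<StableBTreeMap<u64, u64, Memory>> = RefCell::new(
        StableBTreeMap::init(MEMORY_MANAGER.with(|m| m.borrow().get(MemoryId::new(0)))),
    );
}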
Got it! Thank you for the detailed explanation, everything is clear now. Although I'm confident that someone smarter than me could certainly solve my requirements using chunking as you suggested, as I don't have any time constraints I'll wait for the MAX_SIZE removal in the near future.
I've suggested this before, but I want to bring it up again. It would be nice if we could agree on a schedule for increasing the stable memory size, for example increasing it by X GiB per quarter or per year based on some agreed-upon criteria. Right now it seems to just increase when DFINITY has a need for it to increase. Having this schedule would help us plan out future capabilities for our users of Azle, Kybra, and Sudograph.
We also received requests from the community to increase the stable memory. We need to do thorough testing and make sure there are no technical issues before committing to a fixed schedule. Since it is a non-trivial amount of work, I think we can allocate time for it in July/August. Would that work for you or is it more urgent?
It's not urgent for us; we just want to push to get on a schedule for our future selves.
An update on that front: DFINITY plans to propose in an upcoming replica version to further increase stable memory from 64 GiB to 96 GiB. So far our biggest canister on the IC, the Bitcoin canister, has been pushing close to 64 GiB, and the growth in Bitcoin's UTXO set has been accelerating. With further testing we can see that we can support a larger stable memory without issues and can continue increasing that storage limit.
Another update: we did more testing with 400 GiB of stable memory and the tests were successful. DFINITY plans to propose in an upcoming replica version to increase from 96 GiB to 400 GiB. More increases are planned later this year.
Really awesome! Thanks for pushing this one, DFINITY.
When can we also expect storage-specific canisters so I can get my data off of Google Drive? At $5/GB per year, it's just too costly.
I get 200 GB per year at $30 with Google Drive.
Would this be applicable to Motoko canisters as well? Usable?
I can only comment from the engineering side, since that's where I work: we are working on increasing storage capacity more. I am not aware of any plans to decrease the cost.
I think it should be accessible from Motoko using the low-level API that works directly with stable memory. Here is the Motoko-specific discussion: Motoko stable memory in 2022
DFINITY plans to propose in an upcoming replica version to increase from 96GiB to 400GiB
The replica proposal with the change: Proposal: 127031 - ICP Dashboard
I wonder what the plan is for increasing heap memory. With DeAI, which I've been eyeing, LLM uploads seem to hang due to lack of heap memory.
Increasing heap memory beyond 4 GB would require a change to 64-bit Wasm, which would be a daunting task, but a plan should be in place.
Thank you for bringing this up! Within the DeAI group we've indeed identified the heap memory size as a key limitation. Are there any plans for upgrading to 64-bit Wasm and increasing the heap memory with it that you (or someone else on the team) could share at this point, @ulan?
I don't think this is a fair comparison with Google Drive. First, Google Drive is not replicating your data 13 times. Second, Google undoubtedly profits from your data in other ways. An E2EE on-chain alternative to Google Drive will likely never be able to compete on price because it can't mine your data and profit from using that data to drive advertising revenue. If you want a cheaper way to store your data without giving it all to Google, then you need to host your own solution.
I mean, it's not just me who wants to replace legacy cloud. DFINITY and Dom urge everyone to ditch the traditional clouds of the world and start building on ICP. If we don't compete on cost, why would bottom-line-focused enterprises, the truly large-scale operations, move to the IC? They would need to keep their costs to a minimum.
Unless we're building software which may be more expensive than normal but has different capabilities.
It's just that we might not get the same huge adoption as AWS/GCP, since they compete on costs, which matters to 95% of businesses.
which may be more expensive than normal but has different capabilities.
Is this not what we're doing already? An equivalent to Google Drive that leverages the IC fully would be decentralized under an SNS and private via vetkeys. I'm personally willing to pay more for privacy and decentralization, but I can't speak for everyone.
You're right that there will be people and businesses that just want to spend as little money as possible, but why would they want to build on a blockchain? Money aside, it's quite a bit more complicated to build on the IC compared to traditional platforms, so you need to have benefits that will outweigh that increase in complexity.