So I am learning motoko and still confused with the concept of using stable memory in a motoko canister.
Currently my understanding about the whole memory process is as follows:
When a canister is running it will be storing the canister data in the volatile heap memory, which will be lost on every canister upgrade, and that memory is about 4GB right now. But the canister aslo have stable memory that is 48GB right now, and then if the data variables in the canister are stable, during a canister upgrade, the data will not be lost, it will be stored in the stable memory and be restored to be used in the canister heap memory again.
We have stable and unstable data types, for example data types that are classes from the base library like HashMaps, Buffers, are unstable and can not be declared with the word stable.
So for stable types we just declare them with the word stable and during the canister upgrade the data of their variables will be serialized into internal stable memory and then decerialized and restored to their variables after the upgrade back to the heap memory of the canister. When working with non-stable types like HashMaps we can accoumplish the same by simply using the system pre and post upgrade methods.
So that’s my basic understanding and knowledge right now which obviously is lacking something and might not be accurate.
My questions and confusion are this:
Since the data in the actor variables is stored in stable memory during upgrades and then returned back to their variables in the heap memory after upgrades, and the heap memory is only 4GB, what happens when my actor variables are holding data that is more than 4GB? Does the extra data like automatically overlap into the stable memory or?
When canisters are running, do they only have access to their heap memory and can only use and access stable memory during canister upgrade or when they are running they can also have access to stable memory?
When we declare variables as stable , does it mean they are automaticaly storing their data in the stable storage or it just means during upgrade thier data will automatically be stored in stable memory and retrieved after upgrade without the need to use system preupgrade and postupgrade functions?.
Because the stable variable data ordinarily resides in the heap, it can’t use more that 4GB anyway and should (in most cases, but, unfortunately, not all) deserialize back into less that 4GB. The cases where this can still go wrong is that the stable storage format does not preserve all sharing, just sharing of mutable data. So if you have large immutable graph in stable variables, with lots of internal node sharing, it can get serialized into a larger tree and then, on deserialization, might not actually fit in 4GB anymore.
Canisters can also access stable memory directly using the very low level ExperimentalStableMemory library. This gives you almost direct access to stable memory but requires manual memory management of a global, shared resource. This can be used in conjunction with stable variables.
We are in the process of extending and replacing this functionality to support carving stable memory into isolated, independently growable regions that different libraries can use without risk of accidental interference with other libraries.
At the moment, when you declare variables stable it means their data is stored in the 4GB main heap until upgrade, when it is serialized to stable memory for deserialization after the upgrade. We’d like to move to a world where the serialization step is not necessary, by storing stable variables in stable memory throughout, not just during upgrades, but it’s early days. For large amounts of stable data, the serialization step limits scalability.
The current design does let one use a hybrid approach, using a moderate amount of stable variable data to store metadata about larger amounts of raw data stored in (Experimental) stable memory. So one could image implementing, say, a simple file system, using stable variables to store file descriptors and block allocation tables, and stable memory to store the actual blocks. I haven’t seen anyone use this approach in anger though.