How is canister storage implemented?

Hello ICP folks,
Since the launch of ICP, I have always been curious about how persistent canister memory is implemented. Most blockchains use a key-value DB (LevelDB, RocksDB) to serialize and deserialize tree nodes, so you can't use straightforward data structures in smart contracts. ICP, however, supports almost all data structures natively in memory. So my biggest question is how a canister's persistent storage is implemented, since I don't see any persistent RAM in the hardware configuration. Does ICP use memory mapping to mirror memory on SSD? If so, what data structures does it use (MPT?)


It would be a great help if someone could post a link to the state implementation source code.


I wrote a blog post about orthogonal persistence, the feature responsible for canister state: [Blog post] IC internals: orthogonal persistence
There are some code pointers at the end of the article if you’re brave :slight_smile:


Thank you, Roman! That's super helpful!


Hi Roman, one more question: how do dirty memory pages affect the state_root_hash and state_change_hash calculations? Could you elaborate on that a little? Basically, how is the state_root_hash derived from the canisters' memory pages?


Conceptually, the system periodically hashes the entire state of a subnet, including the memory of all canisters, by constructing an on-disk representation of the state, slicing the files into chunks, and building a shallow Merkle tree out of this structure (state as an artifact). I guess the root hash of that tree is what you mean by state_root_hash.
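To make the "slice files into chunks, then build a shallow tree" idea concrete, here is a minimal Python sketch. It is only an illustration of the concept, not the actual IC code or manifest format: the file names, the tiny chunk size, the use of SHA-256, and the way paths are mixed into the root are all assumptions made for this example.

```python
import hashlib

# Tiny chunk size so the example is easy to follow.
# This is an assumption for illustration, not the value the IC uses.
CHUNK_SIZE = 4  # bytes


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Slice a file's bytes into fixed-size chunks and hash each chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def file_hash(data: bytes) -> bytes:
    """Second level: hash over the concatenation of the file's chunk hashes."""
    return hashlib.sha256(b"".join(chunk_hashes(data))).digest()


def root_hash(files: dict[str, bytes]) -> bytes:
    """Top level: hash over (path, file hash) pairs in sorted path order."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode())
        h.update(file_hash(files[path]))
    return h.digest()


# Hypothetical on-disk state: one memory file per canister.
state = {
    "canister_a/memory.bin": b"hello world!",
    "canister_b/memory.bin": b"\x00" * 10,
}
root_before = root_hash(state)

# Changing a single byte in one file changes the root hash.
state["canister_a/memory.bin"] = b"hello World!"
root_after = root_hash(state)
```

The tree is "shallow" in the sense that it has only three levels (chunks, files, root) no matter how large the state is, unlike a binary Merkle tree whose depth grows with the number of leaves.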

Of course, re-hashing the whole state is very expensive, so the system uses the information about dirty pages to find chunks that need re-hashing.
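The dirty-page optimization can be sketched roughly like this in Python. Again, this is a simplified illustration under stated assumptions rather than the real implementation: the page size, the number of pages per chunk, SHA-256, and all function names are made up for the example, and the memory is assumed not to grow between two hashing rounds.

```python
import hashlib

PAGE_SIZE = 4096       # assumed page size in bytes
PAGES_PER_CHUNK = 4    # assumed; a chunk covers 16 KiB in this sketch
CHUNK_SIZE = PAGE_SIZE * PAGES_PER_CHUNK


def hash_chunk(memory: bytes, chunk_index: int) -> bytes:
    """Hash the bytes that belong to a single chunk."""
    start = chunk_index * CHUNK_SIZE
    return hashlib.sha256(memory[start:start + CHUNK_SIZE]).digest()


def hash_all_chunks(memory: bytes) -> list[bytes]:
    """Expensive path: hash every chunk of the memory."""
    count = (len(memory) + CHUNK_SIZE - 1) // CHUNK_SIZE
    return [hash_chunk(memory, i) for i in range(count)]


def dirty_chunks(dirty_pages: set[int]) -> set[int]:
    """Map dirty page indices to the indices of the chunks containing them."""
    return {page // PAGES_PER_CHUNK for page in dirty_pages}


def rehash_incrementally(
    memory: bytes, old_hashes: list[bytes], dirty_pages: set[int]
) -> list[bytes]:
    """Cheap path: reuse old chunk hashes, recompute only the dirty chunks."""
    new_hashes = list(old_hashes)
    for index in dirty_chunks(dirty_pages):
        new_hashes[index] = hash_chunk(memory, index)
    return new_hashes


# Eight chunks of zeroed memory, hashed in full once.
memory = bytearray(8 * CHUNK_SIZE)
hashes = hash_all_chunks(bytes(memory))

# The canister writes to pages 5 and 6, which both fall into chunk 1.
memory[5 * PAGE_SIZE] = 0xFF
memory[6 * PAGE_SIZE + 10] = 0x01
dirty = {5, 6}

# Only chunk 1 is re-hashed; the other seven hashes are reused.
updated = rehash_incrementally(bytes(memory), hashes, dirty)
```

In this sketch the incremental result matches a full re-hash, but only one of the eight chunks had to be read and hashed again.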

I’m not sure what you mean by state_change_hash though.
