Is there any documentation on how much memory the basic types in a Motoko canister occupy?
For example, a Nat8 array of size n, a Principal, an enum, a HashMap, and so on.
In the world of canisters and cycles… every byte counts.
No, I don’t think such a document exists. The ASCII art and comments in motoko/compile.ml (master branch of dfinity/motoko on GitHub) are probably the best place right now, but they are of course written with the compiler developer in mind. There are also a bunch of optimizations (e.g. small numbers are not stored as pointed-to objects, but “instead of” the pointer) that make such predictions harder.
To answer your concrete questions:
[Nat8] with n elements will take 8+4×n bytes.
… () stored inside), then the () is “free”.
Thanks for your great answer.
How about Nat32, Nat64, Nat, Int, Text, List, Option (?T), and Char?
Nat32, Char: 8 bytes
Int64: 12 bytes
Nat and Int: “free” if smaller than 2^30, else at least 20 bytes, up to arbitrary sizes
Text and Blob: up to 8+n+3 bytes.
opt: “free” unless you deal with the value ?…?null
List is not a basic type. Probably 12*n
Free means that it’s stored inside the containing data structure without extra allocation.
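To make these figures easier to compare, here is a minimal Python sketch that only restates the numbers quoted above for 32-bit Motoko. The function names are made up for illustration, and the results are rough estimates, not official or measured values.

```python
# Rough heap-size estimates (in bytes) for 32-bit Motoko, using only the
# figures quoted in this thread. Function names are hypothetical.

def nat8_array_bytes(n: int) -> int:
    """[Nat8] with n elements: 8 + 4*n bytes (one 4-byte word per element)."""
    return 8 + 4 * n

def blob_max_bytes(n: int) -> int:
    """Text or Blob holding n bytes: an upper bound of 8 + n + 3 bytes."""
    return 8 + n + 3

def nat_min_bytes(value: int) -> int:
    """Nat: 'free' (no extra allocation) below 2^30, else at least 20 bytes."""
    return 0 if value < 2**30 else 20

if __name__ == "__main__":
    for n in (32, 1024):
        print(f"n={n}: [Nat8] ~{nat8_array_bytes(n)} B, Blob <= {blob_max_bytes(n)} B")
```

Under these numbers, a Blob takes roughly a quarter of the space of a [Nat8] with the same content once n is large.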
Does this also apply to Rust canisters?
No, he’s talking about Motoko only.
Just curious: why does a Nat32 occupy 8 bytes?
Every heap-allocated object has a 1-word (4-byte) header, followed by the payload, which, in this case, is 4 bytes, for a total of 8 bytes.
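The header-plus-payload arithmetic can be written out as a small Python sketch. It assumes the 32-bit layout described in this thread (a one-word header of 4 bytes); `boxed_bytes` is a hypothetical helper name, not part of any Motoko tooling.

```python
WORD_BYTES = 4  # word size assumed for 32-bit Motoko

def boxed_bytes(payload_bytes: int) -> int:
    """Size of a heap-allocated value: a one-word header plus the payload."""
    return WORD_BYTES + payload_bytes

if __name__ == "__main__":
    print("Nat32:", boxed_bytes(4), "bytes")  # 4-byte header + 4-byte payload
    print("Int64:", boxed_bytes(8), "bytes")  # 4-byte header + 8-byte payload
```

This matches the 8 bytes quoted for Nat32 and the 12 bytes quoted for Int64 earlier in the thread.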
Is this specific to Motoko-compiled modules? I mean, if I build a module using Rust, what size would a byte array be?
@luc-blaeser, @ggreif, @claudio
Can we get a 2025 breakdown of this? Is it different with 64 bit?
It is different with 64 bit in the sense that the word size is now 64-bit, not 32-bit, so most values double in size. However, more values can be represented unboxed, without a heap indirection, which also saves space for smaller values.
In all GCs, small values less than the word size are encoded as unboxed words. Larger values that cannot fit in a word are heap allocated and represented by a tagged pointer (another word) to the heap location. The location on the heap contains a word-sized tag to identify the type of the object (record, variant, text, blob, boxed (signed or unsigned) full word, mutable array, immutable array, tuple). For an incremental GC, like the 32-bit incremental GC or the eop GC, each heap-allocated block contains an additional forwarding pointer, used by the GC to move objects between memory partitions.
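As a rough illustration of the per-object overhead this description implies, the Python sketch below counts header words only. It is a simplified model built from the paragraph above (one tag word, plus one forwarding-pointer word under an incremental GC), not a measurement of the actual runtime; `header_bytes` is a hypothetical name.

```python
def header_bytes(word_bytes: int, incremental_gc: bool) -> int:
    """Header overhead of one heap object under the simplified model:
    a tag word, plus a forwarding pointer word for an incremental GC."""
    words = 2 if incremental_gc else 1
    return words * word_bytes

if __name__ == "__main__":
    for word_bytes in (4, 8):  # 32-bit and 64-bit word sizes
        for incremental in (False, True):
            print(word_bytes * 8, "bit, incremental =", incremental,
                  "->", header_bytes(word_bytes, incremental), "bytes")
```

The model ignores alignment, padding, and payload encoding, so treat the numbers as a lower bound on what a real object occupies.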
I’ll add it to our todo list to document this more clearly, though it is an implementation detail that changes (e.g. with the choice of GC).
With eop, the tags are actually more informative than with the 32-bit GC, and even unboxed values contain tags to disambiguate their types (for other reasons to do with last-resort upgrades via serialization). This info is not user accessible but is used by the runtime system (and could be used fruitfully by a debugger).