Motoko Profiling questions

Good day. I want to profile stable data structure libraries in Motoko for my research, and I would like to make sure the measurements are valid by asking you to clarify a number of nuances.
Sorry, this will be quite a mouthful :grin:.

Measured parameters:

1) Stable memory usage for different numbers of elements: I want to use ExperimentalStableMemory.stableVarQuery, as Claudio suggested, to measure the stable memory used:

a) When I run this on an empty canister (without any variables or functions), it always shows a value of 9 bytes. Should I just subtract 9 from all subsequent measurements?

b) I tried using it to measure simple stable variables. When I declare an array, ‘stable var t: [Nat64] = [1, 2, 3, 4];’, and then call stableVarQuery, I get 40 after subtracting 9, regardless of the size of the array (I tried adding many more elements and initializing an empty array). Should it not take 8 bytes per element? Are the measurements of stableVarQuery always accurate? Source code

2) Heap memory usage for different numbers of elements: I want to use the rts values, as alexeychirkov did, to measure the heap memory used. I pass ‘--force-gc’ in the compiler args, as mentioned here, to make the GC run at the end of each message and ensure correct measurements:

a) I tried using it to measure simple variables, such as an array: ‘var t: [Nat64] = [1, 2, 3, 4];’, but no matter the size of the array, it always gives me ‘rts_heap_size = 44’, the same as in an empty canister. Should it not be 8 bytes per element? Source code

b) Is the cycle cost for storing memory on the canister calculated as ‘({Stable memory used} + {rts_heap_size}) * {cycle cost per byte}’, or is it ‘{rts_memory_size} * {cycle cost per byte}’ for all allocated memory, as mentioned by rossberg? I just want to clarify, because that answer was from 2022.

3) Cycle cost for calling update functions: I want to use ExperimentalInternetComputer.countInstructions:

a) It says that it does not account for any deferred garbage collection costs incurred by the function that is measured. If the operations are performed on a stable variable, how can I measure the GC costs? I want to use the incremental GC. Maybe we should use {rts_reclaimed}, but what do we multiply it by to obtain the cycle cost?

a) When I run this on an empty canister (without any variables or functions), it always shows a value of 9 bytes. Should I just subtract 9 from all subsequent measurements?

9 bytes should not really matter, so I wouldn’t bother subtracting.

b) I tried using it for measuring simple stable variables,

In the source code that you link, the variable is not declared stable (a typo?) so its contents won’t be considered for the size computation.
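To illustrate the point about declaring the variable stable, here is a minimal sketch of how stableVarQuery could be wired up. The method name stableVarSize is hypothetical, and the import path assumes the base library's ExperimentalStableMemory module; treat it as a starting point rather than a definitive measurement harness.

```motoko
import StableMemory "mo:base/ExperimentalStableMemory";

actor {
  // Declared `stable`, so its contents are included in the size computation.
  stable var t : [Nat64] = [1, 2, 3, 4];

  // Hypothetical helper: returns the number of bytes the current stable
  // variables would occupy if they were persisted for an upgrade right now.
  public func stableVarSize() : async Nat64 {
    let sizeQuery = StableMemory.stableVarQuery();
    (await sizeQuery()).size
  };
};
```

Calling stableVarSize with arrays of different lengths should then show whether the reported size tracks the element count.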

a) I tried using it to measure simple variables, like array: ‘var t: [Nat64] = [1, 2, 3, 4];’…

I think what is happening here is that the constant value [1, 2, 3, 4] is statically allocated (i.e. preallocated) by the compiler, so it won’t contribute to the dynamic heap size. Try using a dynamically created array, something like Array.tabulate<Nat64>(1000, func(i : Nat) : Nat64 { Nat64.fromNat(i) }), and see if that alters the rts_heap_size reading.
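A rough sketch of that experiment follows. The method names fill and heapSize are hypothetical, and rts_heap_size is read through the internal prim module, which is not a stable public API. The reading is taken in a separate message so that, with ‘--force-gc’, a collection has run after the allocation.

```motoko
import Array "mo:base/Array";
import Nat64 "mo:base/Nat64";
import Prim "mo:⛔";

actor {
  var t : [Nat64] = [];

  // Hypothetical helper: allocate `n` elements at run time, so the array
  // lives on the dynamic heap instead of being preallocated by the compiler.
  public func fill(n : Nat) : async () {
    t := Array.tabulate<Nat64>(n, func(i : Nat) : Nat64 { Nat64.fromNat(i) });
  };

  // Hypothetical helper: read the heap size in a later message.
  public func heapSize() : async Nat {
    Prim.rts_heap_size()
  };
};
```

Comparing heapSize after fill(1000) and fill(2000) should show whether the reading grows with the number of elements.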

b) Is the cycle cost for storing memory…

The cost should depend on both the actual wasm memory size and the stable memory size, so it is some function of rts_stable_memory_size (in 64KiB pages) and rts_memory_size (in bytes). What the precise function is I’m not sure, but perhaps one can figure it out from here: Paying for resources in cycles | Internet Computer
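As a sketch of the two inputs mentioned above, the following reads both values and converts the stable memory page count to bytes. The method name memoryFootprint and the record field names are hypothetical, and the code assumes both prim functions return Nat. It deliberately stops short of computing a cycle cost, since the precise pricing function is not established here.

```motoko
import Prim "mo:⛔";

actor {
  // Size of one memory page in bytes (64 KiB).
  let pageSize : Nat = 65536;

  // Hypothetical helper: report both memory sizes in bytes.
  public func memoryFootprint() : async { wasmBytes : Nat; stableBytes : Nat } {
    {
      wasmBytes = Prim.rts_memory_size();
      stableBytes = Prim.rts_stable_memory_size() * pageSize;
    }
  };
};
```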

a) It says that it does not account for any deferred garbage collection costs incurred by the function that is measured…

countInstructions(f) can only measure instructions used during the execution of the functional argument f(), but GC (which happens after a message) and other things like message argument deserialization and result serialization happen outside of that, so can’t be measured with any f().
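For reference, a minimal sketch of wrapping a piece of work in countInstructions. The method name measureFill is hypothetical. The returned count covers only the body of the closure, not argument decoding, result encoding, or any GC that runs after the message.

```motoko
import Array "mo:base/Array";
import IC "mo:base/ExperimentalInternetComputer";
import Nat64 "mo:base/Nat64";

actor {
  stable var t : [Nat64] = [];

  // Hypothetical helper: count the instructions spent building the array.
  public func measureFill(n : Nat) : async Nat64 {
    IC.countInstructions(func() {
      t := Array.tabulate<Nat64>(n, func(i : Nat) : Nat64 { Nat64.fromNat(i) });
    })
  };
};
```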

As a last resort, you can, however, use rts_mutator_instructions and rts_collector_instructions to retrieve the number of instructions used by the last completed message (not the current message). The former should include the cost of argument/result deserialization and serialization, and any user computation, while the latter should include the GC work, of the last completed message of the actor. Probably a bit tricky to use though.

Thank you greatly for your answer, Claudio!

I think I will be using rts_mutator_instructions and rts_collector_instructions to measure the instructions used by functions. I’ve been experimenting with them a bit by calling a query function after the update execution has ended, with the ‘--force-gc’ flag, and it seems that rts_mutator_instructions counts instructions for the last message, while rts_collector_instructions counts instructions for the whole time the canister has been running, as it only increases.

Looking forward to using this for my measurements!

Queries never commit state changes, omit GC (because there is no point), and can’t update the state needed to track mutator/collector instructions, so don’t use a query for those.

I think you can only use these two to measure the cost of an update method that does not itself make any further calls. Maybe try that first.
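A sketch of that setup, assuming the counters are read from a second update call made immediately after the method under test. The names work and lastMessageCost are hypothetical, and exactly what each counter covers is best confirmed empirically.

```motoko
import Prim "mo:⛔";

actor {
  stable var counter : Nat = 0;

  // The update method under test; it makes no further calls.
  public func work(n : Nat) : async () {
    var i = 0;
    while (i < n) { counter += i; i += 1 };
  };

  // Hypothetical helper: call as an update (not a query) right after `work`
  // to read the counters recorded for the last completed message.
  public func lastMessageCost() : async { mutator : Nat; collector : Nat } {
    {
      mutator = Prim.rts_mutator_instructions();
      collector = Prim.rts_collector_instructions();
    }
  };
};
```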

Thanks, will keep that in mind. :slightly_smiling_face: