Garbage collection when using the Motoko Playground

Was just playing around with a Motoko playground example and before deploying it to the IC I received two options for the garbage collector (GC) strategy, copying or marking GC.

I’m assuming copy refers to the mark-copy GC and mark refers to the mark-sweep GC.

I don’t have much experience with garbage collection, so looked into a few resources that demonstrate these differences from a quick Google Search → they seem to favor the mark-copy algorithm, but also point a few less likely cases where mark-sweep outperforms.

A few questions on the the GC for Motoko, and some others which have been provoked by staring at this UI :slight_smile:

  1. How should Motoko developers think about garbage collection on the IC and how can we use it to our benefit in terms of message throughput performance, memory allocation, and cycle management with respect to storage costs?

  2. How does the Motoko playground force the canister that gets deployed to the IC to use garbage collection? How does “forcing” garbage collection affect the performance with respect to the metrics mentioned in question 1?

  3. On a separate note, what does the “Enable profiling” checkbox do?

@claudio @chenyan

1 Like
  1. How should Motoko developers think about garbage collection on the IC and how can we use it to our benefit in terms of message throughput performance, memory allocation, and cycle management with respect to storage costs?

Roughly, the cost of copying GC depends on the size of the actively used memory, and the cost of marking GC depends on the size of the freed memory. So if the canister stores a lot of data, marking GC is preferred. Ideally, we should have a scheduler to decide which GC strategy to use based on the current memory usage, instead of fixing the strategy in the compile time.

  1. How does the Motoko playground force the canister that gets deployed to the IC to use garbage collection? How does “forcing” garbage collection affect the performance with respect to the metrics mentioned in question 1?

By default, the runtime won’t invoke the GC if the used memory is small. For most of the small changes, we won’t trigger the GC. “Forcing” garbage collection ensures we run GC for each message, so that we can measure the impact of the GC for small changes.

  1. On a separate note, what does the “Enable profiling” checkbox do?

Profiling instruments the canister code to produce a flamegraph for update methods. If you enable profiling, and call an update method in the Candid UI, you will see the flamegraph for this method. Before we have system level support, this only works with small cycle changes.

1 Like

How do you force garbage collection on each message when compiling?

There is a --force-gc flag for the moc compiler.

1 Like

Is this exposed through dfx, or would we have to switch to the moc compiler?

I don’t think we exposed this in dfx.json. You will need to call moc directly.

But one hack you can do is to use the packtool field, e.g.,

  "defaults": {
    "build": {
      "packtool": "--force-gc"
    }
  },
1 Like

Is there any special setup I need when doing this? I’ve done it and when trying to build a Motoko canister I’m getting:

Building canisters...
Error: Failed while trying to build all canisters.
Caused by: Failed while trying to build all canisters.
  The build step failed for canister 'rrkah-fqaaa-aaaaa-aaaaq-cai' (motoko) with an embedded error: Failed to build Motoko canister 'motoko'.: Failed to load package arguments.: Failed to invoke the package tool "--force-gc"
 the error was: No such file or directory (os error 2)

Where is the documentation for packtool?

This seems to work:

  "defaults": {
    "build": {
      "packtool": "",
      "args": "--force-gc"
    }
  },
1 Like

This does compile, is there a way to confirm that the gc is being run after every message?

Not easy. You could compare the cycle cost with and without the flag. OTOH, you can know if the args fields is applied correctly to moc by changing the flag to some random string. Then dfx build will get an error and output the exact command when it calls moc.

1 Like

Oh cool I was able to confirm that a bad argument caused an error. Also, is there any practical reason someone would want to force gc in a production canister? Or is it just for profiling? Basically, could there be performance gains from running the gc on every call?

Normally not. But if you want predictable performance, you may want to force gc, so that message processing time is proportional to the actual computation cost. Without forcing GC, you may observe a small request suddenly takes a lot of cycles, because the GC is triggered and have to do all the work left from previous messages.

1 Like

Yes makes sense, thank you!