Canister Lifecycle Hooks

The motion proposal has been accepted. The feature is in the roadmap. We will start working on it after Proposal: Configurable Wasm Heap Limit.

I think a hook that is triggered upon unfreezing a canister would be really helpful as well.

Imagine a canister that upon deployment starts calling itself indefinitely. At some point the cycle balance falls below the freezing threshold and the self call loop will be interrupted. The canister is topped up again and unfrozen, but now the self call loop has to be triggered manually. Ideally I can just specify a lifecycle hook for the event that a canister is unfrozen :slight_smile:

Do you have an example of this behavior? I believe it was fixed a while back, but if it hasn't, let's investigate further…

The last time I used this behaviour to stop a canister that was stuck in a self-call loop so I could upgrade it was in July last year. Maybe things have changed since then?
To clarify, I'm not suggesting we change the behaviour; I'd just be happy to have a lifecycle hook for a canister that becomes unfrozen.

I could reproduce the behaviour locally with this Motoko canister:

actor {
  var counter = 0;

  public query func getCounter() : async Nat {
    counter;
  };

  public func incrementCounter() : async () {
    counter += 1;
  };

  public func selfCallLoop() : async () {

    while true {
      await incrementCounter();
    };
  };
};
  1. deploy the canister
  2. call selfCallLoop
  3. call getCounter repeatedly to verify messages are being executed continuously
  4. increase the freezing threshold such that the canister is frozen
  5. decrease the freezing threshold such that the canister is unfrozen
  6. repeatedly call getCounter to verify the counter is no longer being incremented, i.e. the execution of selfCallLoop was halted by freezing the canister

Ideally I would be able to specify a lifecycle hook that is triggered as soon as the canister is unfrozen so I can make another call to selfCallLoop and continue my work of incrementing the counter.

I'm not a Motoko expert, but looking at the example from the IC specification perspective, it seems to be missing error handling.

Following the Motoko documentation on Asynchronous errors, here is how error handling in this loop could look:

import Error "mo:base/Error";

while true {
    try {
        await incrementCounter();
    } catch (e) {
        switch (Error.code(e)) {
            case (#system_transient) {
                // Here goes the lifecycle logic.
            };
            case _ {
                // Other error codes would need their own handling.
            };
        };
    };
};

There might be other await errors beyond what the "unfrozen" hook would cover. Here's a scenario illustrating this:

  1. The await incrementCounter(); traps because the canister queue is full.
  2. The "unfrozen" lifecycle hook will never be called because the canister has plenty of cycles.
  3. The selfCallLoop must still be manually triggered…

That's true, it misses error handling. I was just trying to give a quick example to better illustrate the use case for this hook :innocent:
But even with the error handling, once the canister has broken out of the loop I still have to trigger it manually. In Motoko the code above won't fill up the canister queue, as every self call is a separate message.

once the canister has broken out of the loop I still have to trigger it manually

Maybe setting a timer in the error handler could be used to "try again later"?

the code above won't fill up the canister queue, as every self call is a separate message

I agree, but in general there are 9 system transient errors, and these should all be handled the same way: by retrying them later…
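The retry-later policy can be sketched outside the IC entirely. The snippet below is an illustrative plain-Rust sketch, not canister code: the `RejectCode` enum, `should_retry`, and `backoff_secs` are hypothetical stand-ins (in a real canister you would match on the reject code returned by the failed call and re-schedule the work via the timers library).

```rust
// Sketch of a "retry transient errors later" policy. The enum below loosely
// mirrors the IC reject codes for illustration only; it is NOT an ic_cdk type.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum RejectCode {
    SysFatal,           // unrecoverable system error
    SysTransient,       // transient system error, e.g. full queues
    DestinationInvalid, // callee does not exist
    CanisterReject,     // callee rejected explicitly
    CanisterError,      // callee trapped
}

/// Only transient system errors are worth retrying automatically.
fn should_retry(code: RejectCode) -> bool {
    matches!(code, RejectCode::SysTransient)
}

/// Exponential backoff: the delay doubles per attempt, capped at `max_secs`.
fn backoff_secs(attempt: u32, base_secs: u64, max_secs: u64) -> u64 {
    base_secs.saturating_mul(1u64 << attempt.min(16)).min(max_secs)
}

fn main() {
    assert!(should_retry(RejectCode::SysTransient));
    assert!(!should_retry(RejectCode::CanisterReject));
    assert_eq!(backoff_secs(0, 1, 60), 1);  // first retry after 1 s
    assert_eq!(backoff_secs(3, 1, 60), 8);  // 1 << 3
    assert_eq!(backoff_secs(10, 1, 60), 60); // capped
    println!("ok");
}
```

On the IC, the `backoff_secs` result would feed into a one-shot timer that re-invokes the loop, which is exactly the "try again later" idea above.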

What is a "transient" error?

Maybe setting a timer in the error handler could be used to "try again later"?

I don't think an error is thrown for a frozen canister :thinking:
This code does not show anything when calling dfx canister logs after repeating the steps mentioned in the previous message. Also, timers stop executing and don't resume if a canister is below the freezing threshold.

import Error "mo:base/Error";
import Debug "mo:base/Debug";
import Timer "mo:base/Timer";

actor {
  var counter = 0;

  func printCounter() : async () {
    Debug.print(debug_show (counter));
  };

  ignore Timer.recurringTimer<system>(#seconds 2, printCounter);

  public query func getCounter() : async Nat {
    counter;
  };

  public func incrementCounter() : async () {
    counter += 1;
  };

  public func selfCallLoop() : async () {

    while true {
      try {
        await incrementCounter();
      } catch (e) {
        Debug.print("this was caught: " # debug_show (Error.message(e)));
      };
    };
  };
};

So without the hook, how would that work?

What is a "transient" error?

Here is the spec definition, and here is how it's defined in the code.

For the rest, I'm not an expert in Motoko; maybe @ggreif or @claudio could have a look?

A similar Rust example would look like:

thread_local! {
    static COUNTER: std::cell::RefCell<u64> = 0.into();
}

#[ic_cdk::query]
fn get_counter() -> u64 {
    COUNTER.with_borrow(|c| c.clone())
}

#[ic_cdk::update]
fn inc_counter() {
    COUNTER.with_borrow_mut(|c| *c += 1)
}

#[ic_cdk::update]
fn self_call_loop() {
    ic_cdk_timers::set_timer_interval(std::time::Duration::from_secs(1), inc_counter);
}

And it survives the freezing/unfreezing just fine without going into the reject code details, as those are handled by the timers library.

Very cool, I didn't know Rust and Motoko behaved that differently regarding timers. This is what I wanted.

use ic_cdk::{api::call, println};

thread_local! {
    static COUNTER: std::cell::RefCell<u64> = 0.into();
    static IS_CALLING_ITSELF: std::cell::RefCell<bool> = false.into();
}

#[ic_cdk::query]
fn get_counter() -> u64 {
    COUNTER.with_borrow(|c| *c)
}

fn print_counter() {
    println!("Counter: {}", get_counter());
}

#[ic_cdk::init]
fn init() {
    ic_cdk_timers::set_timer_interval(std::time::Duration::from_secs(2), print_counter);
    ic_cdk_timers::set_timer_interval(std::time::Duration::from_secs(5), || {
        ic_cdk::spawn(async { self_call_loop().await })
    });
}

#[ic_cdk::update]
fn inc_counter() {
    COUNTER.with_borrow_mut(|c| *c += 1)
}

#[ic_cdk::update]
async fn self_call_loop() {
    if !IS_CALLING_ITSELF.with_borrow(|c| *c) {
        println!("Calling itself");
        IS_CALLING_ITSELF.with_borrow_mut(|c| *c = true);
        loop {
            match call::call::<(), ()>(ic_cdk::id(), "inc_counter", ()).await {
                Ok(_) => {}
                Err(e) => {
                    println!("This was caught: {:?}", e);
                    IS_CALLING_ITSELF.with_borrow_mut(|c| *c = false);
                    break;
                }
            }
        }
    } else {
        println!("Already calling itself");
    }
}

I assume the break statement in the loop is important, because otherwise the canister will still try to call inc_counter indefinitely, depleting its cycles?

Also, does a failed canister_global_timer execution still cost cycles?

I found this in the interface specification; how does the Rust timers implementation work around that? :thinking:

The global timer is also deactivated upon changes to the canister's Wasm module (calling install_code, install_chunked_code, uninstall_code methods of the management canister or if the canister runs out of cycles).

I assume the break statement in the loop is important, because otherwise the canister will still try to call inc_counter indefinitely, depleting its cycles?

I have a feeling that it's an XY problem. What do you think about starting a new topic describing the problem we're trying to solve?

Also, does a failed canister_global_timer execution still cost cycles?

Any message execution costs cycles.

how does the Rust timers implementation work around that

There is no workaround in the Rust timers library, and the timer will be deactivated in the listed cases. Some workarounds could be implemented at the app level, though.
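One common app-level pattern is to re-register timers from the init and post-upgrade hooks, since the global timer does not survive code installation. The toy model below sketches that flow in plain Rust without ic_cdk; the `Canister` struct and its methods are purely illustrative, not IC APIs.

```rust
// Toy model: the IC clears the global timer on code (re)installation, so the
// app re-arms its timers from the post-upgrade hook. Illustrative names only.
#[derive(Default)]
struct Canister {
    timer_armed: bool,
}

impl Canister {
    /// Stands in for setting up recurring timers (init-time logic).
    fn arm_timers(&mut self) {
        self.timer_armed = true;
    }

    /// Models install_code / upgrade: the system deactivates the global
    /// timer, then the app's post-upgrade hook runs and re-arms it.
    fn upgrade(&mut self) {
        self.timer_armed = false; // spec: timer deactivated on install
        self.post_upgrade();      // app-level workaround lives here
    }

    fn post_upgrade(&mut self) {
        self.arm_timers();
    }
}

fn main() {
    let mut c = Canister::default();
    c.arm_timers();
    c.upgrade();
    assert!(c.timer_armed, "timers re-armed after upgrade");
    println!("ok");
}
```

In a real canister this corresponds to calling the same timer-setup function from both the init and post-upgrade entry points. Running out of cycles is harder: no hook runs while frozen, which is what motivates the "unfrozen" lifecycle hook discussed above.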

Clarifying question about the current implementation of Canister Lifecycle Hooks.

Are they one-time events, or can they be fired multiple times?

Take this example of the heap memory lifecycle hook:

  1. My canister crosses the heap memory threshold → hook is fired
  2. Garbage collection runs (passive), freeing up heap memory → canister is back under the heap memory threshold
  3. Canister writes more data and crosses the heap threshold again → is the hook fired again?

Similarly, this same question applies to the low cycles balance lifecycle hook.

Say I top up a canister with a micro cycles payment every time it falls below the cycles threshold. Is there a maximum # of times that a lifecycle hook can be fired within a given time period (checkpoint)?

In the example you provide, yes, the hooks will be fired multiple times as the condition was hit multiple times. However, if the condition is hit and nothing changes (i.e. your canister remains at high memory usage or low in cycles), the hook will not trigger again.
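In other words, the hook is edge-triggered: it fires when the condition starts to hold, not continuously while it holds. The small simulation below illustrates that semantics in plain Rust; it uses no IC APIs, and `HookState`/`observe` are made-up names for this sketch.

```rust
// Toy model of an edge-triggered lifecycle hook: it fires when heap usage
// crosses the threshold from below, and again only after usage has first
// dropped back under the threshold.
struct HookState {
    threshold: u64,
    above: bool, // currently at/above the threshold?
    fired: u32,  // how many times the hook has run
}

impl HookState {
    fn new(threshold: u64) -> Self {
        Self { threshold, above: false, fired: 0 }
    }

    fn observe(&mut self, heap_usage: u64) {
        let now_above = heap_usage >= self.threshold;
        if now_above && !self.above {
            self.fired += 1; // hook triggers only on the upward crossing
        }
        self.above = now_above;
    }
}

fn main() {
    let mut h = HookState::new(100);
    h.observe(120); // crosses threshold -> fires
    h.observe(130); // still above -> no new firing
    h.observe(80);  // GC freed memory, back under threshold
    h.observe(110); // crosses again -> fires again
    assert_eq!(h.fired, 2);
    println!("ok");
}
```

The same logic would apply to the low-cycles hook: topping the canister up above the threshold re-arms the hook for the next time the balance falls below it.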

Wanted to circle back here to see if the upcoming canister_on_error lifecycle hook will expose an API for viewing the canister backtrace associated with a trap.

We can consider exposing this. The canister_on_error hook is not under development at the moment anyway, so we will most likely have canister backtraces before we start working on canister_on_error (and thus before we need to decide what canister_on_error has access to).
