Sleeping in update calls

Introduction

This post outlines a solution for sleeping in update calls, i.e., allowing an update call to await a condition (to become satisfied) without spinning. A common use case looks as follows:

  • a periodic timer (e.g., every 10s) makes a (canister http) call to fetch some data in the background;
  • an update call must await the data fetched by the periodic timer before proceeding;

or:

  • an update call contains a critical section guarded by a lock and must await the lock to be acquired before proceeding.

Currently, awaiting a condition can be achieved by making inter-canister calls in a (tight) loop until the condition is satisfied, which results in excessive compute and cycles consumption. Hence, we propose a solution that avoids this “busy” waiting approach by following a “one-time” observer pattern implemented by the management canister.

We appreciate your feedback on the proposed API to make sure it can cover as many use cases as possible.

New management canister endpoints

Management canister endpoint: observe

A new management canister endpoint observe is introduced to await a condition.

  • This endpoint can only be called by controllers of the given canister or the canister itself.
  • This endpoint can only be called by bounded-wait calls.
  • The call returns upon a matching call to the notify endpoint (see below).
  • Any cycles attached explicitly to this call are refunded.

Input

record {
  canister_id : principal;
  event_id : nat64;
};

Output

()
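
As a rough illustration, a canister could wrap this endpoint as sketched below. This is only a sketch: the endpoint does not exist yet, and the exact call-builder API used here (ic-cdk 0.18’s Call::bounded_wait) is an assumption; only the argument record follows the Candid above.

use candid::{CandidType, Principal};
use ic_cdk::call::Call;

#[derive(CandidType)]
struct ObserveArgs {
    canister_id: Principal,
    event_id: u64,
}

/// Blocks the current call context until a matching call to `notify`
/// (or until the bounded-wait timeout). Must be issued as a bounded-wait call.
async fn observe(canister_id: Principal, event_id: u64) {
    // The reply is the empty tuple, so the response body is ignored here;
    // a timeout surfaces as a reject that the caller may want to inspect.
    let _ = Call::bounded_wait(Principal::management_canister(), "observe")
        .with_arg(ObserveArgs { canister_id, event_id })
        .await;
}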

Management canister endpoint: notify

A new management canister endpoint notify is introduced to notify the observers, i.e., to make the calls to the endpoint observe return.

  • This endpoint can only be called by controllers of the given canister or the canister itself.
  • The call returns immediately. The reply contains the number of observers that are notified, i.e., the number of pending calls to the endpoint observe with a matching canister_id and event_id.
  • Any cycles attached explicitly to this call are refunded.

Input

record {
  canister_id : principal;
  event_id : nat64;
};

Output

record {
  num_observers : nat64;
};
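
Similarly, a sketch of a notify wrapper, reusing the ObserveArgs record from the sketch above and decoding the reply into num_observers (again, the ic-cdk call-builder and response-decoding API is an assumption):

use candid::{CandidType, Deserialize, Principal};
use ic_cdk::call::Call;

#[derive(CandidType, Deserialize)]
struct NotifyResult {
    num_observers: u64,
}

/// Wakes up every pending `observe` call with a matching (canister_id, event_id)
/// and returns the number of observers that were notified.
async fn notify(canister_id: Principal, event_id: u64) -> u64 {
    // error handling omitted for brevity
    let response = Call::bounded_wait(Principal::management_canister(), "notify")
        .with_arg(ObserveArgs { canister_id, event_id })
        .await
        .expect("notify failed");
    let result: NotifyResult = response.candid().expect("unexpected reply");
    result.num_observers
}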

Examples

The new management canister endpoints observe and notify described in the previous section can be used as follows:

  • a periodic timer (e.g., every 10s) makes a (canister http) call to fetch some data in the background and calls notify for the same canister and a fixed event_id every time new data are fetched;
  • an update call must await the data fetched by the periodic timer before proceeding, so the update call calls observe for the same canister and the fixed event_id (a sketch follows below).
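
A minimal sketch of this first example in Rust, assuming the observe/notify wrappers from the sketches above and a hypothetical fetch_data() helper for the canister http call (timer wiring via ic-cdk-timers omitted):

use std::cell::RefCell;
use ic_cdk::api::canister_self;
use ic_cdk::update;

// the fixed event_id used for "fresh data is available"
const DATA_READY: u64 = 0;

thread_local! {
    static LATEST_DATA: RefCell<Option<String>> = RefCell::new(None);
}

// invoked by the periodic timer (e.g., every 10s)
async fn refresh() {
    let data = fetch_data().await; // hypothetical canister http outcall
    LATEST_DATA.with(|d| *d.borrow_mut() = Some(data));
    // wake up all update calls currently blocked in `observe`
    let _ = notify(canister_self(), DATA_READY).await;
}

#[update]
async fn get_data() -> String {
    if LATEST_DATA.with(|d| d.borrow().is_none()) {
        // "sleep" until the timer reports fresh data (or the bounded-wait timeout)
        observe(canister_self(), DATA_READY).await;
    }
    LATEST_DATA.with(|d| d.borrow().clone()).unwrap_or_default()
}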

To implement awaiting a lock, the new endpoints can be used as follows:

  • when a lock is being acquired, then
    • either the lock is acquired immediately or
    • a fresh ticket is generated and stored in the state (of the lock) and the canister calls observe for the same canister and the generated ticket (event_id);
  • when a lock is being released, then
    • if there are pending tickets stored in the state of the lock, then one is chosen (e.g., FIFO), the lock stays acquired, and the canister calls notify for the same canister and the chosen ticket (event_id) – waking up a single observer identified by the ticket that now holds the lock;
    • if there’s no pending ticket stored in the state of the lock, then the lock is released.

This locking algorithm is implemented by the following Rust code (omitting, for the sake of simplicity, error handling of conditions such as running out of cycles or memory):

use candid::Principal;
use ic_cdk::api::canister_self;
use ic_cdk::update;
use std::cell::RefCell;
use std::collections::VecDeque;

async fn observe(_canister_id: Principal, _event_id: u64) {
    todo!() // new management canister endpoint
}

fn notify(_canister_id: Principal, _event_id: u64) {
    todo!() // new management canister endpoint
}

thread_local! {
  static LOCK: RefCell<Lock> = RefCell::new(Lock::default());
  static NEXT_TICKET: RefCell<u64> = RefCell::new(0);
}

#[derive(Default)]
struct Lock {
    acquired: bool,
    tickets: VecDeque<u64>,
}

impl Lock {
    fn acquire(&mut self) -> Result<(), u64> {
        if self.acquired {
            let ticket = NEXT_TICKET.with(|t| {
                let mut next = t.borrow_mut();
                let ticket = *next;
                *next += 1;
                ticket
            });
            self.tickets.push_back(ticket);
            Err(ticket)
        } else {
            self.acquired = true;
            Ok(())
        }
    }

    fn release(&mut self) {
        self.acquired = false;
        if let Some(ticket) = self.tickets.pop_front() {
            self.acquired = true;
            notify(canister_self(), ticket);
        }
    }
}

#[update]
async fn critical_section() {
    if let Err(ticket) = LOCK.with(|l| l.borrow_mut().acquire()) {
        // the lock implementation returns a fresh ticket
        // if the lock can't be acquired atomically
        observe(canister_self(), ticket).await
    }
    // critical section
    // ...
    // end of critical section
    LOCK.with(|l| l.borrow_mut().release());
}

Looks good! Really clean, really simple. I’m sure we’ll have a few use cases for this within OpenChat.


This would be perfect for REE, especially the Bitcoinciaga tokenized runes apparel project from OnChainGear. They’re building a bartering market on REE based on runes distributed through their premium apparel, so this should greatly improve their UX and avoid unnecessary cycles being burnt.

Endorsed by Omnity eco director Sheldon Dearr :saluting_face:


This is a system/management canister call? Is there a particular reason you would like to move this into that layer as opposed to handling this in software?

Is this intended to be a multi-canister and/or multi-subnet setup?

Is there some cost for the time that these are open and waiting?

We’ve discussed sleeping on the motoko side over here a bit: Status of a future in motoko and sleeping

In this case we just specifically wanted to wait “only to the next round” to ensure that - perhaps - some futures settled and we could check them all again. Perhaps a bit of a different use case. In our case, the test would be Time.now() > whenObserveSet. I think we’d have to have a timer running every round to fulfil this, which would be its own cycle drain.

What would be nice is if I could get await systemcanister.sleep(#timestamp(now()+1)) that would (except in strange circumstances) get called the next round. Or another option, await systemcanister.sleep(#rounds(8)), that would sleep exactly 8 rounds (I’m not sure how this would work if your subnet was bob’d up… maybe it is your next chance after 8 rounds). If you added to this await systemcanister.sleep(#observer(hash(myticket))) that returned after your notify proposition, you’d cover most of the scenarios I’ve run across. The problem with JUST observe is that you are reliant on other code to run, which you may specifically be trying to avoid. Of course, the other problem with timestamp and rounds would be more burden on the replica to sift through these things each round (and if it is multi-subnet you’ll need to gossip them).

Inverting the problem, we’ve done extensive work on a standard and a reference implementation for ICRC-72, which seems like it may solve a similar issue, but instead of awaiting in-line (which I admit has its niceties, including preserved context, at the expense of keeping the context open), it asks the developer to adopt the multi-cast pub-sub pattern. Particular care was taken to make sure the pattern works both for simple single-canister pub-sub and cross-subnet super-broadcast (the reference implementation takes care to batch broadcasts across subnets to minimize chattiness and cycle consumption), and it provides an interface for notified canisters to pay their way with cycles upon notification.

In a single-canister instance where you are just waiting for an HTTP request to refresh some data, you would use a single line to subscribe to notifications:

ignore await* icrc72_subscriber().subscribe([{
  namespace = "data_refresh";
  config = [];
  memo = null;
  listener = #Sync(func <system>(notification: ICRC72SubscriberService.EventNotification) : () {
    let #Map(data) = notification.data else return;
    let #Nat(refreshTimeStamp) = data[0].1 else return;
    if (refreshTimeStamp > lastTimeStamp) doSomething();
    return;
  });
}]);

The timer component would register a publication and then publish it:

//set up publication
let result = await* icrc72_publisher().registerPublications([{
      namespace = "data_refresh";
      config = [
        (ICRC72OrchestratorService.CONST.publication.publishers.allowed.list, #Array([#Blob(Principal.toBlob(thisPrincipal))])),
        (ICRC72OrchestratorService.CONST.publication.controllers.list, #Array([#Blob(Principal.toBlob(thisPrincipal))]))
      ];
      memo = null;
    }]);

//set up timer
  private func getData() : async () {
    setLock();
    switch (makeHttpOutcall()) {
      case (#Ok(data)) {
        latestData := data;
        let timestamp = natnow();
        ignore icrc72_publisher().publish<system>({
          namespace = "data_refresh";
          data = #Map([("timestamp", #Nat(timestamp))]);
          headers = [];
        });
      };
      case (_) {};
    };
    releaseLock();
  };

  ignore Timer.recurringTimer<system>(#seconds 600, getData);

The reference implementation we have in Motoko allows one to have the publisher, subscriber, broadcaster, and orchestrator all instantiated on one canister… no need to actually expose the functions externally; it should just work. We’d love a Rust implementation of the standard and it is currently on our roadmap, but we’re open to volunteers.


Yes, it is a management canister call.

Because it cannot be handled in software (canister code) without busy waiting, resulting in excessive compute and cycles consumption.

The API can target other canisters that can be on different subnets.

There’s no additional cost: note that there is a requirement that the (potentially long-running) call must be a bounded-wait call.

Scheduling canisters in rounds is an implementation detail that should not be necessary to expose.


Except that it can. It seems you’ve proposed a system that requires external intervention to ‘notify’, in which case you’re just choosing a pattern to keep the context open instead of having the ‘notify’ call a new function with a new context that could collect the relevant data itself. This requires no busy waiting. You just have to have an external trigger (either another canister calls in or a timer runs). You aren’t actually sleeping… you are waiting. There are lots of software solutions that don’t require constant polling, wasting cycles, or adding a new system-level API. If you are waiting for an HTTP outcall to return, it can just emit an event when it is done or make a non-awaited call to a canister method that only listens to calls from inside the house.

The API can target other canisters that can be on different subnets.

What are the performance implications of this? It seems that you’ve limited this a bit by only allowing the canister itself or its controllers to observe, but if I build an ‘observation tree’ using this API where I have a ledger for token LDG and allow a blackholed controller that allows only known-wasm relay canisters to be added to it, and I chain those 10 at a time so that the defined canisters can all be notified for every transaction that occurs on LDG, what is that going to do to the amount of data flying around the underlying system? A list of tuples of 10-29 byte principals and a nat64 is not that big to gossip around, but unless you are going to do routing and keep track of who is observing who, you’re going to have to gossip the notifications from one subnet to ALL other subnets.

If we get to canister IDs that are 12 bytes and we have a Nat64, that is 20 bytes. If we have 50k canisters observing a ledger then we’re talking about 1MB more of data that needs to be shipped around to all the subnets for each observation (it would cascade through the listening tree, but still… if you’re listening to something repetitive and occurring often it will average out to somewhere above that number). That is just for one use case of multi-cast. I remember there was something about a 4MB limit to blocks at some point but that was years ago and I know there have been optimizations. I remember we discussed it with @free.

There’s no additional cost: note that there is a requirement that the (potentially long-running) call must be a bounded-wait call.

What is the bounded-wait limit? As in the above example, if we have 50,000 canisters observing a ledger for events and they need to refresh their observable every 30 seconds, it may be a lot of load on the “system canister”. Obviously, this is less load at the replica level than doing the notification through software, and if this API had existed 3 years ago I would not have spent 3 years putting together software and tokenomics for the whole thing to finance itself. When I hear the cost is 0 I get a bit sweaty (both from excitement and stress) because I personally have envisioned many, many scenarios where people want their canisters to react to data events across the IC, and most of them end up in a discussion about how much bandwidth the IC can handle and how we keep from impacting other canisters on the subnet.

Scheduling canisters in rounds is an implementation detail that should not be necessary to expose.

Except that I did just give one. If I want to keep my context and have my code pick up in the next round, I currently have no option. I have to await a balance call to the ICP ledger or something, and if it crosses a subnet boundary I’m not guaranteed that it will run in the ‘next’ round. If I call myself, sometimes I’m scheduled in the same round. (Waiting for futures to settle due to the 400-outgoing-calls-at-one-time limit is the classic example that almost anyone who tries to design a scalable system first runs into.) If I know all my calls are going to other subnets then I know I can wait 3 rounds before I need to check if any of my futures are fulfilled. Beyond 1 or 3 it gets arbitrary really quickly and I’d agree that arbitrarily waiting 8 rounds might be a stretch to find a use case, but waiting until the next round or waiting 3 are concrete “could have used them for the last couple years and saved Ts of cycles” use cases. If @ggreif lets me check my future status soon I’ll have even more use cases for it.

Sleeping a set amount of time seems like a no-brainer. A timer with context. There are some strange things with context in Motoko that come to mind when you’re doing async, and being able to sleep vs. jumping into a timer function could have some merits.


One of the reasons why I’ve avoided mass inter-canister event sending so far is cost.

What are your estimates of the cost of this API, say for a notifier and per observer?

If this is much cheaper, @infu might be interested in using this as a more efficient version of the vectors implementation.

Also, do you have any intuitions on limits of the number of observers, and what the performance of this API might be for same/cross subnets?


The cost of this API is the same as for a regular inter-canister call: ~300k cycles for making the call and 5M cycles + #instr. for executing the callback (on a 13-node subnet).

The limits would essentially be the same as for regular bounded-wait calls. We might introduce additional limits though to ensure fairness, e.g., preventing a single notification with many observers from slowing down the entire subnet.

The performance would correspond to the performance of calling the management canister on the same/different subnet. Of course, too many observers might cause increased latencies, which is why there might be additional limits on the number of observers.
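
As a rough worked figure based on these numbers and the 50,000-observer example above (each observer pays for its own observe call and callback; the #instr. part is ignored): waking all observers costs on the order of

50,000 × (300k + 5M) ≈ 2.65 × 10^11 cycles

per notification, spread across the observers, plus one regular notify call on the notifier’s side.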


As I understand it, an observe(<self>, <random_u64>) with a timeout of 10 seconds is the equivalent of a sleep(10s), as there will never be a corresponding notify() call.

I believe that limiting calls to the observe() endpoint to the canister itself and its controllers would also prevent the “thousands of observers for a ledger event” use case that Austin raised. But with such a limitation in place, I wonder how useful it is to be able to observe another canister: if I control all the canisters involved, I might as well have the canister producing the event explicitly notify everyone else instead of everyone observing it.

As for awaiting within a call context vs. getting notified via something like an update method call, there are clear use cases for that. E.g. simple retry with back-off, potentially with a way of shortcutting the wait (the notification). Without this (or a simpler sleep() API) the only option for retrying a call without relying on event loops and state machines is to do it in a tight loop.
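
A sketch of what these could look like with the proposed API, reusing the ObserveArgs record from the wrapper sketches above; the ic-cdk 0.18-style Call builder and its change_timeout method are assumptions, and try_call() is a hypothetical stand-in for the operation being retried:

use candid::Principal;
use ic_cdk::api::canister_self;
use ic_cdk::call::Call;

// Wait until notify(<self>, event_id) arrives or the timeout elapses, whichever
// comes first. With an event_id that nothing ever notifies, this is a plain sleep.
async fn wait_or_sleep(event_id: u64, timeout_seconds: u32) {
    let _ = Call::bounded_wait(Principal::management_canister(), "observe")
        .with_arg(ObserveArgs { canister_id: canister_self(), event_id })
        .change_timeout(timeout_seconds)
        .await; // a timeout reject is expected here and simply discarded
}

// Retry with back-off; a notify(<self>, RETRY_EVENT) from elsewhere cuts the wait short.
const RETRY_EVENT: u64 = 42;

async fn call_with_backoff() -> Result<(), String> {
    let mut delay_s = 1u32;
    for _ in 0..5 {
        if try_call().await.is_ok() { // hypothetical call being retried
            return Ok(());
        }
        wait_or_sleep(RETRY_EVENT, delay_s).await; // interruptible back-off wait
        delay_s *= 2;
    }
    Err("gave up after 5 attempts".to_string())
}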

If there’s no corresponding notify() call, then the call to observe would indeed finish with a timeout after 10s. But it could also finish earlier, if there was a corresponding notify() call before the timeout. (I’m not sure what you mean by “there will never be a corresponding notify() call”.)

The difference is that the notification would run in a separate call context while the observe() call allows you to continue in the same call context (i.e., “sleep” in the current call context).

The restriction to controllers is for the sake of security. A canister can relax it by implementing a custom “observe” endpoint (a regular update method) that performs some validation (e.g., authorization, rate-limiting) and calls the management canister’s “observe” endpoint internally.

Absolutely!

Ahhh… that makes sense… just discard the timeout error and you have a regular sleep. I’m guessing await (with timeout = 0) systemCanister.observe(<self>, random_u64); would run in the next round (or at least the next allocated round?). Any chance of it being scheduled immediately? Maybe you’d need to use 1?

Calls to the system canister have this same cost? If you are observing a canister on a different subnet, would you expect this to take at least a couple of rounds to ‘settle’ on the replica in the other subnet, meaning that you are for sure not getting anything back for a couple of rounds? (Basically, under the covers it operates like a cross-subnet call.)

So you would be limited to 500 outgoing observables at a time, the same as with any other x-canister call?

This was a significant point of discussion during the ICRC-72 WG. This API is stateless so when you get it back you have whatever interesting data you still had in context. With 72 there is a balance of what data to send with a notification. Generally, you want to send as little as possible to save bandwidth/cycle cost, but then the subscriber needs to go pull any relevant data from elsewhere. But with 72 you do have the option of stuffing the relevant data in there if it is worth the cost.

This perhaps begs the question of whether there is a possibility of a kind of ‘permanent’ observation here. I can envision someone getting frustrated by using this because the first to hit the observe will ‘block’ any notifications until the receiver processes the notification and can initiate a new wait. I guess it wouldn’t work with the current API as it is dependent on the API… but I guess another API, something like how a timer works where you hand off an async handler function, could use the fundamental underlying infrastructure you’re going to have to build, but without the context.

systemCanister.subscribe(canisterId, nat64, handlerFunction)

With this, you are guaranteed to not miss any notifications.

Use case: I have a wallet canister that wants to watch my account on a ledger. A ledger can .notify(self, hash(account)) for each transaction and it gets blasted to anyone listening. This is obviously very chatty and expensive and would have to have some kind of permanence cost. Thus the tokenomics we’re putting into ICaiBus. I’m happy to meet to talk through it. Before we started there was obviously a concern that we’d do a bunch of work and you guys would just stick it all in the replica. While this new API covers many of our use cases, there are a bunch more and it would be worthwhile to talk through them before we pour another few hundred thousand dollars of R&D into it. :joy: I think you need the tokenomics to make the thing work and to control the bandwidth and to ease onboarding, but there isn’t much besides a bunch of time and attention that could not be implemented in the replica.


Perhaps a dumb question here, but if you don’t await one of these, it just creates the future and makes the call in the background and the notification comes back, but the notification goes into the ether right? I don’t know that this line of inquiry goes anywhere, and I don’t think that is useful for anything unless you want to attack yourself and/or make your canister very busy in the future.


Suggestion:

I get using nat64 at the system level, but for usability at the library level I’d recommend using a namespace as it helps make code more readable. You can hash it of course under the covers and just send around the nat64 (although a nat would be a bit more collision-resistant… see use case above).

I mean that if you use a randomly generated u64 as event_id, there will never be a matching notification.

A timeout of zero will likely run in the next round, yes, but I’d have to check to be sure. Initially, I was going to say that it may run in the same round, as part of the next scheduler iteration (the scheduler will route subnet-local messages from output queues into input queues and start another execution “round” within the same consensus round). But messages and callbacks are only timed out at the end of the consensus round, so the resulting SYS_UNKNOWN reject will be executed in the next consensus round, at the earliest.

Now as for whether it is the very next consensus round or not, that depends on whether your canister gets scheduled (if not, it’s still “the next round when the canister got scheduled”) and on how large of a backlog your canister has (it may not get to it if there’s too much other work to do).

If I understand you correctly, you’re saying there’s a race condition here. If so, I agree. If anything, this seems to me to be the dealbreaker: I make a call to check for something; it hasn’t yet happened; I then call observe(); if the event triggered between my first call and my observe() call, I’ll never get notified (and must repeat the whole process from the top after up to 5 minutes have elapsed).

I’m not actually sure about Motoko, but in Rust if you don’t await a future, it doesn’t do anything; it’s just a struct that is dropped when it goes out of scope.

Not necessarily, since you can deactivate the timer if there are no more pending calls to await.

That could be achieved by setting the timeout of a bounded-wait call appropriately and producing no notification (as Alin pointed out).

The cost of canister calls does not depend on the callee’s subnet.

You could observe at most 500 events of canisters on a single subnet (this is the management canister’s queue capacity and there’s one such queue per subnet).

A “permanent” observation can be implemented by repeatedly subscribing. Is there an issue with doing that?

Isn’t this the goal of sleeping (that you’re blocked until the notification is performed or your defined timeout is reached)?

For this use case, you could just open a new call context (not blocking the current one).

The main motivation for nat64 was that it is a bounded type and thus one does not need to bound its size with a custom limit.

This is only relevant for (periodic) notifications with a fixed event ID (and buggy implementations of a one-off notification, namely those that do not notify atomically when the event happened). In the case of periodic notifications, the call to observe might indeed block until the next notification, which might take a long time if you’re unlucky (start observing just after the last notification was made), but that seems unavoidable to me.

I will post more explanation and code later.

For us at Demergent Labs building Azle, something we have not yet been able to accomplish is an implementation of setTimer and setTimerInterval with close to the true semantics of those functions in the major JavaScript implementations.

I’ll post a very specific example later of code that does not work now, but I hope we can get to work with this proposal or another.

It’s not only relevant to periodic notifications. The whole point of the API as I see it is to have a fixed event ID (whether it’s an actual constant, or just included in the response to a “did this happen” type of query).

The problem, as I see it, is that for any event that occurs on another canister there is a time window (potentially spanning seconds) between learning that the event must be waited for and actually starting to wait for the event: a time window that is a source of race conditions. One issue with that is that one may miss the event altogether (particularly if it’s not a recurring event). Another issue is that one must be careful to always query for the status of the event before waiting for it, including inside a retry loop dealing with timeouts.

Not necessarily. On the one hand, for recurring notifications one could subscribe (and unsubscribe) for repeated notifications (although that raises the question of who pays for sending the arbitrarily many notifications per subscription; maybe the caller just attaches enough cycles for N notifications). OTOH, one could also imagine an API where canister A asks canister B (directly) about some event and canister B responds negatively but attaches an event ID; assuming message ordering (which only holds for requests, not responses/notifications), this would be a race-condition-free solution.

More realistically (and perhaps more efficiently) one could have some sort of deferred response: canister A calls canister B (directly); canister B cannot yet produce a response (the event has not occurred), so instead of producing some sort of negative response, it calls a system API to end execution but temporarily stash the call context; when the event does eventually occur, a response is produced and the call context dropped. This could save up to 2 extra roundtrips: one to query for the event before calling observe(); and one to retrieve any associated data after receiving the notification.

Just thinking out loud here, so take the above as such.