Canister call with long execution time

I’m developing a machine learning (ML) method that can take up to tens of minutes to complete. However, canister calls can only run for a few seconds due to limitations of the consensus algorithm. To overcome this limit I want to develop two methods: 1) start_process, 2) continue_process. The first method, start_process, starts the execution of the ML method and returns an id. The second method would be used to request further execution of the ML method for an id that has already been started; it would run the process for a few seconds and return true or false to let the client (another canister or a js/ts client) know whether the ML process has ended.
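For concreteness, a minimal Motoko sketch of that two-method interface; the Job type, the jobs map, and the per-slice work are placeholders invented here just to show the shape, not part of any library:

import Nat "mo:base/Nat";
import Hash "mo:base/Hash";
import HashMap "mo:base/HashMap";

actor {
  // Hypothetical per-job state; a real job would also hold intermediate ML state.
  type Job = { var done : Bool };

  let jobs = HashMap.HashMap<Nat, Job>(8, Nat.equal, Hash.hash);
  var nextId : Nat = 0;

  // Registers a new job and returns its id.
  public func start_process() : async Nat {
    let id = nextId;
    nextId += 1;
    jobs.put(id, { var done = false });
    id
  };

  // Runs one bounded slice of work for the job; returns true once it has finished.
  public func continue_process(id : Nat) : async Bool {
    switch (jobs.get(id)) {
      case null { true }; // unknown id: nothing left to do
      case (?job) {
        // ... perform a few seconds' worth of work, persist progress ...
        job.done
      };
    }
  };
};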
I have some questions about this approach:

  1. Do you think this is a good approach to overcome the time limitation imposed by the consensus algorithm? Are there better ways to do it?
  2. How can I programmatically know (on the canister side) that I am approaching the time limit before the consensus round starts, so the canister call can save its work and return?

For point 2:
The instruction limit for an update call is up to 20B instructions, see the docs.
To track how many instructions you have already used inside a canister call, maybe this API can help.


How can I access this API in Motoko?
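Assuming the API in question is the performance counter, Motoko exposes it through the base library’s ExperimentalInternetComputer module. A minimal sketch of checking it inside a loop; the 15B threshold is an arbitrary safety margin below the 20B limit, not an official figure:

import IC "mo:base/ExperimentalInternetComputer";
import Iter "mo:base/Iter";

actor {
  public func work() : async () {
    label chunks for (i in Iter.range(0, 1_000_000)) {
      // ... one chunk of the ML computation ...

      // performanceCounter(0) returns the instructions used so far
      // in the current message execution.
      if (IC.performanceCounter(0) > 15_000_000_000) {
        // save progress so a later call can resume, then stop
        break chunks;
      };
    };
  };
};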

If you have a for loop or something similar that covers most of the work, you can pull the work inside the loop into its own async function and then loop over the self-await. This starts a fresh message for every iteration, resetting the execution counter each time.


Pipelinify.mo has a framework for stepwise processing:

It provides function hooks to handle steps along the way and manages workspaces internally so that data remains available to your process across rounds:

    //pipeline types
    public type PipelinifyIntitialization = {
        onDataWillBeLoaded: ?((Hash, ?ProcessRequest) -> PipelineEventResponse);
        onDataReady: ?((Hash, Workspace, ?ProcessRequest) -> PipelineEventResponse);
        onPreProcess: ?((Hash, Workspace, ?ProcessRequest, ?Nat) -> PipelineEventResponse);
        onProcess: ?((Hash, Workspace, ?ProcessRequest, ?Nat) -> PipelineEventResponse);
        onPostProcess: ?((Hash, Workspace, ?ProcessRequest, ?Nat) -> PipelineEventResponse);
        onDataWillBeReturned: ?((Hash, Workspace, ?ProcessRequest) -> PipelineEventResponse);
        onDataReturned: ?((Hash, ?ProcessRequest, ?ProcessResponse) -> PipelineEventResponse);
        getProcessType: ?((Hash, Workspace, ?ProcessRequest) -> ProcessType);
        getLocalWorkspace: ?((Hash, Nat, ?ProcessRequest) -> Candy.Workspace);
        putLocalWorkspace: ?((Hash, Nat, Candy.Workspace, ?ProcessRequest) -> Candy.Workspace);

    };

I haven’t added timers to auto-advance, but it is definitely on the todo list. It should not be hard to add.


This could be it, but I don’t quite understand the sentence about looping over the self-await.

Could you please give an example?

I am not sure this is what I need. Are there some simple examples or documentation I can use to learn when and how to use this package?

Say one message is long enough to produce one output token of an LLM, but you want to produce a longer response. Factor out producing one token into a separate async function:

// Pseudocode: produce one output token of the model
func one_token() : async Token {
  model.do_something()
};

Because it’s async, every call of this function will run in a separate message. Then you can produce a long response like this:

// Pseudocode: loop, producing one token per message, until done
func long_answer() : async LongOutput {
  var output = empty();
  while (not_fully_responded()) {
    output := output + (await one_token());
  };
  output
};
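Assembled into a small self-contained Motoko actor; oneStep, longAnswer, and the squaring placeholder are invented for illustration:

import Buffer "mo:base/Buffer";

actor {
  // One unit of work. Because this is an async function, each call is
  // scheduled as its own message, with a fresh instruction counter.
  public func oneStep(i : Nat) : async Nat {
    i * i // stand-in for the expensive per-step computation
  };

  // Drives the whole computation, awaiting one message per step.
  public func longAnswer(n : Nat) : async [Nat] {
    let buf = Buffer.Buffer<Nat>(n);
    var i = 0;
    while (i < n) {
      buf.add(await oneStep(i));
      i += 1;
    };
    Buffer.toArray(buf)
  };
};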

Ok, I understand this far. Now let’s add a new “requirement”: I want the methods one_token() and long_answer() to live inside a package/module so they can be reused. Module methods cannot be declared async, and await cannot be used either. Is there a pattern where I could still write one_token and long_answer as module functions?

I don’t know of a way to split up functions you can’t modify from the outside, except for writing an entire runtime system, which is a little bit extreme.

I was thinking a possible way is to dynamically create a canister called long_answer with one call method containing the main loop. This main loop could call module functions like one_token. So my machine learning library would consist of modules containing code that can safely be executed monolithically, plus canister code that can be instantiated to perform long computations.
Do you think this makes sense?

Use a class…it can certainly have an await*. I actually think a module can as well.
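For reference, Motoko modules can indeed contain async* functions, awaited with await*. A sketch using the same f1/f2 names as the code below; one caveat worth flagging: await* runs the computation in place rather than sending a new message, so by itself it does not reset the instruction counter:

import Buffer "mo:base/Buffer";
import Iter "mo:base/Iter";

module {
  public func f1(i : Nat) : async* Nat {
    i
  };

  public func f2() : async* [Nat] {
    let ret = Buffer.Buffer<Nat>(10);
    for (i in Iter.range(0, 10)) {
      ret.add(await* f1(i));
    };
    Buffer.toArray(ret)
  };
};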

I think I am not explaining myself well and/or not understanding your suggestions. Maybe the following code helps.
Let’s assume I have this module, where f1() can be safely executed within one consensus round and f2() calls f1() many times and generates a vector. f2() needs much more time than the consensus round:

import Buffer "mo:base/Buffer";
import Iter "mo:base/Iter";

module {
    public func f1(i_ : Nat) : async Nat {
      return i_;
    };

    public func f2() : async [Nat] {
      let ret = Buffer.Buffer<Nat>(10);
      for (i in Iter.range(0, 10)) {
        let f1_async_nat : async Nat = f1(i);
        let f1_nat : Nat = await f1_async_nat;
        ret.add(f1_nat);
      };
      return Buffer.toArray(ret);
    };
};

Now I also have a canister with one call method compute() that calls f2() and stores the result in a canister variable:

actor {
  var mydata: [Nat] = [0];
  public func compute() : async () {
    mydata := mymodule.f2();
  };
};

My problem is that compute() will return before f2() ends. How can I make this work? That is why I initially suggested developing two canister methods: 1) start_process, 2) continue_process.

Doesn’t this just create a future that you then store in the variable? AFAIU you’re supposed to await the future so that you get the result, just like you’re doing in f2.

Should I do this instead:

mydata := await mymodule.f2();

Would that be enough even if the canister’s compute() method returns before f2() is over? Does my question make sense?

This would make compute() wait until f2() is done and collect the result. It will take a while, because f2() performs multiple calls to f1(), which in turn may take quite a while.


Is it possible to call other canister methods while the canister’s compute() method is still running?

From the outside yes, and message execution order can be anything. From the inside, AFAIK the Rust CDK does not support multiple concurrent awaits
