I’m developing a machine learning (ML) method that can take tens of minutes to complete. However, canister calls can only run for a few seconds due to limitations of the consensus algorithm. To overcome this limit I want to develop two methods: 1) start_process, 2) continue_process. The first method, start_process, starts the execution of the ML method and returns an id. The second method would be used to request execution of the ML method for an id already started; it would run the process for a few seconds and return true or false to let the client (another canister or a js/ts client) know whether the ML process has ended.
I have some questions about this approach:
Do you think this is a good approach to overcome the time limitation imposed by the consensus algorithm? Are there better ways to do it?
How can I programmatically know (on the canister side) that I am approaching the time limit before the consensus round starts, so the canister call can save its work and return?
For point 2.
The instruction limit for an update call is up to 20B instructions; see the docs.
To know the instruction count inside a canister call, maybe this API can help:
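For reference, Motoko’s base library exposes the IC performance counter via `ExperimentalInternetComputer.performanceCounter`. A minimal sketch of a self-imposed budget check; the `budget` value here is illustrative, not an official limit, and `continue_process` is a hypothetical method name:

```motoko
import IC "mo:base/ExperimentalInternetComputer";

actor {
  // Self-imposed budget well below the hard instruction limit.
  // The exact number is illustrative, not an official figure.
  let budget : Nat64 = 2_000_000_000;

  // performanceCounter(0) returns the number of instructions
  // executed by the current message so far.
  func shouldYield() : Bool {
    IC.performanceCounter(0) > budget
  };

  public func continue_process() : async Bool {
    loop {
      // ... perform one small unit of work; return true when done ...
      if (shouldYield()) {
        // Save progress into canister state and yield before
        // hitting the hard limit.
        return false;
      };
    };
  };
};
```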
If you have a for loop or something similar that covers most of the work, you can pull the body of the loop into its own async function and then await a self-call on every iteration. Each iteration then starts a fresh message, resetting the execution counter every time.
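A minimal sketch of that self-await pattern, assuming the per-iteration work is factored into a public function `step` (the names and the squaring placeholder are illustrative):

```motoko
import Buffer "mo:base/Buffer";
import Iter "mo:base/Iter";

actor Worker {
  // One unit of work, factored out so each call runs as its own message.
  public func step(i : Nat) : async Nat {
    i * i // placeholder for the real per-iteration work
  };

  public func runAll() : async [Nat] {
    let results = Buffer.Buffer<Nat>(10);
    for (i in Iter.range(0, 9)) {
      // Awaiting a self-call commits state and schedules a fresh
      // message, so the instruction counter resets each iteration.
      results.add(await step(i));
    };
    Buffer.toArray(results)
  };
};
```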
Pipelinify.mo has a framework for stepwise processing:
It provides function hooks to handle steps along the way and manages workspaces internally so that data remains available to your process across rounds:
Say one message is long enough to produce one output token of an LLM, but you want to produce a longer response. Factor out producing one token into a separate async function:
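A sketch of what that factoring could look like; `one_token` and `long_answer` are illustrative names, and the token production is stubbed out:

```motoko
actor LLM {
  // Hypothetical: produce a single token, assumed to fit in one message.
  public func one_token(context : Text) : async Text {
    // ... run one decoding step over `context` here ...
    "tok "
  };

  public func long_answer(prompt : Text, n : Nat) : async Text {
    var answer = prompt;
    var i = 0;
    while (i < n) {
      // Each awaited self-call is a fresh message with a fresh
      // instruction budget.
      answer #= await one_token(answer);
      i += 1;
    };
    answer
  };
};
```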
Ok, I understand this so far. Now let’s add a new “requirement”: I want the methods one_token() and long_answer() to live inside a package/module so they can be reused. Module methods cannot be declared async, and await cannot be used either. Is there a pattern where I could still write one_token and long_answer as module functions?
I don’t know of a way to split up functions that you can’t modify from the outside, short of writing an entire runtime system, which is a bit extreme
I was thinking a possible way is to dynamically create a canister called long_answer with one call method containing the main loop. This main loop could call module functions like one_token. My machine learning library would then consist of modules containing code that can safely be executed monolithically, plus canister code that can be instantiated to perform long computations.
Do you think this makes sense?
I think I am not explaining myself well and/or not understanding your suggestions. Maybe the following code helps:
Let’s assume I have this module, where f1() can safely be executed within one consensus round, and f2(), which calls f1() many times and generates a vector. f2() needs much more time than one consensus round allows:
import Buffer "mo:base/Buffer";
import Iter "mo:base/Iter";

module {
  public func f1(i_ : Nat) : async Nat {
    return i_;
  };

  public func f2() : async [Nat] {
    let ret = Buffer.Buffer<Nat>(10);
    for (i in Iter.range(0, 10)) {
      let f1_async_nat : async Nat = f1(i);
      let f1_nat : Nat = await f1_async_nat;
      ret.add(f1_nat);
    };
    return Buffer.toArray(ret);
  };
};
Now I also have a canister with one call method compute() that calls f2() and stores the result in a canister variable:
actor {
  var mydata : [Nat] = [0];

  public func compute() : async () {
    mydata := mymodule.f2();
  };
};
My problem is that compute() will return before f2() ends. How can I make this work? That is why I initially suggested developing two canister methods: 1) start_process, 2) continue_process.
Doesn’t this just create a future that you then store in the vec? AFAIU you’re supposed to await the future so that you get the result, just like you’re doing it in f2
This would make compute() wait until f2() is done and collect the result. It will take a while, because f2() performs multiple calls to f1(), each of which may take quite a while
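Concretely, awaiting the future in compute() would look like this sketch (the `mymodule` import path is hypothetical, standing in for the module above):

```motoko
import mymodule "mymodule"; // hypothetical path to the module above

actor {
  var mydata : [Nat] = [0];

  public func compute() : async () {
    // Awaiting the future makes compute() suspend until f2() is done,
    // and yields the actual [Nat] result rather than a future.
    mydata := await mymodule.f2();
  };
};
```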