Canister Output Message Queue Limits, and IC Management Canister Throttling Limits

I think successive nested self-calls can be optimised into one call.

call:fun5() -> call:fun4() -> call:fun3() -> call:fun2() -> call:fun1() -> out-call: ledger.transfer() ...
return:fun5() <- return:fun4() <- return:fun3() <- return:fun2() <- return:fun1() <- out-return: ledger.transfer()

In the queue, this could be optimised as below:

call:funs() -> out-call: ledger.transfer() ...
return:funs() <- out-return: ledger.transfer()

In fact, the five functions above could be written as one function. But in practice, code needs refactoring and reuse.
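For concreteness, a minimal Motoko sketch of such a chain of nested self-calls might look like this (the canister ID and the ledger interface are placeholders, not a real API):

actor Example {
  // Placeholder canister ID and interface, for illustration only.
  let ledger = actor "aaaaa-aa" : actor { transfer : shared () -> async () };

  // Each call to a local async function enqueues a self-call message and an
  // await on its response, even though only fun1 talks to another canister.
  func fun1() : async () { await ledger.transfer() };
  func fun2() : async () { await fun1() };
  func fun3() : async () { await fun2() };
  func fun4() : async () { await fun3() };
  func fun5() : async () { await fun4() };

  public func run() : async () { await fun5() };
}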

1 Like

Yes, this point is what @claudio will provide more details on. This is what I aimed to describe in the bullet point “Investigate whether …”.

What I meant to describe with the bullet point on reservations for responses in the snippet you are citing in your previous message is something that could be improved at the protocol level and that's unrelated to how these things are handled in Motoko: roughly speaking, the protocol currently only allows DEFAULT_QUEUE_CAPACITY/2 requests in flight to self, while there can be DEFAULT_QUEUE_CAPACITY requests in flight to other canisters. This is because the protocol doesn't distinguish between local and remote canisters; distinguishing between them could provide 2x more space for messages to self.

1 Like

I will write a longer response when I get a chance, but, for now, to avoid the overhead of async/await associated with local functions that need to send messages, you need to remove those functions and inline them into their call sites.

I agree this is not good and have even proposed and implemented solutions to this problem in the past, but they were felt to be too risky, blurring the distinction between await and state commit points.

I’ll elaborate on this in another reply, but fully agree that the current situation is not good enough for code-reuse and abstraction.

I’m happy to revisit addressing this, but there is no quick fix beyond inlining the calls to avoid the redundant async/await.
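To illustrate, here is a rough before/after sketch of that inlining; the ledger interface and the balance variable are made-up stand-ins, not a real API:

actor Account {
  // Hypothetical remote interface, for illustration only.
  let ledger = actor "aaaaa-aa" : actor { transfer : shared Nat -> async () };
  var balance : Nat = 1_000;

  // Before: a local async helper costs an extra self-call message and await.
  func debit(amount : Nat) : async () {
    balance -= amount; // traps if amount > balance; fine for a sketch
    await ledger.transfer(amount);
  };

  public func pay(amount : Nat) : async () {
    await debit(amount); // one self-call, then the outgoing call
  };

  // After: inlining debit's body removes the redundant async/await, leaving
  // only the single message to the other canister.
  public func payInlined(amount : Nat) : async () {
    balance -= amount;
    await ledger.transfer(amount);
  };
}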

This is important. Motoko is not a toy, and not just for writing demos; it needs to meet the needs of real engineering.
Its risks can be mitigated by good programming habits and good IDE tooling.

Calls to smart-contract functions in the EVM are likewise divided into internal and external calls.

1 Like

FTR, Support direct abstraction of code that awaits into functions, without requiring an unnecessary async · Issue #1482 · dfinity/motoko · GitHub is the original issue that discussed this, along with links to the PRs that fixed it but were then deemed too risky.

Yes.
Introducing new semantic expressions would be a good solution, for example inner await and inner async.
An inner await would not itself be a data commit point, but it may have an await (a data commit point) inside it.

2 Likes

Indeed, with storms like FTX, centralization is facing more and more challenges in the foreseeable future, and we need to be prepared for the users who are moving to decentralization!
Looking forward to detailed reports on what preparations and changes we need to make (as soon as possible!) to support tens of millions (even more, yes, even more) of users!

The “Canister trapped explicitly: could not perform self call” error caused by the input/output message queue limitation cannot currently be caught by try/catch.

This can break data consistency.

For example:

private stable var n : Nat = 0;

private func fun() : async () {
  try {
    n += 1;                  // Here it has been executed
    let res = await fun1();  // trapped by the input/output message queue limitation
    // ...                   // Here the code will not be executed
  } catch (e) {
    n -= 1;                  // Here the code will not be executed either
  };
};
1 Like

This is expected behavior: if the error comes at runtime from the canister itself (in this case, the canister is overflowing its own output queue), then the error can't be caught and the canister traps. Until nested async function calls are optimized or the canister output queue limit is raised, I would recommend putting guardrails in your code to protect against these runtime trap situations.

If the error comes as the awaited response from an async call, then this can be caught. Likewise, errors explicitly thrown by your code in a canister can be caught.

We discussed the async abstraction issue in our team meeting today and will prioritize finding a solution for this soon.

But it won’t be overnight, I’m afraid, so you’ll need a workaround for now.

One is to avoid using asynchronous local functions by inlining their bodies into the call-sites, without the awaits.

If you don’t like inlining, another solution that might work is to write non-async functions that return a value describing the call you want to make (a tuple of a shared function and its arguments), and have the outermost function perform the actual call with a single await, by applying the function from the tuple to the arguments from the tuple and awaiting that.
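Something along these lines, as a sketch only (the Ledger interface, canister ID, and names here are illustrative assumptions, not a real API):

actor Payer {
  // Hypothetical remote interface, for illustration only.
  type Ledger = actor { transfer : shared Nat -> async Nat };
  let ledger = actor "aaaaa-aa" : Ledger;

  // A deferred call: the shared function to invoke plus its argument.
  type Call = ((shared Nat -> async Nat), Nat);

  // A plain, non-async helper: it *describes* the call instead of performing
  // it, so it adds no self-call messages to the queue.
  func prepareTransfer(amount : Nat) : Call {
    // ... synchronous validation or bookkeeping can go here ...
    (ledger.transfer, amount)
  };

  // Only the outermost function performs the actual call, with a single await.
  public func run() : async Nat {
    let (f, arg) = prepareTransfer(100);
    await f(arg)
  };
}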

1 Like

Our temporary solution for now is to actively control the number of input/output messages. This control is not very reliable, because it is difficult to get the exact size of the message queue.

For each queue to another canister there are 500 slots available, whereas there are (currently) 250 slots for a canister’s queue to itself. So if you can somehow make sure that a call only happens when there is still enough space w.r.t. these limits and the currently outstanding calls, things should work reliably, no?

To provide some background: pushing a message onto an output queue makes a reservation on the respective input queue. This reservation is only cleared once the reply arrives, so once the reply has arrived one can be sure that the message is no longer “in flight”. The counter of outstanding messages could therefore be checked, and incremented if there is still space, before making a call, and decremented again once the reply arrives.

Sidenote 1: I’m not an expert in Motoko, so maybe @claudio can comment on whether there are any concrete limitations in Motoko that would prevent realizing such a reliable counter of outstanding calls, or share recommendations on how something like this could be implemented?

Sidenote 2: As described above, resolving the limitation that queues to self can only hold half as many messages as queues to other canisters is something we will prioritize. Note that it will also take some time to implement, though.

1 Like

We implemented a counter and are testing it:

private func aaa() : async () {
  try {
    counter += 1;
    await bbb();
    counter -= 1;
  } catch (e) {
    counter -= 1;
  };
};

It works fine at low TPS: the counter counts normally during execution and returns to 0 when there is no traffic.

However, when we performed a 10 TPS stress test, congestion appeared at around 200+ concurrent accesses, and new accesses were blocked based on the counter. But internal execution seems to have stopped, and the counter value is stuck at 486. With no new accesses, more than an hour has passed and it still does not change.

@derlerd-dfinity1 asked me to look at your code and suggested the problem is that bbb() will trap when the queue is full and never decrement the counter.

He suggested something along the lines of:

// Requires: import Error "mo:base/Error"
private func aaa() : async () {
  if (count >= limit) throw Error.reject("full");
  try {
    count += 1;
    await bbb();
    count -= 1;
  } catch (e) {
    count -= 1;
    throw e;
  };
};

Notice that this tests for capacity before issuing the call, avoiding the trap.

I’ve played around a bit with testing this but I have to admit that counting calls is extremely error prone and (ideally) not something we should be expecting our users to do.

See here, for what it’s worth:

Unfortunately, I think the arbitrary queue limits are a leaky abstraction that makes programming very hard.

3 Likes

I suppose you mean “when queue is full”?

However, I think that when await bbb() traps, since it is treated as a synchronous error, the execution of aaa() will roll back and counter will not remain incremented.

So what @bitbruce wrote above looks correct code to me. No?

Right on both counts, but he still needs to test counter to avoid the trap in the first place.

So is the main issue being discussed that you chain more than 100 or so ignore’d (un-awaited) asynchronous calls to another canister? I’m trying to understand what the issue is from a Motoko-only perspective, if you will.

Now I also noticed you mentioned Language Support - Internet Computer Developer Forum, the post where you describe some kind of parallel execution, which I haven’t fully wrapped my head around, tbh. But essentially I do see that the stable buffer add is calling an async function a bunch of times without having to await anything. Under the hood, is this function just ignoring instead of awaiting each f(as[i]), so at the end of the day it’s the same problem? Or is this something completely different? Or is this not even where the error is in that code sample?

I think @claudio meant the code inside bbb() traps. Shouldn’t await bbb() then cause an error with reject code 5, which gets caught so that inside the catch branch the counter gets decremented again? So we don’t roll back; the counter value is correct, just for a different reason. So the code from @bitbruce still looks correct to me. If it does not work, then I would like to understand why it doesn’t.

Actually, wait, when @PaulLiu says

Does he mean

or is he saying

Meanwhile, you’re saying

I think the last one is correct though.

No, I actually meant that the call to bbb() traps in aaa() before ever entering bbb() (because the queue between the actor and the destination of bbb is full). Then aaa() traps and all its effects, including the increment of counter, are rolled back.

We could make the trap (before entering bbb()) produce a local exception in aaa(), transferring control to the catch clause, but the current implementation does not do that.

I’m about to investigate changing that.

2 Likes